Printing a column name inside lapply function

深忆病人 2021-01-25 02:40

I have searched through the archives but have not found a suitable answer. I am a beginner and please excuse my ignorance if I am posing a very elementary query. I am trying to

2 Answers
  • 2021-01-25 03:29

    This is an old question, but I ran into the same problem and want to share my approach, even though it departs somewhat from the *apply idiom. The upside is that you can put anything inside the loop. I needed to run an ANOVA on two output variables (a and b) against each column I iterated over with lapply, extract the p-values to annotate the plots, and combine the plots in one figure. The core idea is to wrap lapply in a for loop, so that the loop index gives access to the column name.

    library(broom)      # tidy()
    library(ggplot2)    # ggplot(), geom_boxplot(), geom_text()
    library(gridExtra)  # grid.arrange()

    for (i in 1:11) {  # positions of the columns to loop over
      lapply(df[i], function(x) {
        myfactor <- names(df)[i]        # gets the column name
        anova_model_a <- lm(a ~ x, df)  # needed to run ANOVA per column
        anova_model_b <- lm(b ~ x, df)
        tab_aov_a <- tidy(anova_model_a)  # proper result table
        tab_aov_b <- tidy(anova_model_b)
        # Needed for labelling the graph; row 2 holds the group effect
        # (I only had 2 groups for comparison)
        labels_a <- data.frame(drv = "1", label = round(tab_aov_a$p.value[2], 4))
        labels_b <- data.frame(drv = "1", label = round(tab_aov_b$p.value[2], 4))
        fig1 <- ggplot(df, aes(x, a)) +
          geom_boxplot() +
          ggtitle("a") +
          geom_text(data = labels_a, aes(x = drv, y = 12, label = label),
                    colour = "blue", angle = 0, hjust = 0.5, vjust = 0.5, size = 5) +
          xlab(myfactor)

        fig2 <- ggplot(df, aes(x, b)) +
          geom_boxplot() +
          ggtitle("b") +
          geom_text(data = labels_b, aes(x = drv, y = 6, label = label),
                    colour = "blue", angle = 0, hjust = 0.5, vjust = 0.5, size = 5) +
          xlab(myfactor)
        grid.arrange(fig1, fig2, nrow = 2)  # draws both plots on the current device
      })
    }
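
    If only the column name and a per-column statistic are needed, a more compact variant is to iterate over the column names themselves rather than over positions. The following is a minimal base-R sketch under assumptions: the data frame `dat`, its columns `g1` and `g2`, and the simulated values are hypothetical and not taken from the code above.

```r
# Hypothetical data: outcome `a` and two candidate grouping columns
set.seed(1)
dat <- data.frame(
  a  = c(rnorm(10, mean = 0), rnorm(10, mean = 3)),
  g1 = factor(rep(c("1", "2"), each = 10)),
  g2 = factor(rep(c("1", "2"), times = 10))
)

predictors <- c("g1", "g2")

# Iterate over names so each call knows which column it is handling
p_values <- sapply(predictors, function(col_name) {
  # Build the formula a ~ <col_name> from the name itself
  model <- lm(reformulate(col_name, response = "a"), data = dat)
  coef(summary(model))[2, "Pr(>|t|)"]  # p-value of the group effect
})

print(round(p_values, 4))
```

    Because sapply() keeps the names of a character input, the result is a numeric vector labelled by column name, which could then feed the plot annotations.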
    
  • 2021-01-25 03:30

    Assuming that CrossTable() comes from the descr package, its dnn argument supplies the row and column names of the cross-tabulation. The trick is to get lapply to see both the names and the data. names(mydata)[2:4] gives the names; mydata[, 2:4] is the data. The syntax for lapply is:

    lapply(x, FUN, ...)
    

    FUN is applied to each element of x, and ... allows optional arguments to be passed to FUN. Thus, both names(mydata)[2:4] and mydata[, 2:4] can be passed to FUN: the names as x, and the data through "...".
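
    As a minimal sketch of that calling pattern using base R only (the data frame `toy` and the helper `describe_column` are hypothetical names, not part of the question), the argument placed after FUN is handed unchanged to every call:

```r
# Hypothetical toy data; any data frame works the same way
toy <- data.frame(a = 1:4, b = c(2, 4, 6, 8), c = c(1, 0, 1, 0))

# FUN receives one column name per call; `data` is supplied once via `...`
describe_column <- function(col_name, data) {
  paste0(col_name, ": mean = ", mean(data[[col_name]]))
}

res <- lapply(names(toy)[2:3], describe_column, data = toy)
print(res)
```

    Each call sees both the name (for printing) and the full data frame (for looking up the column), which is the arrangement the CrossTable() call relies on.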

    mydata<-data.frame(matrix(rep(c(1:2),times= 50),20,5))
    colnames(mydata)<-letters[1:5]
    
    library(descr)
    
    lapply(names(mydata)[2:4], 
       function(dfNames, dfData) {
          return(CrossTable(dfData[[dfNames]], mydata[,5], dnn = c(dfNames, "mydata[,5]")))
    }, mydata[, 2:4] )
    

    The function operates on each element of names(mydata)[2:4], and the data frame is passed as an additional parameter. This way, both the relevant column (dfData[[dfNames]]) and its name (dfNames) are available to CrossTable.

    [[1]]
       Cell Contents 
    |-------------------------|
    |                       N | 
    | Chi-square contribution | 
    |           N / Row Total | 
    |           N / Col Total | 
    |         N / Table Total | 
    |-------------------------|
    
    ===============================
              mydata[,5]
    b             1       2   Total
    -------------------------------
    1            10       0      10
              5.000   5.000        
              1.000   0.000   0.500
              1.000   0.000        
              0.500   0.000        
    -------------------------------
    2             0      10      10
              5.000   5.000        
              0.000   1.000   0.500
              0.000   1.000        
              0.000   0.500        
    -------------------------------
    Total        10      10      20
              0.500   0.500
    ===============================
    
    [[2]]
       Cell Contents 
    |-------------------------|
    |                       N | 
    | Chi-square contribution | 
    |           N / Row Total | 
    |           N / Col Total | 
    |         N / Table Total | 
    |-------------------------|
    
    ===============================
              mydata[,5]
    c             1       2   Total
    -------------------------------
    1            10       0      10
              5.000   5.000        
              1.000   0.000   0.500
              1.000   0.000        
              0.500   0.000        
    -------------------------------
    2             0      10      10
              5.000   5.000        
              0.000   1.000   0.500
              0.000   1.000        
              0.000   0.500        
    -------------------------------
    Total        10      10      20
              0.500   0.500
    ===============================
    
    [[3]]
       Cell Contents 
    |-------------------------|
    |                       N | 
    | Chi-square contribution | 
    |           N / Row Total | 
    |           N / Col Total | 
    |         N / Table Total | 
    |-------------------------|
    
    ===============================
              mydata[,5]
    d             1       2   Total
    -------------------------------
    1            10       0      10
              5.000   5.000        
              1.000   0.000   0.500
              1.000   0.000        
              0.500   0.000        
    -------------------------------
    2             0      10      10
              5.000   5.000        
              0.000   1.000   0.500
              0.000   1.000        
              0.000   0.500        
    -------------------------------
    Total        10      10      20
              0.500   0.500
    ===============================
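
    The list printed above is labelled [[1]], [[2]], [[3]] because the character vector passed to lapply() has no names of its own. A minimal sketch, assuming base R's table() as a stand-in for CrossTable() (a swapped-in function, chosen only so the example runs without extra packages), shows one way to label each result with its column name:

```r
# Same example data as in the answer
mydata <- data.frame(matrix(rep(1:2, times = 50), 20, 5))
colnames(mydata) <- letters[1:5]

cols <- names(mydata)[2:4]

# table() stands in for CrossTable(); dnn labels the two dimensions
tabs <- lapply(cols, function(col_name, data) {
  table(data[[col_name]], data[["e"]], dnn = c(col_name, "e"))
}, data = mydata)

names(tabs) <- cols  # elements now print as $b, $c, $d

print(tabs$b)
```

    With the names attached, a single table can be pulled out by column name, for example tabs$c, instead of by position.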
    