Getting column name which holds a max value within a row of a matrix holding a separate max value within an array

遥遥无期 2021-02-13 19:04

For instance, given:

dim1 <- c("P","PO","C","T")
dim2 <- c("LL","RR","R","Y")         


        
3 answers
  • 2021-02-13 19:43

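    Here's a simple way to solve it:

    mxCol=function(df, colIni, colFim){ #201609
      # colIni/colFim: first and last column to consider (default: all columns)
      if(missing(colIni)) colIni=1
      if(missing(colFim)) colFim=ncol(df)
      if(colIni>=colFim) { print('colIni>=ColFim'); return(NULL)}
      # name of the column holding each row's maximum
      dfm=cbind(mxC=apply(df[colIni:colFim], 1, function(x) colnames(df)[which.max(x)+(colIni-1)])
               ,df)
      # look the maximum up by that name (x[1] is the column name)
      dfm=cbind(mxVal=as.numeric(apply(dfm,1,function(x) x[x[1]]))
               ,dfm)
      return(dfm)
    }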
    
  • 2021-02-13 19:46

    This should do it - if I understand correctly:

    Q <- array(1:48, c(4,4,3), dimnames=list(
      c("P","PO","C","T"), c("LL","RR","R","Y"), c("Jerry1", "Jerry2", "Jerry3")))
    
    column_ref <- names(which.max(Q[3,1:3, which.max(Q[3,4,])]))[1] # "R"
    

    Some explanation:

    which.max(Q[3,4,]) # returns the index of the "Jerry3" slice (3)
    which.max(Q[3,1:3, 3]) # returns the index of the "R" column (3)
    

    ...and then names returns the name of the index ("R").
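
    To reuse the same two-step lookup for other rows or reference columns, it could be wrapped in a small helper. This is only a sketch: the function name `max_col_in_best_slice` and its arguments are made up for illustration, and it assumes the array has dimnames on all three dimensions and that `ref_col` is given as a column name.

```r
# Hypothetical helper (not from the original answer): for one row, find the
# slice where the reference column is largest, then name the largest of the
# remaining columns within that slice.
max_col_in_best_slice <- function(arr, row, ref_col) {
  best_slice <- which.max(arr[row, ref_col, ])           # index of the winning slice
  other_cols <- setdiff(dimnames(arr)[[2]], ref_col)     # every column except the reference
  col_name   <- names(which.max(arr[row, other_cols, best_slice]))
  list(slice = dimnames(arr)[[3]][best_slice], column = col_name)
}

# Same example array as in the answer
Q <- array(1:48, c(4, 4, 3), dimnames = list(
  c("P", "PO", "C", "T"), c("LL", "RR", "R", "Y"), c("Jerry1", "Jerry2", "Jerry3")))

res <- max_col_in_best_slice(Q, "C", "Y")
print(res)
```

    For the example array this gives the same column as the one-liner, since row "C" is row 3 and "Y" is column 4.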

  • 2021-02-13 19:49

    This post helped me solve a more general data.frame problem.
    I have repeated measures for groups, G1 and G2.

    > str(df)
    'data.frame':   6 obs. of  15 variables:
    $ G1       : num  0 0 2 2 8 8
    $ G2       : logi  FALSE TRUE FALSE TRUE FALSE TRUE
    $ e.10.100 : num  26.41 -11.71 27.78 3.17 26.07 ...
    $ e.10.250 : num  27.27 -12.79 29.16 3.19 26.91 ...
    $ e.20.100 : num  29.96 -12.19 26.19 3.44 27.32 ...
    $ e.20.100d: num  26.42 -13.16 28.26 4.18 25.43 ...
    $ e.20.200 : num  24.244 -18.364 29.047 0.553 25.851 ...
    $ e.20.50  : num  26.55 -13.28 29.65 4.34 27.26 ...
    $ e.20.500 : num  27.94 -13.92 27.59 2.47 25.54 ...
    $ e.20.500d: num  24.4 -15.63 26.78 4.86 25.39 ...
    $ e.30.100d: num  26.543 -15.698 31.849 0.572 29.484 ...
    $ e.30.250 : num  26.776 -16.532 28.961 0.813 25.407 ...
    $ e.50.100 : num  25.995 -14.249 28.697 0.803 27.852 ...
    $ e.50.100d: num  26.1 -12.7 27.1 2.5 27.4 ...
    $ e.50.500 : num  28.78 -9.39 25.77 2.73 23.73 ...
    

    I need to know which measure (column) has the best (max) result in each row, ignoring the grouping columns.
    I ended up with this call:

    apply(df[colIni:colFim], 1, function(x) colnames(df)[which.max(x)+(colIni-1)])
    #colIni: first column to consider; colFim: last column to consider
    

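    Once the column name is stored in the first column (the data frame dfm in the function below), another tiny call gets the max value by looking that name up in each row: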

    apply(dfm,1,function(x) x[x[1]])
    

    And here is the function that solves similar problems, returning both the column name and the max value:

    mxCol=function(df, colIni, colFim){ #201609
      if(missing(colIni)) colIni=1
      if(missing(colFim)) colFim=ncol(df)
      if(colIni>=colFim) { print('colIni>=ColFim'); return(NULL)}
      dfm=cbind(mxCol=apply(df[colIni:colFim], 1, function(x) colnames(df)[which.max(x)+(colIni-1)])
               ,df)
      dfm=cbind(mxVal=as.numeric(apply(dfm,1,function(x) x[x[1]]))
               ,dfm)
      return(dfm)
    }
    

    In this case, starting at column 3 to skip G1 and G2:

    > mxCol(df,3)[1:11]
       mxVal     mxCol G1    G2 e.10.100 e.10.250 e.20.100 e.20.100d e.20.200 e.20.50 e.20.500
    1 29.958  e.20.100  0 FALSE   26.408   27.268   29.958    26.418   24.244  26.553   27.942
    2 -9.395  e.50.500  0  TRUE  -11.708  -12.789  -12.189   -13.162  -18.364 -13.284  -13.923
    3 31.849 e.30.100d  2 FALSE   27.782   29.158   26.190    28.257   29.047  29.650   27.586
    4  4.862 e.20.500d  2  TRUE    3.175    3.190    3.439     4.182    0.553   4.337    2.467
    5 29.484 e.30.100d  8 FALSE   26.069   26.909   27.319    25.430   25.851  27.262   25.535
    6 -9.962  e.30.250  8  TRUE  -11.362  -12.432  -15.960   -11.760  -12.832 -12.771  -12.810
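
    As a possible alternative, base R's `max.col()` returns the position of each row's maximum directly, which avoids the round trip through character values that `apply()` makes on a mixed data frame. The sketch below is not the function from this answer: the name `max_col_summary`, its `first`/`last` arguments and the toy data are all made up for illustration, and it assumes the selected columns are numeric.

```r
# Sketch of a max.col()-based variant (hypothetical names throughout).
max_col_summary <- function(df, first = 1, last = ncol(df)) {
  stopifnot(first <= last)
  m   <- as.matrix(df[first:last])             # measure columns as a numeric matrix
  idx <- max.col(m, ties.method = "first")     # column position of each row's maximum
  data.frame(
    mxVal = m[cbind(seq_len(nrow(m)), idx)],   # matrix indexing: one value per row
    mxCol = colnames(m)[idx],
    df,
    check.names = FALSE,
    stringsAsFactors = FALSE
  )
}

# Toy data invented for this example (not the data from the post)
toy <- data.frame(G1 = c(0, 2), G2 = c(FALSE, TRUE),
                  a = c(1.5, -3), b = c(4.25, -1), c = c(2, -0.5))
out <- max_col_summary(toy, first = 3)
print(out)
```

    Setting `ties.method = "first"` matters here because `max.col()` breaks ties at random by default, whereas `which.max()` always takes the first maximum.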
    