Issue with pasting 5 columns groups in R

后端 未结 2 1092
梦如初夏
梦如初夏 2021-01-24 00:28

I have a data table like below. All columns are in characters.

Table:

V29  V30  V31  V32  V33  V34 V35 V36 V37 V38 .... V69
044  N    005  E    026  044 N   006         


        
2条回答
  •  孤独总比滥情好
    2021-01-24 00:57

    Expanding your data a little bit:

    x <- read.table(stringsAsFactors=FALSE, header=TRUE, as.is=TRUE, colClasses="character", text="
    V29  V30  V31  V32  V33  V34 V35 V36 V37 V38    V29a V30a V31a V32a V33a V34a V35a V36a V37a V38a
    044  N    005  E    026  044 N   006 E   011    044  N    005  E    026  044  N    006  E    011 
    044  N    005  E    026  044 N   006 E   011    044  N    005  E    026  044  N    006  E    011 ")
    

    The answer:

    sapply(split.default(x, (seq_len(ncol(x))-1) %/% 5),
           function(s) paste(apply(s, 1, paste0, collapse = ""), collapse = "-"))
    #                         0                         1                         2 
    # "044N005E026-044N005E026" "044N006E011-044N006E011" "044N005E026-044N005E026" 
    #                         3 
    # "044N006E011-044N006E011" 
    

    This can easily be assigned to a column of the same frame.

    Explanation:

    • to break a frame up by 5 columns, split comes to mind, but the default use of split(...) will use split.data.frame which splits by row, not column, so we use split.default (which works by column). From there, you can see how we're grouping things:

      (seq_len(ncol(x))-1) %/% 5
      #  [1] 0 0 0 0 0 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3
      
    • For each of these groups, we get a 5-column frame:

      split.default(x, (seq_len(ncol(x))-1) %/% 5)
      # $`0`
      #   V29 V30 V31 V32 V33
      # 1  44   N   5   E  26
      # 2  44   N   5   E  26
      # $`1`
      #   V34 V35 V36 V37 V38
      # 1  44   N   6   E  11
      # 2  44   N   6   E  11
      ### truncated for brevity
      

      So we use sapply to do something to each of these frames, returning it (in this case) simplified. (If we specify simplify=FALSE or if not all of them are the same length, then it will be returned unsimplified, as a list instead of a vector).

    • The function we apply to each frame is apply(., 1, paste0, collapse0) which will return a vector of the 5-column pastes, something like:

      apply(s, 1, paste0, collapse = "")
      # $`0`
      # [1] ""044N005E026" "044N005E026""
      

      Because we want them combined, we surround it as paste(apply(...), collapse = "-").

提交回复
热议问题