How can I manipulate a very large list?

梦毁少年i 2021-01-23 22:38

I have over 10,000 files. I first set my working directory to the folder that contains them.

Then I make a link to all the files with the .txt extension like this:

1 answer
  • 2021-01-23 22:58

    Here is one approach. It is meant to get at the core of your issue; you can rename the columns and make any other cosmetic changes afterwards. I wrote a helper function, add_rows, that takes three arguments: a data frame, the number of rows to add, and the value to fill them with.

    library(data.table)
    #version 1.10+
    
    #Helper function to add extra rows
    add_rows <- function(DT, n, fill='') {
      #column names must match the ones selected below ("myfile", "myname")
      rbindlist(list(DT, data.table(myfile=rep(fill,n), myname=rep(fill,n))))
    }
    
    #Keep only the myfile and myname columns (this drops the first column)
    lst2 <- lapply(my.list, function(x) x[, c("myfile", "myname")]) #if using a version older than 1.9.8, x[, -1, with=FALSE]
    
    #data table with most rows
    len <- max(sapply(lst2, nrow))
    
    #Add rows
    lst3 <- lapply(lst2, function(x) add_rows(x, len-nrow(x)))
    
    #Order rows
    #parentheses are escaped with backslashes because unescaped they have special meaning in regular expressions
    tofind <- c("13C\\(6\\)15N\\(4\\)", "13C\\(6\\)")
    lst4 <- lapply(lst3, function(DT) {
      pattern <- paste0(tofind, collapse="|")
      moveup <- DT[, grep(pattern, myfile)]
      myorder <- c(moveup, setdiff(seq_len(nrow(DT)), moveup))
      DT[myorder]
    })
    
    #Combine data
    newdf <- do.call('cbind', lst4)
    
    #Update names
    #append the table index to each pair of columns: myfile1, myname1, myfile2, myname2, ...
    setnames(newdf, paste0(names(newdf), rep(seq_along(lst4), each=2)))
    
    newdf
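    
    To check the steps above on something small before running them on 10,000 files, here is a minimal sketch on two made-up tables. The objects a and b and their contents are hypothetical stand-ins for the real files, and the column names myfile and myname are assumed to match yours.
    
    ```r
    library(data.table)
    
    # Hypothetical toy inputs standing in for two of the files
    a <- data.table(id = 1:3,
                    myfile = c("x", "13C(6)", "y"),
                    myname = c("n1", "n2", "n3"))
    b <- data.table(id = 1:2,
                    myfile = c("13C(6)15N(4)", "z"),
                    myname = c("m1", "m2"))
    my.list <- list(a, b)
    
    # Helper: append n filler rows to a two-column table
    add_rows <- function(DT, n, fill = "") {
      rbindlist(list(DT, data.table(myfile = rep(fill, n), myname = rep(fill, n))))
    }
    
    # Keep the two columns of interest, then pad every table to the longest one
    lst2 <- lapply(my.list, function(x) x[, .(myfile, myname)])
    len  <- max(vapply(lst2, nrow, integer(1)))
    lst3 <- lapply(lst2, function(x) add_rows(x, len - nrow(x)))
    
    # Move rows whose myfile matches either label to the top
    pattern <- paste0(c("13C\\(6\\)15N\\(4\\)", "13C\\(6\\)"), collapse = "|")
    lst4 <- lapply(lst3, function(DT) {
      moveup <- grep(pattern, DT$myfile)
      DT[c(moveup, setdiff(seq_len(nrow(DT)), moveup))]
    })
    
    # Bind side by side and number each pair of columns
    newdf <- do.call(cbind, lst4)
    setnames(newdf, paste0(names(newdf), rep(seq_along(lst4), each = 2)))
    
    names(newdf)  # "myfile1" "myname1" "myfile2" "myname2"
    ```
    
    Every table ends up with the same number of rows, so the side-by-side bind lines up, and the shorter table is padded with empty strings at the bottom.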
    