Load multiple txt files to a single data frame and retain name as a column in R

时光说笑 2021-01-28 16:24

I'm new to R and I'm trying to load 100 or so txt files, each with three columns (Name, Frequency and Gender), into a single data frame. The files are all named "yob1990.txt" etc.

2 Answers
  •  无人共我
    2021-01-28 17:09

    I would use a workflow something like this, which assumes (1) that the only .txt files in the specified path are the ones you want read in, and (2) that the only numerals in the filenames are the digits of the years.

    # Full paths of all .txt files in the directory (replace the path as required)
    f <- list.files('path/to/files', pattern='\\.txt$', full.names=TRUE)
    d <- do.call(rbind, lapply(f, function(x) {
      d <- read.table(x, header=TRUE) # add sep argument as required
      # Strip every non-digit from the filename to recover the year
      d$Year <- as.numeric(gsub('\\D', '', basename(x)))
      d
    }))
    

    f will be a vector of full paths to the files you need to read in.

    lapply considers each filename in turn (each element of f), temporarily refers to that filename as x, and performs everything in between the curly braces.

    gsub('\\D', '', basename(x)) performs a "find and replace"-type operation on basename(x), which is the filename of the file currently being considered, with the directory path stripped. We look for all non-digit characters ('\\D') and replace them with nothing (''). We add the result of this gsub operation (which is the year, assuming no other digits lurk in the filename) to a new Year column of the data.frame.
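
    To see what the gsub step does on its own, here is a small sketch. The paths below are made up for illustration; only the "yob1990.txt" naming comes from the question. The last line shows why assumption (2) matters: any extra digit in the filename ends up in the result.

    ```r
    # Hypothetical paths following the naming pattern from the question
    files <- c("path/to/files/yob1990.txt", "path/to/files/yob1991.txt")

    # basename() drops the directory part; gsub() then deletes every non-digit
    years <- as.numeric(gsub("\\D", "", basename(files)))
    years
    # [1] 1990 1991

    # An extra digit in the filename (hypothetical name) breaks the assumption
    bad <- as.numeric(gsub("\\D", "", "yob1990_v2.txt"))
    bad
    # [1] 19902
    ```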

    Finally, we return d, and once lapply has performed this procedure on all files in f, we row bind them all together with do.call(rbind, ...).
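
    If you want to try the whole workflow before pointing it at your real files, something like the following should work. It is only a sketch: the sample names and counts are invented, the files are written to a temporary directory, and they are assumed to be whitespace-separated with a header row (adjust header and sep to match your actual files).

    ```r
    # Write two small sample files to a temporary directory (made-up data)
    tmp <- file.path(tempdir(), "yob_demo")
    dir.create(tmp, showWarnings = FALSE)
    write.table(
      data.frame(Name = c("Anna", "Ben"), Frequency = c(10, 7), Gender = c("F", "M")),
      file.path(tmp, "yob1990.txt"), row.names = FALSE, quote = FALSE
    )
    write.table(
      data.frame(Name = c("Cara", "Dan"), Frequency = c(5, 3), Gender = c("F", "M")),
      file.path(tmp, "yob1991.txt"), row.names = FALSE, quote = FALSE
    )

    # Same workflow as above, pointed at the temporary directory
    f <- list.files(tmp, pattern = "\\.txt$", full.names = TRUE)
    d <- do.call(rbind, lapply(f, function(x) {
      d <- read.table(x, header = TRUE)
      d$Year <- as.numeric(gsub("\\D", "", basename(x)))
      d
    }))

    # d now has one row per name per file, plus the Year taken from the filename
    d
    ```

    Each sample file deliberately contains both F and M. A column holding nothing but "F" would be read by read.table as logical FALSE unless you set colClasses.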
