I'm new to R and I'm trying to load 100 or so txt files with three columns (Name, Frequency and Gender) into a single data frame. The files are all named "yob1990.txt" etc.
I would use a workflow something like this, which assumes (1) that the only .txt
files in the specified path are the ones you want read in, and (2) that the only numerals in the filenames are the digits of the years.
    f <- list.files('path/to/files', pattern='\\.txt$', full.names=TRUE)
    # replace path above as required
    d <- do.call(rbind, lapply(f, function(x) {
      d <- read.table(x, header=TRUE) # add sep argument as required
      d$Year <- as.numeric(gsub('\\D', '', basename(x)))
      d
    }))
`f` will be a vector of full paths to the files you need to read in.
`lapply` considers each filename in turn (each element of `f`), temporarily refers to that filename as `x`, and performs everything in between the curly braces.
`gsub('\\D', '', basename(x))` performs a "find and replace"-type operation on `basename(x)` (which is the filename of the currently considered file, excluding the structure of the directory containing the file). We look for all non-digit characters (`'\\D'`), and replace them with nothing (`''`). We add the result of this `gsub` operation (which is the year, assuming no other digits lurk in the filename) to a new `Year` column of the data.frame.
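
To see the year extraction on its own, here is a small sketch. The paths are hypothetical and are only used as strings, so nothing is read from disk. The last line shows why assumption (2) matters: any extra digit in the filename ends up in the result.

```r
# Hypothetical paths, used only to illustrate the string handling
paths <- c("path/to/files/yob1990.txt", "path/to/files/yob2001.txt")

# basename() drops the directory part, leaving just the file name
files <- basename(paths)

# '\\D' matches any non-digit character; replacing each match with ''
# leaves only the digits, which as.numeric() then converts to a number
years <- as.numeric(gsub("\\D", "", files))
years  # 1990 2001

# A stray digit in the name would be glued onto the year
bad <- as.numeric(gsub("\\D", "", "yob1990_v2.txt"))
bad    # 19902
```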
Finally, we return `d`, and once `lapply` has performed this procedure on all files in `f`, we row bind them all together with `do.call(rbind, ...)`.
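
If you want to try the whole workflow before pointing it at your real files, something like the following should work. It is only a sketch: the directory name `yob_demo`, the names and the counts are all made up, and the files are assumed to be whitespace-separated with a header row, which may not match your data.

```r
# Build a throwaway directory with two small files (made-up data)
demo_dir <- file.path(tempdir(), "yob_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines(c("Name Frequency Gender", "Anna 10 F", "Ben 7 M"),
           file.path(demo_dir, "yob1990.txt"))
writeLines(c("Name Frequency Gender", "Cara 4 F", "Dan 3 M"),
           file.path(demo_dir, "yob1991.txt"))

# Same steps as in the answer, applied to the demo directory
f <- list.files(demo_dir, pattern = "\\.txt$", full.names = TRUE)
d <- do.call(rbind, lapply(f, function(x) {
  d <- read.table(x, header = TRUE)                   # one file -> one data frame
  d$Year <- as.numeric(gsub("\\D", "", basename(x)))  # year taken from the file name
  d
}))

d  # four rows, with columns Name, Frequency, Gender and Year
```

One thing to watch with `read.table`: a column that contains only `F` (or only `T`) in a given file is read as logical, so passing `colClasses` explicitly is safer if some of your files might contain a single gender.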