Question
I have a perhaps basic question, and I have searched the web. I have a problem reading files. I did manage to read my files by following @Konrad's suggestions, which I appreciate, in: How to get R to read in files from multiple subdirectories under one large directory?
It is a similar problem; however, I have not resolved it.
My problem:
I have a large number of files with the same name ("tempo.out") in different folders. Each tempo.out has 5 columns/headers. They all have the same format, with 1048 lines and 5 columns:
id X Y time temp
library(matrixStats)  # provides rowQuantiles()
setwd("~/Documents/ewat")
dat.files <- list.files(path="./ress",
                        recursive=TRUE,
                        pattern="tempo.out",
                        full.names=TRUE)
readDatFile <- function(f) {
  dat.fl <- read.table(f)
}
data.filesf <- sapply(dat.files, readDatFile)
# I might not have the right syntax in subs5:
subs5 <- sapply(data.filesf,`[`,5)
matr5 <- do.call(rbind, subs5)
probs <- c(0.05,0.1,0.16,0.25,0.5,0.75,0.84,0.90,0.95,0.99)
q <- rowQuantiles(matr5, probs=probs)
print(q)
I want to extract the fifth column (temp) from each of those thousands of files and compute statistics such as quantiles.
I first tried to read all the files under "ress".
That gave no error, but my main problem is that "data.filesf" is not a matrix but a list, and the 5th column is not what I expected. Then the following:
matr5 <- do.call(rbind, subs5)
does not give the required values/results either.
What would be the best way to get the columns into what will become a huge matrix?
Answer 1:
Try
lapply(data.filesf, `[`, , 5)
Hope this will help
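A hedged illustration of the suggestion above: it only behaves as intended if data.filesf is a plain list of data frames, which is what lapply (rather than sapply) would return when reading the files. The two toy data frames and the make_df helper below are made up for the demonstration and stand in for the real tempo.out contents.

```r
# Hypothetical stand-ins for the data frames read from each tempo.out
make_df <- function(offset) {
  data.frame(id = 1:3, X = 0, Y = 0, time = 1:3,
             temp = offset + c(0.5, 1.5, 2.5))
}
# Roughly what lapply(dat.files, read.table, header = TRUE) would produce
data.filesf <- list(a = make_df(10), b = make_df(20))

# `[` with an empty row index and column index 5 returns the fifth column as a vector
subs5 <- lapply(data.filesf, `[`, , 5)

# cbind gives one column per file and one row per line of the file
matr5 <- do.call(cbind, subs5)
dim(matr5)
```

Using cbind instead of rbind keeps one row per line of the file, which is the layout a row-wise quantile calculation expects.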
Answer 2:
Consider extending your defined function, readDatFile, to extract the fifth column, temp, and assign the result directly to a matrix with sapply or vapply (since you know the needed structure ahead of time: a numeric matrix whose column length equals the number of rows, 1048). Then run the needed rowQuantiles:
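library(matrixStats)  # provides rowQuantiles()
setwd("~/Documents/ewat")
dat.files <- list.files(path="./ress",
                        recursive=TRUE,
                        pattern="tempo.out",
                        full.names=TRUE)
# header=TRUE assumes the first line of each file holds the column names (id X Y time temp);
# without it the columns are named V1..V5 and $temp is NULL
readDatFile <- function(f) read.table(f, header=TRUE)$temp  # or read.csv(f)[[5]] if the files are comma-separated
matr5 <- sapply(dat.files, readDatFile, USE.NAMES=FALSE)
# matr5 <- vapply(dat.files, readDatFile, numeric(1048), USE.NAMES=FALSE)
probs <- c(0.05,0.1,0.16,0.25,0.5,0.75,0.84,0.90,0.95,0.99)
q <- rowQuantiles(matr5, probs=probs)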
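For anyone who wants to try the approach without the real data, here is a small self-contained sketch. It writes three made-up tempo.out files into a temporary directory (the ress_demo folder, the run1..run3 subfolders, and the readTemp helper are all hypothetical), assumes each file starts with a header row, and uses base R apply with quantile in place of matrixStats::rowQuantiles so that no extra package is required.

```r
# Build a throwaway directory tree with fake tempo.out files (hypothetical data)
root <- file.path(tempdir(), "ress_demo")
n_rows <- 4
for (k in 1:3) {
  d <- file.path(root, paste0("run", k))
  dir.create(d, recursive = TRUE, showWarnings = FALSE)
  fake <- data.frame(id = 1:n_rows, X = 0, Y = 0, time = 1:n_rows,
                     temp = k * 10 + (1:n_rows) / 2)
  write.table(fake, file.path(d, "tempo.out"), row.names = FALSE, quote = FALSE)
}

dat.files <- list.files(path = root, recursive = TRUE,
                        pattern = "^tempo\\.out$", full.names = TRUE)

# header = TRUE is needed so that the fifth column is actually named "temp"
readTemp <- function(f) read.table(f, header = TRUE)$temp

# One column per file, one row per line; vapply stops with an error
# if any file does not yield exactly n_rows numeric values
matr5 <- vapply(dat.files, readTemp, numeric(n_rows), USE.NAMES = FALSE)

probs <- c(0.05, 0.5, 0.95)
# Row-wise quantiles in base R: one row per line of the file, one column per probability
q <- t(apply(matr5, 1, quantile, probs = probs))
dim(matr5)
dim(q)
```

The vapply form is the stricter choice here: a file with a missing or extra line fails immediately instead of silently turning the result into a list.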
Source: https://stackoverflow.com/questions/45888935/lists-and-matrix-using-sapply