Question
I want to import multiple TSV files (yes: TSV) into R.
Reading a single file and selecting specific columns works well with:
data00<-read.csv(file = '/Volumes/2018/06_abteilungen/bi/analytics/tools/adobe/adobe_analytics/adobe_analytics_api_rohdaten/api_via_data_feed_auf_ftp/beispiel_datenexporte_data_feed/01sssamsung4de_20180501-000000.tsv',
sep ="\t",
fill = TRUE,
quote='',
header = FALSE
)[ ,c(287, 288, 289, 290, 291, 292, 293, 304, 370, 661, 662, 812, 813, 994, 995, 1002)]
Now I want to import multiple files and combine them into a single data frame:
setwd('/Volumes/2018/06_abteilungen/bi/analytics/tools/adobe/adobe_analytics/adobe_analytics_api_rohdaten/api_via_data_feed_auf_ftp/beispiel_datenexporte_data_feed/import_r')
temp <- list.files(pattern="*.tsv")
test_data <- lapply(temp, read.csv,
sep ="\t",
fill = TRUE,
quote='',
header = FALSE
)[ ,c(287, 288, 289, 290, 291, 292, 293, 304, 370, 661, 662, 812, 813, 994, 995, 1002)]
The last call throws an error and does not work: Fehler in lapply(temp, read.csv, sep = "\t", fill = TRUE, quote = "", header = FALSE)[, : falsche Anzahl von Dimensionen (translation: incorrect number of dimensions)
When I take all columns, it works:
test_data <- lapply(temp, read.csv,
sep ="\t",
fill = TRUE,
quote='',
header = FALSE
)
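The difference between the two calls comes down to what lapply returns. A minimal sketch with two made-up in-memory data frames (the names frames, result, err and first_subset are illustrative and not from the original post):

```r
# Two small made-up data frames standing in for the imported files
frames <- list(
  data.frame(a = 1:2, b = 3:4, c = 5:6),
  data.frame(a = 7:8, b = 9:10, c = 11:12)
)

# lapply() always returns a list, and a plain list has no dim attribute
result <- lapply(frames, identity)
class(result)         # "list"
is.null(dim(result))  # TRUE

# Two-dimensional indexing on the list therefore fails;
# tryCatch() captures the error message instead of stopping the script
err <- tryCatch(result[, c(1, 3)], error = function(e) conditionMessage(e))

# Each element, however, is a data frame and can be indexed by column
first_subset <- result[[1]][, c(1, 3)]
```

So the version without the trailing [ , c(...)] works because nothing tries to treat the list as a two-dimensional object.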
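Answer 1:
You are indexing the list of data frames returned by lapply, not the data frames themselves. A list has no second dimension, hence the error. Move the column selection inside the function that reads each file: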
test_data <- lapply(temp,function(x){
read.csv(file = x,
sep ="\t",
fill = TRUE,
quote='',
header = FALSE
)[ ,c(287, 288, 289, 290, 291, 292, 293, 304, 370, 661, 662, 812, 813,994, 995, 1002)]
}
)
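Note that the call above still returns a list with one data frame per file. To end up with the single data frame the question asks for, the list can be stacked afterwards. A minimal sketch, assuming every file yields the same selected columns; the temporary directory, the file contents and the names tsv_dir, files, keep_cols, frames and combined are made up for illustration:

```r
# Write two tiny tab-separated files to a temporary directory (made-up data)
tsv_dir <- tempfile("tsv_demo")
dir.create(tsv_dir)
writeLines(c("a\tb\tc", "d\te\tf"), file.path(tsv_dir, "file1.tsv"))
writeLines(c("g\th\ti"), file.path(tsv_dir, "file2.tsv"))

# pattern is a regular expression; "\\.tsv$" matches names ending in .tsv
files <- list.files(tsv_dir, pattern = "\\.tsv$", full.names = TRUE)
keep_cols <- c(1, 3)  # placeholder for the real column positions

# Subset inside the function so each data frame is indexed, not the list
frames <- lapply(files, function(x) {
  read.csv(file = x, sep = "\t", fill = TRUE, quote = "",
           header = FALSE)[, keep_cols]
})

# Stack the per-file data frames into one data frame
combined <- do.call(rbind, frames)
```

Using full.names = TRUE avoids the need for setwd, since the paths returned by list.files can then be read from any working directory.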
Answer 2:
Hard to say without sample data, but I believe you have to combine ('merge') the imported list of data frames first and select the columns afterwards:
dplyr solution:
library(dplyr)
test_data <- lapply(temp, read.csv,
sep ="\t",
fill = TRUE,
quote='',
header = FALSE
) %>%
bind_rows() %>%
select( c(287, 288, 289, 290, 291, 292, 293, 304, 370, 661, 662, 812, 813, 994, 995, 1002) )
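Since no sample data was provided, here is a small self-contained sketch of the same pipeline on made-up data frames. It assumes dplyr is installed; the names frames and combined and the column positions c(1, 3) are placeholders, not the real ones:

```r
library(dplyr)

# Made-up stand-ins for the data frames returned by lapply(temp, read.csv, ...)
frames <- list(
  data.frame(V1 = c("a", "d"), V2 = c("b", "e"), V3 = c("c", "f"),
             stringsAsFactors = FALSE),
  data.frame(V1 = "g", V2 = "h", V3 = "i", stringsAsFactors = FALSE)
)

combined <- frames %>%
  bind_rows() %>%    # stack the list into one data frame
  select(c(1, 3))    # then pick columns by position
```

One caveat: bind_rows matches columns by name and expects compatible column types, so if the same column is parsed as a number in one file and as text in another, the combination step may fail. Selecting the columns per file first, as in the first answer, also avoids holding all columns of all files in memory at once.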
Source: https://stackoverflow.com/questions/50833931/r-import-multiple-csv-files