I'm trying to create separate .csv files for each group in a data frame grouped with dplyr's group_by function. So far I have something like
by_cyl <- group_by(mtcars, cyl)
do(by_cyl, write_csv(., "test.csv"))
As expected, this writes a single .csv file with only the data from the last group. How can I modify this to write multiple .csv files, each with filenames that include cyl?
You can wrap the CSV write in a custom function, as follows. Note that the function has to return a data.frame; otherwise do() fails with the error "Error: Results are not data frames at positions".
This writes three CSV files named "mtcars_cyl_4.csv", "mtcars_cyl_6.csv" and "mtcars_cyl_8.csv":
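library(dplyr)
# Write one group to its own file, then return the group so do() gets a data frame
customFun <- function(DF) {
  write.csv(DF, paste0("mtcars_cyl_", unique(DF$cyl), ".csv"))
  return(DF)
}
mtcars %>%
  group_by(cyl) %>%
  do(customFun(.))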
The following also works without the custom function, because write_csv() returns its input invisibly, so do() still receives a data frame:
library(dplyr)
library(readr)
group_by(mtcars, cyl) %>%
do(write_csv(., paste0(unique(.$cyl), "test.csv")))
With dplyr 0.8.0 or later, this can be done with group_walk():
library(dplyr)
library(readr)
mtcars %>%
group_by(cyl) %>%
group_walk(~ write_csv(.x, paste0(.y$cyl, "test.csv")))
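One thing to be aware of with group_walk(): by default .x does not contain the grouping column, so the files written above have no cyl column. A minimal sketch of keeping it, assuming a dplyr version where the argument is spelled .keep (I believe some 0.8.x releases called it keep) and using a hypothetical out_dir so nothing lands in the working directory:

```r
library(dplyr)
library(readr)

# Hypothetical output location (not part of the original answer)
out_dir <- tempdir()

mtcars %>%
  group_by(cyl) %>%
  group_walk(
    ~ write_csv(.x, file.path(out_dir, paste0(.y$cyl, "test.csv"))),
    .keep = TRUE  # also pass the grouping column (cyl) to .x
  )

# Read one file back to check that cyl was written as a column
cols <- names(read.csv(file.path(out_dir, "4test.csv")))
"cyl" %in% cols
```

Dropping .keep = TRUE gives the original behaviour, where cyl appears only in the file name.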
If you are willing to use data.table, there is a slightly less clunky way of doing it.
require(data.table)
# Because this is a built in table we have to make a copy first
mtcars <- mtcars
setDT(mtcars) # convert the data into a data.table
mtcars[, write.csv(.SD, paste0("mtcars_cyl_", .BY, ".csv")), by = cyl]
Note that the resulting files will not have a column for cyl (it would be redundant, since the value is stored in the file name, but you may want to keep it for other reasons).
If you want cyl to be included in the output as a column, you can use:
mtcars[, write.csv(c(.BY,.SD), paste0("mtcars_cyl_", .BY, ".csv")), by=cyl]
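To check what actually ends up in the files, here is a small sketch of the same data.table approach that reads one file back. The names dt and out_dir are my own (not from the answer above), and row.names = FALSE is an added choice to avoid an extra index column:

```r
library(data.table)

dt <- as.data.table(mtcars)  # a copy; the built-in mtcars is left untouched
out_dir <- tempdir()         # hypothetical output location

# .SD holds the group's rows without the grouping column and .BY holds the
# group value; c(.BY, .SD) puts cyl back in front before writing.
dt[, write.csv(c(.BY, .SD),
               file.path(out_dir, paste0("mtcars_cyl_", .BY$cyl, ".csv")),
               row.names = FALSE),
   by = cyl]

# Read one group back and inspect its columns
back <- fread(file.path(out_dir, "mtcars_cyl_4.csv"))
names(back)
```

Using as.data.table() instead of setDT() avoids having to reassign mtcars first, at the cost of one extra copy.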
Source: https://stackoverflow.com/questions/41233173/how-can-i-write-dplyr-groups-to-separate-files