I'm going to assume that you've placed all your .csv files in a single directory (and that there's nothing else in this directory). I'll also assume that each .csv really does have the same structure (the same number of columns, in the same order). I would begin by generating a list of the file names:
myCSVs <- list.files("path/to/directory")
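If the directory might end up containing other files, one option is to restrict the listing to .csv files. This is a minor variation, not required if the directory really only holds your data:

# Optional: list only files ending in .csv
myCSVs <- list.files("path/to/directory", pattern = "\\.csv$")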
Then I would 'loop' over the list of file names using lapply, reading each file into a data frame using read.csv:
setwd("path/to/directory")
#This function just reads in the file and
# appends a column with the K val taken from the file
# name. You may need to tinker with the particulars here.
myFun <- function(fn){
tmp <- read.csv(fn)
tmp$K <- strsplit(fn,".",fixed = TRUE)[[1]][1]
tmp
}
dataList <- lapply(myCSVs, FUN = myFun,...)
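To see what the K column will hold: strsplit splits the file name on the literal "." and the function keeps everything before the first dot. For example, assuming a file named 5.csv (a hypothetical name):

strsplit("5.csv", ".", fixed = TRUE)[[1]][1]
# [1] "5"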
Depending on the structure of your .csv files, you may need to pass some additional arguments to read.csv (e.g. header or sep). Finally, I would combine this list of data frames into a single data frame:
myData <- do.call(rbind, dataList)
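One thing to keep in mind: K will be stored as character strings taken from the file names. If the names are purely numeric, e.g. 5.csv, 10.csv (an assumption about your naming scheme), you may want a numeric version for plotting:

# Assumes the file names (and hence K) are purely numeric
myData$K <- as.numeric(myData$K)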
Then you should have all your data in a single data frame, myData, that you can pass to ggplot.
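For instance, assuming your files contain columns named diff and count (placeholders; substitute your actual column names), a first plot might look something like:

library(ggplot2)
# A rough sketch; colouring by K distinguishes the source files
ggplot(myData, aes(x = diff, y = count, colour = K)) +
  geom_point()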
As for the statistical aspect of your question, it's a little difficult to offer an opinion without concrete examples of your data. Once you've figured out the programming part, you could ask a separate question that provides some sample data (either here or on stats.stackexchange.com), and folks will be able to suggest some visualization or analysis techniques that may help.