Question
I want to calculate the mean of a column across all the CSV files in one directory, but when I run the function it gives me the error
"Error in numeric(nc) : invalid 'length' argument".
I believe the CSV files contain NA values, but that shouldn't affect counting the number of columns, should it?
pollutantmean <- function(directory, pollutant, id = 1:332, removeNA = TRUE) {
  nc <- ncol(pollutant)
  means <- numeric(nc)
  for (i in 1:nc) {
    means[i] <- mean(pollutant[, i], na.rm = removeNA)
  }
  means
}
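One likely cause, assuming the function was called with a column name such as "sulfate" (the call itself is not shown in the question): ncol() on a character string returns NULL, and numeric(NULL) then fails with this message. The NA values in the files are not involved at this point. A minimal sketch of that assumption:

```r
# Assumption: pollutant is a column name passed as a string, e.g. "sulfate".
pollutant <- "sulfate"

nc <- ncol(pollutant)   # a plain character vector has no dimensions
is.null(nc)             # TRUE

# numeric() needs a non-negative number as its length, so NULL is rejected.
err <- tryCatch(numeric(nc), error = function(e) conditionMessage(e))
print(err)              # the error message raised by numeric(NULL)
```

If that is what happened, the function needs to read the files in directory first and only then look up the pollutant column in the resulting data frame.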
Here is my updated version. I had R read all the .csv files into a single list with "lapply". The CSV files are named consistently, from 001 up to 1xxx and so on, so I set the id to run from 001 upward.
files <- list.files(pattern = ".csv")
directory <- lapply(files, read.csv)
pollutantmean <- function(directory, pollutant, id = 1:332, removeNA = TRUE) {
  nc <- ncol(pollutant)
  means <- numeric(nc, na.rm = removeNA)
  for (i in 1:nc) {
    means[i] <- mean(pollutant[, i], na.rm = removeNA)
  }
  means
}
I tried to calculate the mean of the pollutant across the whole directory, with all the CSV files combined. I intended to remove all the missing values with "na.rm = removeNA", but it gives me this error: Error in numeric(nc, na.rm = removeNA) : unused argument (na.rm = removeNA)
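The second error comes from passing na.rm to numeric(), which only accepts a length. The na.rm argument belongs to mean() (or colMeans()). A small sketch with a made-up data frame; the column names and values here are hypothetical:

```r
# Hypothetical data standing in for one monitor file.
dat <- data.frame(sulfate = c(1, NA, 3), nitrate = c(2, 4, NA))

means <- numeric(ncol(dat))                  # numeric() takes only a length
for (i in seq_len(ncol(dat))) {
  means[i] <- mean(dat[, i], na.rm = TRUE)   # na.rm is an argument of mean()
}
names(means) <- names(dat)

# colMeans() computes the same per-column means in one call.
col_means <- colMeans(dat, na.rm = TRUE)
```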
Answer 1:
pollutantmean <- function(directory, pollutant, id = 1:332) {
  # full paths of the CSV files sitting in the directory, in sorted order
  files_list <- list.files(directory, full.names = TRUE)
  dat <- data.frame()  # start with an empty data frame
  for (i in id) {
    dat <- rbind(dat, read.csv(files_list[i]))  # append the data from each selected file
  }
  # na.rm = TRUE drops the missing values of this pollutant only;
  # filtering with complete.cases(dat) would also drop rows where another column is NA
  mean(dat[[pollutant]], na.rm = TRUE)
}
Here is my answer
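To try the idea without the real data, the sketch below writes three small CSV files into a temporary directory and computes the mean with lapply() instead of growing a data frame in a loop. The file paths are built with sprintf("%03d.csv", id), which assumes the 001.csv, 002.csv naming described in the question. The helper name pollutant_mean, the column names and the values are made up for illustration.

```r
# Hypothetical variant: read only the requested column from each file.
pollutant_mean <- function(directory, pollutant, id = 1:332) {
  # Build the paths from the ids rather than relying on list.files() order.
  paths <- file.path(directory, sprintf("%03d.csv", id))
  values <- unlist(lapply(paths, function(p) read.csv(p)[[pollutant]]))
  mean(values, na.rm = TRUE)
}

# Made-up demo data in a temporary directory.
demo_dir <- file.path(tempdir(), "specdata_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(sulfate = c(1, NA), nitrate = c(NA, 5)),
          file.path(demo_dir, "001.csv"), row.names = FALSE)
write.csv(data.frame(sulfate = c(3, 5), nitrate = c(1, NA)),
          file.path(demo_dir, "002.csv"), row.names = FALSE)
write.csv(data.frame(sulfate = c(NA, 9), nitrate = c(2, 2)),
          file.path(demo_dir, "003.csv"), row.names = FALSE)

result_all <- pollutant_mean(demo_dir, "sulfate", id = 1:3)  # mean of 1, 3, 5, 9
result_two <- pollutant_mean(demo_dir, "sulfate", id = 1:2)  # mean of 1, 3, 5
```

Reading only the needed column keeps memory use low, and building the paths from id means a stray non-CSV file in the directory cannot shift the indices.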
Source: https://stackoverflow.com/questions/38159024/invalid-length-argument-error