Question
I have read several other posts about how to import CSV files with read.csv while skipping specific columns. However, all the examples I found had very few columns, so it was easy to do something like:
columnHeaders <- c("column1", "column2", "column_to_skip")
columnClasses <- c("numeric", "numeric", "NULL")
data <- read.csv(fileCSV, header = FALSE, sep = ",",
                 col.names = columnHeaders, colClasses = columnClasses)
I have 201 columns, without column labels. I would like to skip the last column. How would it be possible to do this without naming all the other columns to keep? Many thanks.
Answer 1:
It's a bit hacky, but I usually read in a small number of rows of the dataset I want, then use sapply(..., class) to find the column types and set the last one to "NULL".
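# Read only the first 100 rows to infer the column types
data <- read.table("test.csv", sep = ",", nrows = 100)
# Class of each column, as guessed from the sample
colClasses <- sapply(data, class)
# "NULL" tells read.csv()/read.table() to skip that column
colClasses[length(colClasses)] <- "NULL"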
Then you can pass this colClasses vector to read.csv() through its colClasses argument.
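Put together, the steps above could look like the following minimal sketch. The temporary file, its contents, and the names csv_path, sample_rows, col_classes and full_data are made up for illustration and are not from the original answer. Note that the types are guessed from the sample only: if a column looks like "integer" in the first rows but contains decimals later, the full read will fail, so a larger nrows may be needed.

```r
# Hypothetical sample file: 4 columns, no header, last column unwanted.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("1,2.5,a,x", "2,3.5,b,y", "3,4.5,c,z"), csv_path)

# Step 1: read a few rows only, to learn the column types.
sample_rows <- read.csv(csv_path, header = FALSE, nrows = 100)

# unname() makes the classes apply by position rather than by column name.
col_classes <- unname(sapply(sample_rows, class))

# Step 2: mark the last column as "NULL" so that it is skipped.
col_classes[length(col_classes)] <- "NULL"

# Step 3: read the whole file with the adjusted classes.
full_data <- read.csv(csv_path, header = FALSE, colClasses = col_classes)
```

If the number of columns is already known (201 in the question), the sampling step could be skipped by passing colClasses = c(rep(NA, 200), "NULL"), since NA asks read.csv() to detect the type of that column automatically.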
Answer 2:
You can just read in all the data and then eliminate the offenders afterwards.
data <- read.csv("../CAASPP_clustering/ca2016_1_csv_v3.zip")
data_trimmed <- data[,1:(ncol(data)-1)]
If you prefer to screen the classes more programmatically then you could do something like this:
class_list <- lapply(data, class)
chosen_cols <- names(class_list[class_list != "NULL"])
data_trimmed <- data[chosen_cols]
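One caveat about the snippet above: once a data frame has been loaded, none of its columns has class "NULL", so that comparison keeps every column. The sketch below shows one way class-based screening might work instead, under the assumption (not from the original answer) that the unwanted column is entirely empty, which read.csv() typically reads as class "logical". The sample file and the names csv_path, col_class and unwanted_class are hypothetical.

```r
# Hypothetical sample: three columns, the last one entirely empty
# (every row ends with a trailing comma).
csv_path <- tempfile(fileext = ".csv")
writeLines(c("1,a,", "2,b,", "3,c,"), csv_path)
data <- read.csv(csv_path, header = FALSE)

# First class of each column, as a named character vector.
col_class <- vapply(data, function(col) class(col)[1], character(1))

# Assumption for illustration: "logical" is the class to screen out.
unwanted_class <- "logical"
chosen_cols <- names(col_class)[col_class != unwanted_class]
data_trimmed <- data[chosen_cols]
```

This would also drop any genuine logical (TRUE/FALSE) column, so the test should be adapted to the data at hand.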
Source: https://stackoverflow.com/questions/47313390/read-csv-and-skip-last-column-in-r