Specifying colClasses in read.csv

独厮守ぢ 2020-11-27 10:37

I am trying to specify the colClasses option in the read.csv function in R. In my data, the first column "time" is basically a character vector

7 answers
  • 2020-11-27 11:08

    You can specify colClasses for just one column by name.

    So in your example you should use:

    data <- read.csv('test.csv', colClasses=c("time"="character"))
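
A minimal sketch of the effect, assuming a small made-up file in which "time" holds digit-only values such as 0100 (the file contents and the "value" column are illustrative, not from the question). Naming only "time" in colClasses leaves the other columns to be type-guessed as usual:

```r
# Write a small sample file; names and values are invented for illustration.
tmp <- tempfile(fileext = ".csv")
writeLines(c("time,value", "0100,1.5", "0230,2.5"), tmp)

# Only "time" is forced to character; "value" is still type-guessed.
data <- read.csv(tmp, colClasses = c("time" = "character"))
str(data)
```

Without the colClasses argument, the digit-only "time" values would be read as numbers and the leading zeros would be lost.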
    
  • 2020-11-27 11:11

    If we combine what @Hendy and @Oddysseus Ithaca contributed, we get a cleaner and more general (i.e., more adaptable) chunk of code.

    data <- read.csv("test.csv", header = FALSE, colClasses = c(V36 = "character", V38 = "character"))
    
  • 2020-11-27 11:14

    For a file with no header and a lot of columns, where the datetime fields are in, say, columns 36 and 38 and should be read in as character fields:

    data <- read.csv("test.csv", header = FALSE, colClasses = c("V36" = "character", "V38" = "character"))
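
As a rough illustration of why the V-prefixed names work: with header = FALSE, read.csv names the columns V1, V2, and so on, and a named colClasses vector is matched against those names. The three-column sample file below is made up for the sketch:

```r
# Headerless sample file; the values are invented for illustration.
tmp <- tempfile(fileext = ".csv")
writeLines(c("1,007,2.5", "2,012,3.5"), tmp)

# Columns are auto-named V1, V2, V3; only V2 is forced to character.
data <- read.csv(tmp, header = FALSE, colClasses = c(V2 = "character"))
names(data)
sapply(data, class)
```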
    
  • 2020-11-27 11:15

    I know OP asked about the utils::read.csv function, but let me provide an answer for those who come here searching for how to do it using readr::read_csv from the tidyverse.

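    library(readr)
    read_csv("test.csv", col_types = cols(.default = "c", time = "i"))
    

    This sets the default type for all columns to character, while time is parsed as integer. The header row must be read (the default, col_names = TRUE) so that the name time matches a column; with col_names = FALSE the columns are named X1, X2, and so on.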

  • 2020-11-27 11:18

    Assuming your 'time' column has at least one observation with a non-numeric character and all your other columns contain only numbers, 'read.csv' in R versions before 4.0.0 reads 'time' as a 'factor' by default and the rest of the columns as 'numeric'. Setting 'stringsAsFactors = FALSE' therefore has the same result as setting 'colClasses' manually (from R 4.0.0 on, 'FALSE' is already the default), i.e.,

    data <- read.csv('test.csv', stringsAsFactors = FALSE)
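
A small sketch of this behaviour, assuming a made-up file whose "time" values contain a colon and whose other column is purely numeric (both the file and the "value" column are illustrative):

```r
# Sample file; names and values are invented for illustration.
tmp <- tempfile(fileext = ".csv")
writeLines(c("time,value", "01:00,1.5", "02:00,2.5"), tmp)

# No colClasses: type guessing alone yields character for "time".
data <- read.csv(tmp, stringsAsFactors = FALSE)
sapply(data, class)
```

Note that this relies on type guessing: if "time" held only digits, it would be read as a number, and colClasses would still be needed to keep it as character.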
    
  • 2020-11-27 11:19

    If you want to refer to names from the header rather than column numbers, you can use something like this:

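    fname <- "test.csv"
    # Read only the first few rows so R can guess each column's class
    headset <- read.csv(fname, header = TRUE, nrows = 10)
    classes <- sapply(headset, class)
    # Override the guessed class for the column(s) named here
    classes[names(classes) %in% c("time")] <- "character"
    # Re-read the whole file with the explicit classes
    dataset <- read.csv(fname, header = TRUE, colClasses = classes)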
    