Levels in R Dataframe

囚心锁ツ 2021-01-18 06:00

I imported data from a .csv file, and attached the dataset.
My problem: one variable is in integer form and has 295 levels. I need to use this variable to create others

4 answers
  • 2021-01-18 06:19

    When you read in the data with read.table (or read.csv? - you didn't specify), add the argument stringsAsFactors = FALSE. Then you will get character data instead.

    If you are expecting integers for the column then you must have data that is not interpretable as integers, so convert to numeric after you've read it.

    txt <- c("x,y,z", "1,2,3", "a,b,c")
    
    d <- read.csv(textConnection(txt))
    sapply(d, class)
    #       x        y        z 
    #"factor" "factor" "factor" 
    ## (output from R before 4.0.0; since 4.0.0 stringsAsFactors defaults to FALSE)
    
    ## we don't want factors, but characters
    d <- read.csv(textConnection(txt), stringsAsFactors = FALSE)
    sapply(d, class)
    
    #          x           y           z 
    #"character" "character" "character" 
    
    ## convert x to numeric, accepting NAs for non-numeric data
    as.numeric(d$x)
    
    #[1]  1 NA
    #Warning message:
    #NAs introduced by coercion 
    

    Finally, if you want to ignore these input details and extract the integer levels from the factor use e.g. as.numeric(levels(d$x))[d$x], as per "Warning" in ?factor.
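
    To see why the lookup through levels() matters, here is a minimal sketch using a made-up factor f (not the question's data). Calling as.numeric directly on a factor returns its internal integer codes, not the numbers shown as labels.

```r
# Hypothetical factor whose labels look like numbers
f <- factor(c("10", "20", "10", "35"))

# Wrong: returns the internal level codes (1, 2, 1, 3)
codes <- as.numeric(f)

# Right: convert the levels once, then index them by the factor
values <- as.numeric(levels(f))[f]

# Equivalent alternative: go through the character labels
values2 <- as.numeric(as.character(f))

print(codes)
print(values)
```

    Both correct forms give 10, 20, 10, 35; the levels() version converts each distinct label only once.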

  • 2021-01-18 06:21

    Or you can simply use:

    d$x2 = as.numeric(as.character(d$x))

  • 2021-01-18 06:22

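    Do summary(data) to check things got read in properly. If columns that should be numeric aren't, look at the colClasses argument to read.csv to force the type. Note that when a column is forced to numeric, read.csv stops with an error on poorly formed numbers rather than turning them into NA values, which at least tells you the column needs cleaning.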

    help(read.csv) will help.
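
    As a rough sketch of how colClasses can be combined with na.strings, assuming a made-up inline file in which "n/a" marks a missing value (the column names id and score are hypothetical):

```r
# Hypothetical inline data; "n/a" stands in for a missing-value marker
csv_text <- "id,score\n1,2.5\n2,n/a\n3,4.0"

# Force the column types and declare the missing-value marker up front
d <- read.csv(text = csv_text,
              colClasses = c("integer", "numeric"),
              na.strings = "n/a")
print(sapply(d, class))
print(summary(d))

# Without na.strings, the forced numeric column makes read.csv stop
result <- tryCatch(
  read.csv(text = csv_text, colClasses = c("integer", "numeric")),
  error = function(e) "read failed"
)
print(result)
```

    If the bad values are not known in advance, reading the column as character and converting with as.numeric afterwards is the more forgiving route.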

  • 2021-01-18 06:30

    Working from your clarification, I suggest you redo your read statement with read.table and header=TRUE, stringsAsFactors=FALSE and sep=",". The as.is argument already defaults to !stringsAsFactors, so it does not need to be passed:

    datinp <- read.table("Rdata.csv", header=TRUE, stringsAsFactors=FALSE,
                           sep=",")
    datinp$a <- as.numeric(datinp$a)
    datinp$b <- as.numeric(datinp$b)
    datinp$ctr <- with(datinp, as.integer(a/b) ) # no loop needed when using vector arithmetic
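
    The same steps can be tried without the file. This is only a sketch: the data frame below is a made-up stand-in for Rdata.csv, with one deliberately bad entry.

```r
# Hypothetical stand-in for the file read above
datinp <- data.frame(a = c("10", "7", "x"), b = c("2", "2", "3"),
                     stringsAsFactors = FALSE)

# Non-numeric entries become NA (the coercion warning is silenced here)
datinp$a <- suppressWarnings(as.numeric(datinp$a))
datinp$b <- as.numeric(datinp$b)

# One vectorised expression computes every row at once
datinp$ctr <- with(datinp, as.integer(a / b))
print(datinp)
```

    as.integer truncates toward zero, so 7/2 becomes 3, and the row with the bad entry yields NA.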
    