Subsetting a data frame by row index

-上瘾入骨i 2021-01-18 19:13

Why is my last step converting the data frame to a vector? I want to keep the first 6000 observations in the data frame key.

  set.seed(1)
  key         


        
2 Answers
  • 2021-01-18 19:45

    It's being coerced to a vector because the result has only one column, and dropping to a vector is the default in that case. R is trying to be "helpful".

    This will keep it as a data frame:

    set.seed(1)
    key <- data.frame(matrix(NA, nrow = 10000, ncol = 1))
    names(key) <- "ID"
    # 10000 random 8-character IDs drawn from 0-9, A-Z, a-z
    key$ID <- replicate(10000,
                        rawToChar(as.raw(sample(c(48:57, 65:90, 97:122), 8, replace = TRUE))))
    key <- unique(key)
    key <- as.data.frame(key[1:6000, ])  # a data frame again, but the column is no longer named "ID"
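
    One side effect of wrapping the subset in as.data.frame is worth checking: the column name is not carried over, because the data frame is rebuilt from a bare vector. A minimal sketch, assuming a small made-up data frame df (not the asker's data):

```r
# Hypothetical one-column data frame, for illustration only
df <- data.frame(ID = c("a1", "b2", "c3", "d4"), stringsAsFactors = FALSE)

# df[1:2, ] is a plain character vector here, so as.data.frame() has no
# column name to reuse and derives one from the expression instead
wrapped <- as.data.frame(df[1:2, ])
is.data.frame(wrapped)
names(wrapped)

# Naming the column explicitly keeps "ID"
rebuilt <- data.frame(ID = df[1:2, ], stringsAsFactors = FALSE)
names(rebuilt)
```

    If later code refers to the ID column by name, the second form (or drop = FALSE) is probably the safer choice.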
    
  • 2021-01-18 20:05
     key1 <- key[1:6000, , drop = FALSE]  # prevents the data frame from being converted to a vector
    

    According to the documentation at ?Extract.data.frame:

    drop: logical. If ‘TRUE’ the result is coerced to the lowest possible dimension. The default is to drop if only one column is left, but not to drop if only one row is left.
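
    The two halves of that rule can be seen on a toy example. This is only a sketch; the data frames df and wide and their values are made up for illustration:

```r
# Made-up data, for illustration only
df <- data.frame(ID = c("a1", "b2", "c3"), stringsAsFactors = FALSE)

one_col <- df[1:2, ]                # one column left: dropped to a character vector
kept    <- df[1:2, , drop = FALSE]  # drop = FALSE: stays a 2 x 1 data frame

wide    <- data.frame(ID = c("a1", "b2"), n = 1:2, stringsAsFactors = FALSE)
one_row <- wide[1, ]                # one row left: not dropped, still a data frame

class(one_col)
dim(kept)
is.data.frame(one_row)
```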

    Or, you could use subset, but this is usually a bit slower. Here the row names are the numbers from 1 to 10000:

     key2 <- subset(key, as.numeric(rownames(key)) <= 6000)  # <= so that row 6000 is included
    
     is.data.frame(key2)
     #[1] TRUE
    

    because subset does not drop by default:

     ## S3 method for class 'data.frame'
     subset(x, subset, select, drop = FALSE, ...)  # drop = FALSE is the default
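
    One caveat with filtering on row names: unique() keeps the original row names, so if it removes any duplicates the names no longer match row positions. The sketch below uses made-up data containing a duplicate to show the difference. Filtering on seq_len(nrow(u)) is an alternative suggested here, not something from the question:

```r
# Made-up data with one duplicate, so that unique() actually removes a row
df <- data.frame(ID = c("a", "b", "a", "c", "d"), stringsAsFactors = FALSE)
u  <- unique(df)   # drops the third row; remaining row names are "1", "2", "4", "5"

by_name <- subset(u, as.numeric(rownames(u)) <= 3)  # matches row names 1 and 2 only
by_pos  <- subset(u, seq_len(nrow(u)) <= 3)         # first three rows by position

nrow(by_name)          # 2
nrow(by_pos)           # 3
is.data.frame(by_pos)  # TRUE, subset() does not drop
```

    In the question's data the random IDs are very unlikely to collide, so both filters should select the same rows there.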
    