Create an ID (row number) column

我在风中等你 2020-11-29 02:06

I need to create a column with a unique ID, basically adding the row number as its own column. My current data frame looks like this:

   V1  V2
1  23  45
2  45  4         


        
8 answers
  • 2020-11-29 02:22

    You could also do this using dplyr:

    library(dplyr)
    DF <- mutate(DF, id = rownames(DF))  # id is stored as a character column
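
    One caveat with rownames(): it returns character values, and after subsetting, the row names keep their original labels, so they need not run from 1 to n. A minimal base R sketch, assuming a small example data frame named DF (the values in the third row are made up here for illustration):

    ```r
    # Example data; the third row is an assumption added for illustration
    DF <- data.frame(V1 = c(23, 45, 56), V2 = c(45, 4, 67))

    # rownames() returns a character vector, not integers
    id_chr <- rownames(DF)
    class(id_chr)

    # After dropping a row, the remaining row names keep their old labels
    sub <- DF[DF$V1 > 23, ]
    rownames(sub)

    # seq_len(nrow()) always yields integers 1..n for the current rows
    sub$id <- seq_len(nrow(sub))
    ```

    If the ID should be numeric and consecutive, seq_len(nrow(DF)) is probably the safer choice than rownames(DF).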
    
  • 2020-11-29 02:24

    Many have presented their ideas, but I think this is the shortest and simplest code for this task:

    data$ID <- 1:nrow(data)
    

    One line. The one and only.
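
    A note on one edge case: if the data frame can have zero rows, 1:nrow(data) counts down from 1 to 0 instead of producing an empty sequence. The sketch below, which assumes an empty data frame named empty with the question's column names, shows the difference from seq_len():

    ```r
    # A data frame with the question's columns but no rows
    empty <- data.frame(V1 = numeric(0), V2 = numeric(0))

    # 1:nrow() counts down from 1 to 0 instead of being empty
    bad_ids <- 1:nrow(empty)
    length(bad_ids)

    # seq_len() returns an empty integer vector, so the assignment is safe
    empty$ID <- seq_len(nrow(empty))
    ncol(empty)
    ```

    For data frames that always have at least one row, both forms give the same result.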

  • 2020-11-29 02:26

    I hope this helps. The shortest and best way to create an ID column is:

    dataframe$ID <- seq.int(nrow(dataframe))
    
  • 2020-11-29 02:28

    You could use cbind:

    d <- data.frame(V1=c(23, 45, 56), V2=c(45, 45, 67))
    
    ## build the id column here; 1:nrow(d) would also work instead of rownames(d)
    id <- rownames(d)
    d <- cbind(id=id, d)
    
    ## set colnames to OP's wishes
    colnames(d) <- paste0("V", 1:ncol(d))
    

    EDIT: Here is a comparison with @dickoa's suggestion. d$id <- seq_len(nrow(d)) is slightly faster, but the column order differs (id becomes the last column, and reordering the columns seems to be slower than using cbind):

    library("microbenchmark")
    
    set.seed(1)
    d <- data.frame(V1=rnorm(1e6), V2=rnorm(1e6))
    
    cbindSeqLen <- function(x) {
      return(cbind(id=seq_len(nrow(x)), x))
    }
    
    dickoa <- function(x) {
      x$id <- seq_len(nrow(x))
      return(x)
    }
    
    dickoaReorder <- function(x) {
      x$id <- seq_len(nrow(x))
      nc <- ncol(x)
      x <- x[, c(nc, 1:(nc-1))]
      return(x)
    }
    
    microbenchmark(cbindSeqLen(d), dickoa(d), dickoaReorder(d), times=100)
    
    # Unit: milliseconds
    #             expr      min       lq   median       uq      max neval
    #   cbindSeqLen(d) 23.00683 38.54196 40.24093 42.60020 47.73816   100
    #        dickoa(d) 10.70718 36.12495 37.58526 40.22163 72.92796   100
    # dickoaReorder(d) 19.25399 68.46162 72.45006 76.51468 88.99620   100
    
  • 2020-11-29 02:28

    Here is a solution that keeps the dplyr piping format and places id in the first column, which may be preferred.

    library(dplyr)

    d %>%
      mutate(id = rownames(.)) %>%
      select(id, everything())
    
  • 2020-11-29 02:32

    Two tidyverse alternatives (using sgibb's example data):

    tibble::rowid_to_column(d, "ID")
    

    which gives:

      ID V1 V2
    1  1 23 45
    2  2 45 45
    3  3 56 67
    

    Or:

    dplyr::mutate(d, ID = dplyr::row_number())
    

    which gives:

      V1 V2 ID
    1 23 45  1
    2 45 45  2
    3 56 67  3
    

    As you can see, rowid_to_column() adds the new column in front of the existing ones, while the mutate() and row_number() combination adds it after them.
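
    If the mutate() form is preferred but the ID should still come first, the .before argument of mutate() can position the new column. This is a minimal sketch, assuming dplyr 1.0.0 or later is installed; d_first is a name chosen here for illustration:

    ```r
    library(dplyr)  # assumes dplyr >= 1.0.0, where mutate() has .before/.after

    # sgibb's example data from the cbind answer
    d <- data.frame(V1 = c(23, 45, 56), V2 = c(45, 45, 67))

    # .before = 1 places the new column ahead of the first existing column
    d_first <- mutate(d, ID = row_number(), .before = 1)
    names(d_first)
    ```

    This avoids a separate select() or column reordering step.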


    And another base R alternative:

    d$ID <- seq_along(d[,1])
    