Running count based on field in R

长情又很酷 2021-02-15 13:33

I have a data set in this format:

User       
1 
2
3
2
3
1  
1      

Now I want to add a column named count that gives the running count of occurrences of each User value.

3 Answers
  • 2021-02-15 14:04

    You can use getanID from my "splitstackshape" package:

    library(splitstackshape)
    getanID(mydf, "User")
    ##    User .id
    ## 1:    1   1
    ## 2:    2   1
    ## 3:    3   1
    ## 4:    2   2
    ## 5:    3   2
    ## 6:    1   2
    ## 7:    1   3
    

    This is essentially an approach with "data.table" that looks something like the following:

    as.data.table(mydf)[, count := seq(.N), by = "User"][]
    
  • 2021-02-15 14:08

    An option using dplyr

    library(dplyr)
    df1 %>%
      group_by(User) %>%
      mutate(Count = row_number())
    #   User Count
    #1    1     1
    #2    2     1
    #3    3     1
    #4    2     2
    #5    3     2
    #6    1     2
    #7    1     3
    

    Using sqldf

    library(sqldf)
    sqldf('select a.*, 
               count(*) as Count
               from df1 a, df1 b
               where a.User = b.User and b.rowid <= a.rowid
               group by a.rowid')
    #   User Count
    #1    1     1
    #2    2     1
    #3    3     1
    #4    2     2
    #5    3     2
    #6    1     2
    #7    1     3
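
For each row, the self-join above counts how many rows at or before it share the same User. The same logic can be sketched in base R, assuming the seven-row example from the question. The helper name `running_count` is hypothetical and only for illustration; `df1` follows the name used in this answer.

```r
# Example data from the question (df1 is the name assumed in this answer).
df1 <- data.frame(User = c(1, 2, 3, 2, 3, 1, 1))

# For row i, count the rows 1..i whose User equals the User of row i.
# This mirrors "a.User = b.User and b.rowid <= a.rowid" in the query.
running_count <- function(x) {
  vapply(seq_along(x), function(i) sum(x[seq_len(i)] == x[i]), integer(1))
}

df1$Count <- running_count(df1$User)
print(df1)
```

Like the self-join, this compares every row with all earlier rows, so it is mainly useful for seeing what the query computes. The grouped approaches are likely a better fit for large data.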
    
  • 2021-02-15 14:17

    This is fairly easy with ave and seq.int:

    > ave(User,User, FUN= seq.int)
    [1] 1 1 1 2 2 2 3
    

    This is a common strategy. It is often used when the items of a group are adjacent to each other, but ave does not require adjacent rows: the values are computed within each grouping wherever its rows occur. The second argument is the grouping variable. The first argument is really a dummy argument here, since the only thing it contributes is a length.
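
As a minimal sketch of the point about the dummy first argument, the call can be applied to a data frame column directly. The assumptions here are the name `mydf`, the new column name `count`, and an extra user `7` added to the question's data so that one group has a single row. This sketch uses `seq_along` rather than `seq.int`, because `seq.int` applied to a one-element group would expand that single value into a sequence instead of returning 1.

```r
# Example data from the question, plus a user that appears only once.
mydf <- data.frame(User = c(1, 2, 3, 2, 3, 1, 1, 7))

# The first argument only supplies the length; the second defines the groups.
# seq_along() numbers the elements of each group 1, 2, 3, ...
# The rows of a group do not need to be adjacent.
mydf$count <- ave(seq_along(mydf$User), mydf$User, FUN = seq_along)
print(mydf)
```

Passing `seq_along(mydf$User)` as the first argument keeps the result an integer vector even if `User` is stored as character.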
