for loop & if function in R

刺人心 2021-01-27 19:22

I am writing a loop with an if statement in R. The table looks like this:

ID  category
1   a
1   b
1   c
2   a
2   b
3   a
3   b
4   a
5   a

I want

3 answers
  • 2021-01-27 19:36

    For this data, the column you want coincides with the factor level code of category. Try this:

    df$count=as.numeric(df$category)
    

    This gives the following output:

      ID category count
    1  1        a     1
    2  1        b     2
    3  1        c     3
    4  2        a     1
    5  2        b     2
    6  3        a     1
    7  3        b     2
    8  4        a     1
    9  5        a     1
    

    This assumes category is already a factor. If it is not, convert it to a factor first:

    df$category=as.factor(df$category)
    df$count=as.numeric(df$category)
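
    One caveat may be worth checking before relying on this: the level code depends only on the category value, not on the row's position within its ID. The sketch below uses a small hypothetical data frame (not the table from the question) in which ID 2 starts at "b", so the level codes and the per-ID counts no longer agree. The column names code and count are illustrative assumptions.

    ```r
    # Hypothetical data: ID 2 starts at "b" instead of "a"
    df <- data.frame(ID = c(1, 1, 2, 2),
                     category = c("a", "b", "b", "c"))

    # Factor level codes: depend only on the category value
    df$code <- as.numeric(as.factor(df$category))

    # Position of each row within its ID group
    df$count <- ave(seq_along(df$ID), df$ID, FUN = seq_along)

    print(df)
    ```

    If the two columns differ on your real data, a grouped count is probably what you need rather than the level code.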
    
  • 2021-01-27 19:41

    A looping solution will be painfully slow on larger data. Here is a one-line solution using data.table:

    library(data.table)
    a <- data.table(ID = c(1,1,1,2,2,3,3,4,5),
                    category = c('a','b','c','a','b','a','b','a','a'))
    # Number the rows within each ID; := adds the column by reference
    a[, category_count := seq_len(.N), by = ID]
    
  • 2021-01-27 19:57

    There are packages and vectorized ways to do this task, but if you are practicing with loops, try:

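    output1$rn <- 1  # every row starts at 1, so row 1 is already correct
    # Rows 2..n; the sequence is empty when there is only one row
    for (i in seq_len(nrow(output1))[-1]) {
      if (output1[i, "ID"] == output1[i - 1, "ID"]) {
        # Same ID as the previous row: continue the count
        output1[i, "rn"] <- output1[i - 1, "rn"] + 1
      } else {
        # New ID: restart the count
        output1[i, "rn"] <- 1
      }
    }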
    

    With your original code, the call output1[i-1,"rn"]+1 in the third line of your loop referenced a value that didn't exist yet on the first pass. By first creating the rn column and filling it with the value 1, you give the loop something explicit to refer to. Note that the loop assumes rows with the same ID are adjacent, as they are in your table.

    output1
    #   ID category rn
    # 1  1        a  1
    # 2  1        b  2
    # 3  1        c  3
    # 4  2        a  1
    # 5  2        b  2
    # 6  3        a  1
    # 7  3        b  2
    # 8  4        a  1
    # 9  5        a  1
    

    With the package dplyr you can accomplish it quickly with:

    library(dplyr)
    output1 %>% group_by(ID) %>% mutate(rn = 1:n())
    

    Or with data.table:

    library(data.table)
    setDT(output1)[,rn := 1:.N, by=ID]
    

    With base R you can also use:

    output1$rn <- with(output1, ave(seq_along(ID), ID, FUN = seq_along))
    

    Both packages mentioned have vignettes and tutorials. For the last approach, run ?ave in the R console.
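
    As a self-contained check that needs no extra packages, here is a minimal sketch that rebuilds the table from the question and computes the counter in two base R ways. The variable names rn_ave and rn_rle are illustrative, and the rle() variant assumes rows with the same ID are adjacent.

    ```r
    # Rebuild the question's table (the name output1 follows this answer)
    output1 <- data.frame(ID = c(1, 1, 1, 2, 2, 3, 3, 4, 5),
                          category = c("a", "b", "c", "a", "b", "a", "b", "a", "a"))

    # Position of each row within its ID group
    rn_ave <- ave(seq_along(output1$ID), output1$ID, FUN = seq_along)

    # Alternative: turn run lengths of ID into 1..k sequences
    # (only valid when rows with the same ID are adjacent)
    rn_rle <- sequence(rle(output1$ID)$lengths)

    # Both approaches should agree on this table
    stopifnot(all(rn_ave == rn_rle))

    print(cbind(output1, rn = rn_ave))
    ```

    The ave() version does not depend on row order, so it is likely the safer choice when the data may be unsorted.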
