Count occurrences of value in a set of variables in R (per row)

-上瘾入骨i 2021-01-06 01:16

Let's say I have a data frame with 10 numeric variables V1-V10 (columns) and multiple rows (cases).

What I would like R to do is: For each case, give me the number

5 Answers
  • 2021-01-06 01:27

    My attempt at something similar to COUNT from SPSS in R is as follows:

    library(dplyr)
    
    # Dummy data with NAs
    df <- data.frame(a = c(1, 1, NA, 2, 3, 9), b = c(1, 2, 3, 2, NA, 1))
    
    df <- df %>%
      dplyr::mutate(count = rowSums(                  # sum across each row
        dplyr::select(., dplyr::one_of(c('a', 'b'))), # the columns to include
        na.rm = TRUE))                                # ignore NAs in the sum
    

    This is what the output looks like (note that count here is the row-wise sum of a and b, ignoring NAs):

    > df
       a  b count
    1  1  1     2
    2  1  2     3
    3 NA  3     3
    4  2  2     4
    5  3 NA     3
    6  9  1    10
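
    Since `rowSums()` over the selected columns adds up the values themselves, a closer match to counting occurrences might compare against the target value first and then sum the resulting logicals. A minimal base-R sketch, assuming the value of interest is 1 (the column name `count.1` is only illustrative):

    ```r
    df <- data.frame(a = c(1, 1, NA, 2, 3, 9), b = c(1, 2, 3, 2, NA, 1))

    # TRUE where a cell equals 1; comparisons with NA yield NA,
    # which na.rm = TRUE drops from the row total
    df$count.1 <- rowSums(df[c("a", "b")] == 1, na.rm = TRUE)
    ```

    Without `na.rm = TRUE`, any row containing an NA would get an NA count.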

    Hope it helps :-)

  • 2021-01-06 01:34

    Here is another straightforward solution that comes closest to what the COUNT command in SPSS does — creating a new variable that, for each case (i.e., row) counts the occurrences of a given value or list of values across a list of variables.

    # Let df be a data frame with four variables (V1-V4)
    df <- data.frame(V1 = c(1, 1, 2, 1, NA), V2 = c(1, NA, 2, 2, NA),
                     V3 = c(1, 2, 2, 1, NA), V4 = c(NA, NA, 1, 2, NA))
    
    # Compute a new variable counting occurrences of the value 1 in V1-V4
    df$count.1 <- apply(df, 1, function(x) length(which(x == 1)))
    

    The updated data frame contains the new variable count.1 exactly as the SPSS COUNT command would do.

    > df
      V1 V2 V3 V4 count.1
    1  1  1  1 NA       3
    2  1 NA  2 NA       1
    3  2  2  2  1       1
    4  1  2  1  2       2
    5 NA NA NA NA       0
    

    You can do the same to count how many times the value "2" occurs per row in V1-V4. Note that you now need to select the columns (variables) in df to which the function is applied, because df also contains the new count.1 column.

    df$count.2 <- apply(df[1:4], 1, function(x) length(which(x==2)))
    

    You can also apply a similar logic to count the number of missing values in V1-V4.

    df$count.na <- apply(df[1:4], 1, function(x) sum(is.na(x)))
    

    The final result should be exactly what you wanted:

    > df
      V1 V2 V3 V4 count.1 count.2 count.na
    1  1  1  1 NA       3       0        1
    2  1 NA  2 NA       1       1        2
    3  2  2  2  1       1       3        0
    4  1  2  1  2       2       2        0
    5 NA NA NA NA       0       0        4
    

    This solution can easily be generalized to a range of values. Suppose we want to count how many times a value of 1 or 2 occurs in V1-V4 per row:

    df$count.1or2 <- apply(df[1:4], 1, function(x) sum(x %in% c(1,2)))
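
    If the same kind of count is needed for several value lists, the logic could be wrapped in a small helper. The sketch below is only one possible way to do it: `count_values` and its arguments are hypothetical names, not part of any package, and it works column by column instead of calling `apply()`.

    ```r
    # Hypothetical helper: for each row, count how many of the columns
    # in `cols` hold one of `values`. NA cells never match unless NA
    # is itself listed in `values`.
    count_values <- function(data, cols, values) {
      hits <- lapply(data[cols], function(col) col %in% values)
      as.integer(Reduce(`+`, hits))  # element-wise sum of the logical vectors
    }

    df <- data.frame(V1 = c(1, 1, 2, 1, NA), V2 = c(1, NA, 2, 2, NA),
                     V3 = c(1, 2, 2, 1, NA), V4 = c(NA, NA, 1, 2, NA))
    df$count.1or2 <- count_values(df, c("V1", "V2", "V3", "V4"), c(1, 2))
    ```

    Naming the columns explicitly means the call keeps working after further count columns have been added to df.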
    
  • 2021-01-06 01:43

    Try

    apply(df,MARGIN=1,table)
    

    Where df is your data.frame. This returns a list with one element per row of the data frame, in the same order. Each element is a table whose names are the distinct values in that row and whose entries are the number of times each value occurs. (If every row happens to contain the same number of distinct values, apply() simplifies the result to a matrix instead of a list.)

    For instance:

    df=data.frame(V1=c(10,20,10,20),V2=c(20,30,20,30),V3=c(20,10,20,10))
    #create a data.frame containing some data
    df #show the data.frame
      V1 V2 V3
    1 10 20 20
    2 20 30 10
    3 10 20 20
    4 20 30 10
    apply(df,MARGIN=1,table) #apply the function table on each row (MARGIN=1)
    [[1]]
    
    10 20 
     1  2 
    
    [[2]]
    
    10 20 30 
     1  1  1 
    
    [[3]]
    
    10 20 
     1  2 
    
    [[4]]
    
    10 20 30 
     1  1  1 
    
    #desired result
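
    If a rectangular result is easier to work with than a list, one option is to tabulate every row against the same set of levels, so that each table has the same length. This is a sketch under that assumption; `vals` and `counts` are illustrative names.

    ```r
    df <- data.frame(V1 = c(10, 20, 10, 20), V2 = c(20, 30, 20, 30),
                     V3 = c(20, 10, 20, 10))

    vals <- sort(unique(unlist(df)))  # every distinct value in df
    # factor() with fixed levels makes table() report zero counts too,
    # so every row yields a table of length(vals) entries
    counts <- t(apply(df, 1, function(x) table(factor(x, levels = vals))))
    colnames(counts) <- vals
    ```

    Here `counts` has one row per row of df and one column per distinct value.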
    
  • 2021-01-06 01:47

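    The same approach works if you need to count a particular word or letter in each row.

    # Let df be a data frame with four variables (V1-V4);
    # "L" must be quoted, which makes every column a character vector
    df <- data.frame(V1 = c(1, 1, 2, 1, "L"), V2 = c(1, "L", 2, 2, "L"),
                     V3 = c(1, 2, 2, 1, "L"), V4 = c("L", "L", 1, 2, "L"))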
    

    To count the number of "L" values in each row, use:

    # Compute a new variable counting occurrences of "L" in V1-V4
    df$count.L <- apply(df, 1, function(x) length(which(x == "L")))
    

    The result looks like this:

    > df
      V1 V2 V3 V4 count.L
    1  1  1  1  L       1
    2  1  L  2  L       2
    3  2  2  2  1       0
    4  1  2  1  2       0
    5  L  L  L  L       4
    
  • 2021-01-06 01:48

    I think that there ought to be a simpler way to do this, but the best way that I can think of to get a table of counts is to loop (implicitly using sapply) over the unique values in the dataframe.

    #Some example data
    df <- data.frame(a=c(1,1,2,2,3,9),b=c(1,2,3,2,3,1))
    df
    #  a b
    #1 1 1
    #2 1 2
    #3 2 3
    #4 2 2
    #5 3 3
    #6 9 1
    
    levels=unique(do.call(c,df)) #all unique values in df
    out <- sapply(levels,function(x)rowSums(df==x)) #count occurrences of x in each row
    colnames(out) <- levels
    out
    #     1 2 3 9
    #[1,] 2 0 0 0
    #[2,] 1 1 0 0
    #[3,] 0 1 1 0
    #[4,] 0 2 0 0
    #[5,] 0 0 2 0
    #[6,] 1 0 0 1
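
    The example data above has no missing values. If the data frame does contain NAs, `df == x` produces NA for those cells and `rowSums()` then returns NA for the affected rows. A hedged variant of the same idea, assuming NAs should simply be ignored (`vals` is an illustrative name):

    ```r
    df <- data.frame(a = c(1, 1, NA, 2, 3, 9), b = c(1, 2, 3, 2, NA, 1))

    vals <- sort(unique(unlist(df)))  # sort() drops the NA
    # na.rm = TRUE skips the NA comparisons instead of propagating them
    out <- sapply(vals, function(v) rowSums(df == v, na.rm = TRUE))
    colnames(out) <- vals
    ```

    Each column of `out` then holds the per-row count of one value.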
    