How to compare the rows of two dataframes in R

Question

I'm trying to compare a column of one data frame against a column of another data frame to create a new column. If the value in a row of the first column is less than the first value of the second column, the new column gets a 1. If it is greater than or equal to that value but less than the next one, it gets a 2, and so on.

Here is an example. I have this data frame:

df1 <- data.frame(col=c(1,seq(1:9),9,10))
#    col
# 1    1
# 2    1
# 3    2
# 4    3
# 5    4
# 6    5
# 7    6
# 8    7
# 9    8
# 10   9
# 11   9
# 12  10

And this one, which has fewer rows:

df2<-data.frame(col2=c(3,6,8))
#    col2
# 1    3
# 2    6
# 3    8

My desired output would be something like this:

#      col3
# 1     1
# 2     1
# 3     1
# 4     2
# 5     2
# 6     2
# 7     3
# 8     3
# 9     4
# 10    4
# 11    4
# 12    4

I know this is a very basic question, but I can't see how to do this easily without using a for loop. I thought about using !unique() to select the first element and then checking whether it is in the second column with %in%, but I don't know how to implement it.

Answer 1:

If I understand you correctly, I think this would work:

apply(df1, 1, FUN = function(x) 1 + sum(x >= df2$col2))
# [1] 1 1 1 2 2 2 3 3 4 4 4 4

We use apply to iterate over the rows of df1, and then check the value in each row to see how it compares to col2 in df2.
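The same comparison can also be written without iterating over rows. This is a minimal sketch, assuming both columns are numeric as in the example; the names ge and col3 are introduced here only for illustration:

```r
df1 <- data.frame(col = c(1, 1:9, 9, 10))
df2 <- data.frame(col2 = c(3, 6, 8))

# outer() builds a 12 x 3 logical matrix: entry [i, j] is TRUE
# when df1$col[i] >= df2$col2[j].
ge <- outer(df1$col, df2$col2, ">=")

# Counting the TRUEs in each row gives the number of values in
# df2$col2 that the row has reached; adding 1 starts the labels at 1.
col3 <- 1 + rowSums(ge)
print(col3)
```

Unlike approaches based on position, this does not require either column to be sorted, but it allocates a matrix with nrow(df1) * nrow(df2) entries.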

A dplyr alternative:

library(dplyr)
df1 %>%
    rowwise() %>% # group over each row
    mutate(col3 = 1 + sum(col >= df2$col2))

     col  col3
   <dbl> <dbl>
 1     1     1
 2     1     1
 3     2     1
 4     3     2
 5     4     2
 6     5     2
 7     6     3
 8     7     3
 9     8     4
10     9     4
11     9     4
12    10     4
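rowwise() is needed above because sum() would otherwise collapse the whole column into a single number. If df2$col2 is sorted in increasing order (an assumption; it holds for the example data), base R's findInterval() does the same counting for all rows at once, so it could also be used inside mutate() without rowwise(). A sketch in base R:

```r
df1 <- data.frame(col = c(1, 1:9, 9, 10))
df2 <- data.frame(col2 = c(3, 6, 8))

# findInterval(x, vec) returns, for each x, how many elements of the
# sorted vector vec are <= x. Adding 1 turns that count into the label.
# Assumption: df2$col2 is sorted in non-decreasing order.
df1$col3 <- findInterval(df1$col, df2$col2) + 1L
print(df1)
```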

Answer 2:

Hope this helps:

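z <- rep(FALSE, nrow(df1))
# Mark the first row and each row where a new group starts. Using
# df2$col2 + 1 as the row index only works because, in this df1,
# the value v first appears in row v + 1.
z[c(1, df2$col2 + 1)] <- TRUE
df1$col3 <- cumsum(z)  # running count of group starts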

which gives

> df1
   col col3
1    1    1
2    1    1
3    2    1
4    3    2
5    4    2
6    5    2
7    6    3
8    7    3
9    8    4
10   9    4
11   9    4
12  10    4
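The index vector df2$col2 + 1 used above lands on the right rows only for this particular df1. A variant that finds the starting rows by value rather than by position can be sketched as follows, along the lines of the %in% idea from the question. It assumes that df1$col is sorted in increasing order and that every value in df2$col2 actually occurs in df1$col; the name starts is introduced here for illustration:

```r
df1 <- data.frame(col = c(1, 1:9, 9, 10))
df2 <- data.frame(col2 = c(3, 6, 8))

# A row starts a new group when its value is one of the thresholds
# and it is the first occurrence of that value in df1$col.
starts <- df1$col %in% df2$col2 & !duplicated(df1$col)

# cumsum() counts how many group starts have been seen so far;
# adding 1 makes the rows before the first threshold group 1.
df1$col3 <- 1 + cumsum(starts)
print(df1)
```

If a threshold is missing from df1$col, no row is marked for it and the labels from that point on come out one too low, so the comparison-based approaches are safer in that case.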

Source: https://stackoverflow.com/questions/58939036/how-to-compare-the-rows-of-two-dataframes-in-r

Tags

dataframe

dplyr