问题
I have two data frames. Below are samples but should be easily reproducible for illustration.
df1 <- data.frame(School = c("Omaha South", "Omaha Central", "Grand Island"),
Enrollment = c(2166, 2051, 1982))
df2 <- data.frame('Away Score' = c(25, 57, 76),
'Away Team' = c("Omaha South", "Omaha Central", "Grand Island"),
'Away Score' = c(52, 88, 69),
'Away Team' = c("Omaha Central", "Grand Island", "Omaha South"),
Date = c("1/11/2020", "1/12/2020", "1/13/2020"),
Winner = c("Omaha Central", "Grand Island", "Grand Island"),
Loser = c("Omaha South", "Omaha Central", "Omaha South"))
My goal is to create a new column in df1 called "Wins" that looks up the school in df1 and then counts the number of times that school is listed in the "Winner" column of df2.
So want df1 to look like this:
df1 <- data.frame(School = c("Omaha South", "Omaha Central", "Grand Island"),
Enrollment = c(2166, 2051, 1982),
Wins = c(0, 1, 2))
I've tried a number of solutions to no avail, including sqldf. My latest attempt was the below but it gives me an error saying no applicable method for 'group_by_' applied to an object of class "NULL"
df$Wins %>%
group_by(df2$Winner) %>%
mutate(Count=n())
回答1:
One way using dplyr
and joins :
library(dplyr)
df1 %>%
left_join(df2, by = c('School' = 'Winner')) %>%
na.omit() %>%
count(School, name = "wins") %>%
right_join(df1) %>%
mutate(wins = replace(wins, is.na(wins), 0))
Using base R we calculate the frequency of wins using table
, convert it into dataframe using stack
and then merge
to df1
.
merge(df1, stack(table(factor(df2$Winner, levels = df1$School))),
by.x = 'School', by.y = "ind")
来源:https://stackoverflow.com/questions/59925742/vlookup-and-count-string-occurrences-in-separate-table-r-to-new-column