Question
I am trying to create a new column that only shows the winning team.
Here is some sample data:
results <- data.frame(
  home_team = c("Scotland", "England", "Scotland", "England", "Scotland", "Scotland",
                "England", "Wales", "Scotland", "Scotland", "England"),
  away_team = c("England", "Scotland", "England", "Scotland", "England", "Wales",
                "Scotland", "Scotland", "England", "Wales", "Wales"),
  home_score = c(0, 4, 2, 2, 3, 4, 1, 0, 7, 9, 2),
  away_score = c(0, 2, 1, 2, 0, 0, 3, 2, 2, 0, 1),
  stringsAsFactors = FALSE
)
This is my code so far:
results <- intl.football.results
first6home <- head(results$home_team)
first6away <- head(results$away_team)
homescore <- (results$home_score)
awayscore <- (results$away_score)
data.frame('winning_team' = 0, results)
for (i in 1:length(results)){
  if(homescore[i] > awayscore[i]){
    homewins <- print("home wins")
  }else if(homescore[i] == awayscore[i]){
    draw <- print("draw")
  }else{
    awaywins <- print("away team wins")
  }
}
I think I need to somehow map "homewins" to the home_team. The best way I can think of is to find the row numbers where the home team wins and then select home_team from those rows. But how do I do this if the data.frame has 30,000+ rows? Sorry if this sounds basic, but I'm trying!
Thank you everyone for the responses, I will definitely practice them. One last thing: what if I wanted the new column to show the winning country instead of "home", "away", or "draw"?
Answer 1:
The case_when function in dplyr might be a good way to solve this. It is pretty close to what you're trying to do above, so hopefully it's quite intuitive.
Documentation and more examples: https://dplyr.tidyverse.org/reference/case_when.html
I'm passing the name of the winning team from the corresponding row as the value to return in each case_when branch, but you can pass in a character string instead, e.g. 'Home Win', as I've done for the drawn games ('Drawn Game'), if that's the outcome you want.
library(tidyverse)
d <- tibble(
  home_team = c('Scotland', 'England', 'Scotland', 'England',
                'Scotland', 'Scotland', 'England', 'Wales'),
  away_team = c('England', 'Scotland', 'England', 'Scotland',
                'England', 'Wales', 'Scotland', 'Scotland'),
  home_score = c(0, 4, 2, 2, 3, 4, 1, 0),
  away_score = c(0, 2, 1, 2, 0, 0, 3, 2))
d %>%
  mutate(winner = case_when(
    home_score > away_score ~ home_team,
    away_score > home_score ~ away_team,
    away_score == home_score ~ 'Drawn Game'))
Answer 2:
One solution could be to use the data.table package to handle your data. Using this package, the solution to your problem would be the following (assuming a tie results in "T", an away win in "A", and a home win in "H"):
library(data.table)
setDT(results)
results[, w_team := "T"][
  home_score > away_score, w_team := "H"][
  home_score < away_score, w_team := "A"]
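To address the follow-up about showing the winning country rather than a code, the same chained update can assign the team column instead of a letter. This is only a sketch: the four-row sample and the column name winner are assumptions made for illustration, not part of the original answer.

```r
library(data.table)

# Hypothetical four-row sample in the same shape as the question's data
results <- data.table(
  home_team  = c("Scotland", "England", "England", "Wales"),
  away_team  = c("England", "Scotland", "Scotland", "Scotland"),
  home_score = c(0, 4, 1, 0),
  away_score = c(0, 2, 3, 2)
)

# Mark every row as a draw first, then overwrite the rows that have a winner.
# Inside each filtered update, home_team / away_team refer to the matching rows only.
results[, winner := "Draw"][
  home_score > away_score, winner := home_team][
  home_score < away_score, winner := away_team]

print(results$winner)
```

Because each update is vectorised over the filtered rows, this should behave the same way on a table with 30,000+ rows.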
Answer 3:
Base R:
results$who_wins <- with(results,
  ifelse(home_score > away_score, "home wins",
    ifelse(home_score < away_score, "away wins", "draw")))
results
# home_team away_team home_score away_score who_wins
# 1 Scotland England 0 0 draw
# 2 England Scotland 4 2 home wins
# 3 Scotland England 2 1 home wins
# 4 England Scotland 2 2 draw
# 5 Scotland England 3 0 home wins
# 6 Scotland Wales 4 0 home wins
# 7 England Scotland 1 3 away wins
# 8 Wales Scotland 0 2 away wins
# 9 Scotland England 7 2 home wins
# 10 Scotland Wales 9 0 home wins
# 11 England Wales 2 1 home wins
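If the column should hold the winning country instead of "home wins" or "away wins", the nested ifelse can return the team columns directly. A minimal sketch, assuming the team columns are character vectors (stringsAsFactors = FALSE, as in the question); the shortened sample and the column name winning_team are illustrative assumptions.

```r
# Hypothetical four-row sample in the same shape as the question's data
results <- data.frame(
  home_team  = c("Scotland", "England", "England", "Wales"),
  away_team  = c("England", "Scotland", "Scotland", "Scotland"),
  home_score = c(0, 4, 1, 0),
  away_score = c(0, 2, 3, 2),
  stringsAsFactors = FALSE
)

# ifelse is vectorised, so no loop over rows is needed:
# home win -> home_team, away win -> away_team, otherwise "draw"
results$winning_team <- with(results,
  ifelse(home_score > away_score, home_team,
    ifelse(home_score < away_score, away_team, "draw")))

print(results$winning_team)
```

If the team columns were factors, they would need as.character() first, since ifelse would otherwise return the underlying integer codes.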
Answer 4:
Just for fun, you could also do this by calculating the sign of the score difference and then matching it against a lookup vector:
lookup <- c('home' = 1, 'away' = -1, 'draw' = 0)
results$winner <-
  with(results, names(lookup)[match(sign(home_score - away_score), lookup)])
results
# home_team away_team home_score away_score winner
# 1 Scotland England 0 0 draw
# 2 England Scotland 4 2 home
# 3 Scotland England 2 1 home
# 4 England Scotland 2 2 draw
# 5 Scotland England 3 0 home
# 6 Scotland Wales 4 0 home
# 7 England Scotland 1 3 away
# 8 Wales Scotland 0 2 away
# 9 Scotland England 7 2 home
# 10 Scotland Wales 9 0 home
# 11 England Wales 2 1 home
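To see why the lookup works, it may help to run the intermediate steps separately. The three scores below are made-up values chosen to cover each outcome, and the names s and pos are only for illustration.

```r
lookup <- c('home' = 1, 'away' = -1, 'draw' = 0)

# Hypothetical scores: one home win, one away win, one draw
home_score <- c(4, 1, 2)
away_score <- c(2, 3, 2)

# sign() collapses the goal difference to 1, -1 or 0
s <- sign(home_score - away_score)

# match() returns the position of each sign value within lookup
pos <- match(s, lookup)

# Those positions index the names of lookup, giving the labels
labels <- names(lookup)[pos]
print(labels)
```

Here s is 1, -1, 0, so pos is 1, 2, 3 and the labels come out as "home", "away", "draw".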
Source: https://stackoverflow.com/questions/58013908/create-a-new-column-that-only-shows-the-winning-team