Question
I have data where I need to create a variable based on prior history. For example:
created <- c(2009, 2010, 2010, 2011, 2012, 2011)
person <- c('A', 'A', 'A', 'A', 'B', 'B')
location <- c('London', 'Geneva', 'London', 'New York', 'London', 'London')
df <- data.frame(created, person, location)
I want to create a variable called 'existing' that takes the prior years into account and checks whether the person has already lived in that place. It should be 0 if the place is old (they have lived there before) and 1 otherwise. Any suggestions?
library(dplyr)
df %>% group_by(person) %>% mutate(existing = 0)
# desired values, in the original row order
existing <- c(1, 1, 0, 1, 0, 1)
Answer 1:
Another dplyr option could be:
df %>%
  group_by(person, location) %>%
  mutate(existing = +(1:n() == 1))
  created person location existing
    <dbl> <fct>  <fct>       <int>
1    2009 A      London          1
2    2010 A      Geneva          1
3    2010 A      London          0
4    2011 A      New York        1
5    2012 B      London          1
6    2011 B      London          0
If sorting is required:
df %>%
  group_by(person, location) %>%
  arrange(created, .by_group = TRUE) %>%
  mutate(existing = +(1:n() == 1))
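To see why the sorting step matters, the following base R sketch compares the two readings of "first occurrence" on the question's data. It is only an illustration and not part of the answer above: it assumes person holds quoted strings, and the names by_row, by_year and ord are made up for this example.

```r
# Data from the question (person values quoted so the code runs)
df <- data.frame(
  created  = c(2009, 2010, 2010, 2011, 2012, 2011),
  person   = c("A", "A", "A", "A", "B", "B"),
  location = c("London", "Geneva", "London", "New York", "London", "London"),
  stringsAsFactors = FALSE
)

# Reading 1: first occurrence of each (person, location) pair by row order
by_row <- as.integer(!duplicated(df[c("person", "location")]))

# Reading 2: first occurrence by year. Sort by person and year, flag the
# first occurrence, then write the flags back to the original row positions.
ord <- order(df$person, df$created)
by_year <- integer(nrow(df))
by_year[ord] <- as.integer(!duplicated(df[ord, c("person", "location")]))

by_row   # 1 1 0 1 1 0
by_year  # 1 1 0 1 0 1
```

The two vectors differ only for person B, whose 2012 row comes before the 2011 row in the data. by_year matches the values requested in the question while leaving the rows in their original order.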
Answer 2:
You can try:
with(df, ave(location, person, FUN = function(i)as.integer(!duplicated(i))))
#[1] "1" "1" "0" "1" "1" "0"
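The values come back as strings because ave() writes the results into a copy of its first argument, here the character vector location, so the integers are coerced to character. A minimal sketch of one way to get integers instead, assuming location is stored as character rather than factor (hence the explicit stringsAsFactors = FALSE); the names flags and existing_int are chosen for this example only:

```r
# Data from the question (person values quoted so the code runs)
df <- data.frame(
  created  = c(2009, 2010, 2010, 2011, 2012, 2011),
  person   = c("A", "A", "A", "A", "B", "B"),
  location = c("London", "Geneva", "London", "New York", "London", "London"),
  stringsAsFactors = FALSE
)

# Same call as in the answer: the result is character, like location
flags <- with(df, ave(location, person,
                      FUN = function(i) as.integer(!duplicated(i))))

# Convert the "1"/"0" strings to integers before storing them
existing_int <- as.integer(flags)
df$existing <- existing_int

existing_int  # 1 1 0 1 1 0
```

Like the one-liner above, this follows row order rather than year, so for person B it marks the 2012 row as the first occurrence. Sorting df by person and created beforehand would change that.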
Answer 3:
Based on the updated information from the OP, we first need to arrange the data by person and year (created) and then use duplicated.
library(dplyr)
df %>%
  arrange(person, created) %>%
  group_by(person) %>%
  mutate(existing = +(!duplicated(location)))
# created person location existing
# <dbl> <fct> <fct> <int>
#1 2009 A London 1
#2 2010 A Geneva 1
#3 2010 A London 0
#4 2011 A New York 1
#5 2011 B London 1
#6 2012 B London 0
Answer 4:
Another option using data.table:
library(data.table)
setDT(df)[order(person, created), existing := c(1L, rep(0L, .N - 1L)), .(person, location)]
Output:
   created person location existing
1:    2009      A   London        1
2:    2010      A   Geneva        1
3:    2010      A   London        0
4:    2011      A New York        1
5:    2012      B   London        0
6:    2011      B   London        1
Source: https://stackoverflow.com/questions/59091030/creating-a-new-variable-based-on-prior-history