Using the dplyr package:
library(dplyr)
data <- data.frame(personal_id = c("111-111-111", "999-999-999", "222-222-222", "111-111-111"),
                   gender = c("M", "F", "M", "M"),
                   temperature = c(99.6, 98.2, 97.8, 95.5))
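First, extract the distinct personal_id values to build a lookup table for the unique ID. Wrapping the column in factor() makes this work whether or not personal_id is already a factor (since R 4.0, data.frame() keeps strings as character by default, so levels() on its own would return NULL):
cases <- data.frame(levels = levels(factor(data$personal_id)))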
Using the row names, you get a unique identifier for each personal_id:
cases <- cases %>%
  mutate(id = rownames(cases))
Result:
       levels id
1 111-111-111  1
2 222-222-222  2
3 999-999-999  3
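The lookup table is only usable as an ID source if each personal_id appears in it exactly once and none is missing. A minimal sketch of that check, assuming the same example data as above (the two stopifnot() checks are an addition for illustration, not part of the original steps):

```r
library(dplyr)

# Same example data as above
data <- data.frame(
  personal_id = c("111-111-111", "999-999-999", "222-222-222", "111-111-111"),
  gender = c("M", "F", "M", "M"),
  temperature = c(99.6, 98.2, 97.8, 95.5)
)

# One row per distinct personal_id, in sorted order
cases <- data.frame(levels = levels(factor(data$personal_id)))
cases <- cases %>%
  mutate(id = rownames(cases))

# Sanity checks: no personal_id is listed twice, and none is missing
stopifnot(!anyDuplicated(cases$levels))
stopifnot(all(data$personal_id %in% cases$levels))
```

If either check fails, the join in the next step would duplicate rows or leave some ids as NA.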
Then join the cases data frame to your data (the result is only printed here; the complete pipeline further down assigns it):
left_join(data, cases, by = c("personal_id" = "levels"))
As the next step in the pipeline, make the ID more distinctive by pasting the generated id together with the gender:
mutate(UID = paste(id, gender, sep=""))
Finally, remove personal_id and the simple id:
select(-personal_id, -id)
And there you go :) Putting it all together:
data <- left_join(data, cases, by = c("personal_id" = "levels")) %>%
  mutate(UID = paste(id, gender, sep = "")) %>%
  select(-personal_id, -id)
Result:
  gender temperature UID
1      M        99.6  1M
2      F        98.2  3F
3      M        97.8  2M
4      M        95.5  1M
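For comparison, the same UID can be sketched without the intermediate lookup table and join, using base R's match() to find each personal_id's position among the sorted distinct values. This is an alternative formulation under the same example data, not the method above; the helper name add_uid is hypothetical.

```r
library(dplyr)

# Hypothetical helper: swaps personal_id for a UID built from a numeric id plus gender
add_uid <- function(df) {
  # Sorted distinct IDs; as.character() so factor and character columns behave alike
  ids <- sort(unique(as.character(df$personal_id)))
  df %>%
    mutate(id = match(as.character(personal_id), ids),  # position in the sorted IDs
           UID = paste0(id, gender)) %>%
    select(-personal_id, -id)
}

data <- data.frame(
  personal_id = c("111-111-111", "999-999-999", "222-222-222", "111-111-111"),
  gender = c("M", "F", "M", "M"),
  temperature = c(99.6, 98.2, 97.8, 95.5)
)

result <- add_uid(data)
print(result)
```

As with the join version, the numeric part depends on the sort order of the personal_id values present, so the UIDs can change if new IDs are added later.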