I have an R dataframe with two levels of data: id and year. Within groups defined by id, the years increase (entire dataset has the sa
subset(df, id %in% sample(levels(df$id), 20))
That assumes your data frame is called df and that id is a factor. If id is not a factor, use unique instead of levels.
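As a rough sketch of both variants, here is the same idea run on a made-up data frame. The toy data (50 ids with 3 years each), the column id_chr, and the names sampled_ids, sub_factor and sub_chr are all hypothetical and only there for illustration.

```r
# Hypothetical toy data: 50 ids with 3 years each (not the asker's real data)
set.seed(1)
df <- data.frame(
  id   = factor(rep(sprintf("id%02d", 1:50), each = 3)),
  year = rep(2001:2003, times = 50)
)

# Factor id: sample 20 of the levels, then keep every row belonging to them
sampled_ids <- sample(levels(df$id), 20)
sub_factor  <- subset(df, id %in% sampled_ids)

# Non-factor id (character or numeric): use unique() instead of levels()
df$id_chr <- as.character(df$id)
sub_chr   <- subset(df, id_chr %in% sample(unique(df$id_chr), 20))

# Optional: drop the unused factor levels left over in the subset
sub_factor <- droplevels(sub_factor)
```

Because the sampling is done on ids and not on rows, every year for a chosen id stays in the result. One thing to watch: levels() also returns levels that no longer have any rows (for example after earlier filtering), so unique() may be the safer choice in that case.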