I've tried to search for an answer, but can't seem to find the right one that does the job for me.
I have a dataset (data) with two variables: people's
For completeness, I am adding the base R solution to @bouncyball's great answer. I will use their synthetic data, but I will use cut to create the age groups before aggregation.
# Create data for plotting
> set.seed(123)
> dat <- data.frame(age = sample(20:50, 200, replace = TRUE),
+                   awards = rpois(200, 3))
# Create a new column containing the age groups
> dat[["ageGroups"]] <- cut(dat[["age"]], c(-Inf, 20, 30, 40, Inf),
+                           right = FALSE)
cut divides a set of numeric data into groups based on the breaks given in the second argument. right = FALSE flips the intervals so that each group includes its lower bound rather than its upper one (i.e. 20 <= x < 30 rather than the default of 20 < x <= 30). The groups do not have to be equally spaced. If you do not want to include data above or below a certain value, simply remove the Inf from the end or the -Inf from the beginning of the breaks, respectively, and the function will return NA for those values instead. If you would like to give your groups names, you can do so with the labels argument.
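As a small illustration of these points, here is a sketch on a made-up vector of ages (the vector and the label names are my own, not from the question). It compares the default intervals with right = FALSE, shows the NA returned for values outside the breaks, and names the groups with labels.

```r
# Hypothetical ages chosen to sit on and around the break points
ages <- c(19, 20, 29, 30, 45, 50)

# Default (right = TRUE): intervals are closed on the right, e.g. (20,30]
cut(ages, c(-Inf, 20, 30, 40, Inf))

# right = FALSE: intervals are closed on the left, e.g. [20,30)
cut(ages, c(-Inf, 20, 30, 40, Inf), right = FALSE)

# Without -Inf and Inf, values outside [20, 40) become NA
trimmed <- cut(ages, c(20, 30, 40), right = FALSE)

# labels needs one name per interval (number of breaks minus one)
named <- cut(ages, c(-Inf, 20, 30, 40, Inf), right = FALSE,
             labels = c("under 20", "20s", "30s", "40 and over"))
```

Here 19, 45 and 50 fall outside the trimmed breaks and come back as NA, while 20 lands in the first trimmed group because the lower bound is included.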
Now we can aggregate based on the groups we created.
> (summedGroups <- aggregate(awards ~ ageGroups, dat, FUN = sum))
ageGroups awards
1 [20,30) 188
2 [30,40) 212
3 [40, Inf) 194
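Note that the empty [-Inf,20) group is missing from the output: with the formula interface, aggregate only reports groups that actually occur in the data. A minimal sketch on a tiny made-up data frame (toy, toySums and toyTapply are hypothetical names), cross-checked against tapply, which keeps the empty level:

```r
# Hypothetical toy data: nobody is under 20, so that group is empty
toy <- data.frame(age = c(20, 25, 31, 38, 44),
                  awards = c(1, 2, 3, 4, 5))
toy[["ageGroups"]] <- cut(toy[["age"]], c(-Inf, 20, 30, 40, Inf),
                          right = FALSE)

# aggregate() sums awards within each group that is present in the data
toySums <- aggregate(awards ~ ageGroups, toy, FUN = sum)

# tapply() computes the same sums but keeps the empty level as NA
toyTapply <- tapply(toy[["awards"]], toy[["ageGroups"]], sum)
```

If you need the empty group to show up in a plot, the tapply result may be the more convenient starting point.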
Finally, we can plot these data using the barplot function. The key here is to use names.arg to label the bars with the age groups.
> barplot(summedGroups[["awards"]], names.arg = summedGroups[["ageGroups"]])
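If you want axis titles or the totals printed on the bars, something along these lines should work. This is only a sketch: toySums is a hypothetical stand-in for summedGroups with made-up values, and the axis titles are my own choice.

```r
# Hypothetical summary table standing in for summedGroups
toySums <- data.frame(ageGroups = c("[20,30)", "[30,40)", "[40, Inf)"),
                      awards = c(3, 7, 5))

# names.arg labels each bar; xlab and ylab are optional axis titles
mids <- barplot(toySums[["awards"]],
                names.arg = toySums[["ageGroups"]],
                xlab = "Age group", ylab = "Total awards")

# barplot() invisibly returns the bar midpoints, which can be used
# to place each total just below the top of its bar
text(mids, toySums[["awards"]], labels = toySums[["awards"]], pos = 1)
```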