Question
I'm trying to group a variable according to its values and plot a histogram of the groups.
For example, this is my data:
r <-c(1,899,1,2525,763,3,2,2,1863,695,9,4,2876,1173,1156,5098,3,3876,1,1,
3023,76336,13,003,9898,1,10,843,10546,617,1375,1,1,5679,1,21,1,13,6,28,1,14088,682)
I want to group r by its value into intervals such as 1-5, 5-10, 10-100, 100-500 and more than 500. Then I want a histogram whose x axis shows those intervals (1-5, 5-10, 10-100, 100-500 and more than 500). How can I do that?
I tried the ggplot2 package with the following code:
ggplot(data=r, aes(x=r))+geom_histogram(breaks = c(1, 5, 10, 100, 500,2000,Inf))
It doesn't work, and R reports "missing value where TRUE/FALSE needed". Also, how can I make all the bins the same width?
Answer 1:
In base R
r <-c(1,899,1,2525,763,3,2,2,1863,695,9,4,2876,1173,1156,5098,3,3876,1,1,5,
3023,76336,13,003,9898,1,10,843,10546,617,1375,1,1,5679,1,21,1,13,6,28,1,14088,682)
cut.vals <- cut(r, breaks = c(1, 5, 10, 100, 500, Inf), right = FALSE)
xy <- data.frame(r, cut = cut.vals)
barplot(table(xy$cut))
Note that I added the xy variable to make it easier to compare how the values were grouped. You can also pass cut.vals directly to barplot(table()).
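The axis labels produced by cut() default to interval notation such as [1,5). As a minimal sketch, assuming you want the plain labels from the question, the labels argument of cut() can supply them. The label strings and the names bin.labels and counts are illustrative choices, not part of the original answer. Because right = FALSE makes each interval closed on the left, a value of exactly 5 lands in "5-10".

```r
r <- c(1,899,1,2525,763,3,2,2,1863,695,9,4,2876,1173,1156,5098,3,3876,1,1,5,
       3023,76336,13,003,9898,1,10,843,10546,617,1375,1,1,5679,1,21,1,13,6,28,1,14088,682)

# Hypothetical label strings, one per interval (five intervals for six breaks)
bin.labels <- c("1-5", "5-10", "10-100", "100-500", "500+")

# Same breaks as above; each bin is [lower, upper) because right = FALSE
cut.vals <- cut(r, breaks = c(1, 5, 10, 100, 500, Inf),
                right = FALSE, labels = bin.labels)

# table() keeps empty levels, so every interval gets a bar
counts <- table(cut.vals)
barplot(counts, xlab = "r (binned)", ylab = "count")
```

Since the bars are drawn from a table of counts rather than from the raw values, they all have the same width regardless of how wide each interval is.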
To use ggplot2, you can pre-calculate all the bins and then plot them:
library(ggplot2)
ggplot(xy, aes(x = cut)) +
  theme_bw() +
  geom_bar() +
  scale_x_discrete(drop = FALSE)  # keep empty bins on the x axis
The parameter most commonly used to control bin size in geom_histogram is binwidth, which is constant for all bins.
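For comparison, here is a rough sketch of what a constant binwidth looks like on this data. The widths 1000 and 0.5, the data frame name df and the log10 axis are illustrative assumptions, not something the original answer proposed. With scale_x_log10(), ggplot2 bins the transformed values, so binwidth is measured in log10 units.

```r
library(ggplot2)

r <- c(1,899,1,2525,763,3,2,2,1863,695,9,4,2876,1173,1156,5098,3,3876,1,1,5,
       3023,76336,13,003,9898,1,10,843,10546,617,1375,1,1,5679,1,21,1,13,6,28,1,14088,682)

# ggplot() expects a data frame, not a bare vector
df <- data.frame(r = r)

# Every bin spans 1000 units on the original scale
p_linear <- ggplot(df, aes(x = r)) +
  geom_histogram(binwidth = 1000)

# Every bin spans 0.5 log10 units; all values are >= 1, so the log is defined
p_log <- ggplot(df, aes(x = r)) +
  geom_histogram(binwidth = 0.5) +
  scale_x_log10()

print(p_linear)
print(p_log)
```

Neither plot reproduces the unequal intervals from the question; for those, pre-computing the bins with cut() as above remains the simpler route.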
Source: https://stackoverflow.com/questions/31289699/group-the-variable-according-to-its-value-and-get-a-histogram