Group the variable according to its value and get a histogram

你离开我真会死。 Submitted on 2020-01-17 15:18:49

Question


I'm trying to group the variable according to its values and get a histogram.

For example, this is my data:

r <-c(1,899,1,2525,763,3,2,2,1863,695,9,4,2876,1173,1156,5098,3,3876,1,1,
     3023,76336,13,003,9898,1,10,843,10546,617,1375,1,1,5679,1,21,1,13,6,28,1,14088,682)

I want to group r by its value into intervals such as 1-5, 5-10, 10-100, 100-500, and more than 500. I then want a histogram whose x axis shows these intervals (1-5, 5-10, 10-100, 100-500, and more than 500). How can I do that?

If I want to use the ggplot2 package, the code is as follows:

ggplot(data=r, aes(x=r))+geom_histogram(breaks = c(1, 5, 10, 100, 500,2000,Inf))

It doesn't work, and R says "missing value where TRUE/FALSE needed". Also, how can I make all the bins the same width?


Answer 1:


In base R

r <-c(1,899,1,2525,763,3,2,2,1863,695,9,4,2876,1173,1156,5098,3,3876,1,1,5,
      3023,76336,13,003,9898,1,10,843,10546,617,1375,1,1,5679,1,21,1,13,6,28,1,14088,682)
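# right = FALSE makes the intervals left-closed: [1,5), [5,10), ..., [500,Inf)
cut.vals <- cut(r, breaks = c(1, 5, 10, 100, 500, Inf), right = FALSE)
# Keep the raw values next to their bins for easy inspection
xy <- data.frame(r, cut = cut.vals)
# Count the values in each bin and draw one bar per bin
barplot(table(xy$cut))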

Note that I added the xy data frame only to make it easy to compare how the values were grouped. You can also pass cut.vals directly, as in barplot(table(cut.vals)).
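The question asks for axis labels such as 1-5 and 5-10 rather than the default [1,5) notation. A minimal sketch of one way to get them, using the labels argument of cut: the label strings and the names breaks, bin.labels and counts are assumptions made for this illustration, not part of the original answer. Because right = FALSE is kept, a value of exactly 5 still falls in the "5-10" bin.

```r
# Same data as in the answer above
r <- c(1,899,1,2525,763,3,2,2,1863,695,9,4,2876,1173,1156,5098,3,3876,1,1,5,
       3023,76336,13,003,9898,1,10,843,10546,617,1375,1,1,5679,1,21,1,13,6,28,1,14088,682)

breaks <- c(1, 5, 10, 100, 500, Inf)
# Hypothetical label text: one label per interval, i.e. length(breaks) - 1
bin.labels <- c("1-5", "5-10", "10-100", "100-500", "500+")

# labels replaces the default "[1,5)" style level names
cut.vals <- cut(r, breaks = breaks, labels = bin.labels, right = FALSE)

# table() keeps empty levels, so "100-500" still appears with a count of 0
counts <- table(cut.vals)
print(counts)

barplot(counts, xlab = "Interval", ylab = "Count")
```

Since table() reports every factor level, the empty 100-500 interval keeps its place on the axis without any extra work.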

To use ggplot2, you can pre-calculate the bins and plot the counts:

library(ggplot2)

ggplot(xy, aes(x = cut)) +
  theme_bw() +
  geom_bar() +
  scale_x_discrete(drop = FALSE)  # keep empty bins on the axis

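In geom_histogram, the parameter most commonly used to control bin size is binwidth, which applies one constant width to every bin. Pre-binning with cut and plotting with geom_bar, as above, draws every interval as a bar of equal width.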
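To see the difference between the two approaches side by side, here is a rough sketch that assumes ggplot2 is installed. The names df, upper, brks, p_hist and p_bar are made up for this example, and replacing Inf with max(r) + 1 is an assumption of the sketch (a finite stand-in for the open-ended last bin), not something the original answer does.

```r
library(ggplot2)

r <- c(1,899,1,2525,763,3,2,2,1863,695,9,4,2876,1173,1156,5098,3,3876,1,1,5,
       3023,76336,13,003,9898,1,10,843,10546,617,1375,1,1,5679,1,21,1,13,6,28,1,14088,682)
df <- data.frame(r = r)

# Assumption: use a finite upper edge instead of Inf for the last bin
upper <- max(r) + 1
brks <- c(1, 5, 10, 100, 500, upper)

# Continuous x axis: each bar is as wide as its interval,
# so the last bin dominates the plot
p_hist <- ggplot(df, aes(x = r)) +
  geom_histogram(breaks = brks, closed = "left")

# Discrete x axis: bin first, then count, so every bar has the same width
df$bin <- cut(df$r, breaks = c(1, 5, 10, 100, 500, Inf), right = FALSE)
p_bar <- ggplot(df, aes(x = bin)) +
  geom_bar() +
  scale_x_discrete(drop = FALSE)

print(p_hist)
print(p_bar)
```

The first plot keeps a true numeric axis, which is why its narrow bins are barely visible next to the huge last one; the second treats the intervals as categories, which is what the question asks for.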



Source: https://stackoverflow.com/questions/31289699/group-the-variable-according-to-its-value-and-get-a-histogram
