How can I create a histogram from aggregated data in R?

一向 2020-12-10 05:30

I have a data frame that has a format like the following:

Month       Frequency
2007-08     2
2010-11     5
2011-01     43
2011-02     52
2011-03     31
2011         


        
4 Answers
  • 2020-12-10 06:10

    Yes, rep-based solutions waste too much memory in most large or otherwise interesting cases. The HistogramTools package on CRAN includes an efficient PreBinnedHistogram function, which creates a base R histogram object directly from a set of breaks and bin counts, the same form of data the original question provides.
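
    To illustrate what "a histogram object built directly from breaks and counts" means, here is a minimal base R sketch. It does not use HistogramTools and is not that package's implementation; it assembles by hand the fields that hist() returns. The numeric bin edges (0 to 5) and the bin label are assumptions made for the example; only the five counts come from the question.

    ```r
    # Assumed bin edges (not from the question); counts taken from the question's table.
    breaks <- 0:5
    counts <- c(2, 5, 43, 52, 31)

    # Build an object with the same fields that hist() returns,
    # without expanding the data into one value per observation.
    h <- structure(
      list(
        breaks   = breaks,
        counts   = counts,
        density  = counts / (sum(counts) * diff(breaks)),
        mids     = breaks[-1] - diff(breaks) / 2,
        xname    = "bin",
        equidist = TRUE
      ),
      class = "histogram"
    )

    # plot() dispatches to the histogram method for this class.
    plot(h, main = "Histogram from pre-binned counts")
    ```

    Memory use here depends on the number of bins, not on the total count, which is the advantage over replicating the data.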

  • 2020-12-10 06:17

    To get this kind of flexibility, you may have to replicate your data. Here is one way of doing it with rep:

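    set.seed(1)  # make the random example reproducible
    n <- 10
    dat <- data.frame(
        x = sort(sample(1:50, n)),   # the distinct values
        f = sample(1:100, n))        # how often each value occurs
    dat
    
    # Repeat each row f times, giving one row per observation
    expdat <- dat[rep(1:n, times=dat$f), "x", drop=FALSE]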
    

    Now you have your data replicated in the data.frame expdat, allowing you to call hist with different numbers of bins:

    par(mfcol=c(1, 2))
    hist(expdat$x, breaks=50, col="blue", main="50 bins")
    hist(expdat$x, breaks=5, col="blue", main="5 bins")
    par(mfcol=c(1, 1))
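
    The same replication idea can be sketched on data shaped like the question's table. This is only an illustration under assumptions: the data frame is retyped from the five complete rows of the question, each month is represented by its first day, and yearly bins are chosen arbitrarily to show that the bin width is now free to change.

    ```r
    # The five complete rows from the question (the truncated last row is omitted).
    df <- data.frame(
      Month     = c("2007-08", "2010-11", "2011-01", "2011-02", "2011-03"),
      Frequency = c(2, 5, 43, 52, 31)
    )

    # Assumption: represent each month by its first day so it becomes a Date.
    dates <- as.Date(paste0(df$Month, "-01"))

    # One Date per observation, repeated according to Frequency.
    expanded <- rep(dates, times = df$Frequency)

    # Re-bin by calendar year; right = FALSE keeps 1 January in its own year.
    hist(expanded, breaks = "years", freq = TRUE, right = FALSE,
         main = "Observations per year", xlab = "Year")
    ```

    Changing breaks to "months" or "quarters" re-bins the same expanded vector without touching the data.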
    


  • 2020-12-10 06:18

    Take a look at ggplot2.

    If your data is in a data.frame called df:

    library(ggplot2)
    ggplot(df,aes(x=Month,y=Frequency))+geom_bar(stat='identity')
    

    Or, if you want a continuous time axis:

    df$Month<-as.POSIXct(paste(df$Month, '01', sep='-'),format='%Y-%m-%d')
    ggplot(df,aes(x=Month,y=Frequency))+geom_bar(stat='identity')
    
  • 2020-12-10 06:20

    Another possibility is to scale down your frequency variable by some large factor so that rep doesn't have as much work to do. Then adjust the vertical axis scale of the histogram by that same factor.
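
    A minimal sketch of this idea, under assumptions: the data frame, the scale factor k, and the break points below are all hypothetical, and the vertical axis is "adjusted" by multiplying the counts stored in the histogram object back up by k before plotting. Rounding the scaled frequencies loses precision whenever they are not exact multiples of k.

    ```r
    # Hypothetical aggregated data with large frequencies.
    dat <- data.frame(
      x = c(1, 2, 3, 4, 5),
      f = c(2000, 5000, 43000, 52000, 31000)
    )

    k <- 1000                        # assumed scale factor
    scaled <- round(dat$f / k)       # much smaller replication counts

    # Replicate each value only scaled times instead of f times.
    expanded <- rep(dat$x, times = scaled)

    # Compute the histogram without plotting, then restore the original scale.
    h <- hist(expanded, breaks = seq(0.5, 5.5, by = 1), plot = FALSE)
    h$counts <- h$counts * k

    plot(h, main = "Histogram with rescaled counts", xlab = "x")
    ```

    Here rep creates 133 values rather than 133,000, while the plotted counts match the original frequencies.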
