Force R to plot histogram as probability (relative frequency)

面向向阳花 2021-01-31 05:40

I am having trouble plotting a histogram as a pdf (probability)

I want the sum of all the pieces to equal an area of one so it's easier to compare across datasets. For

5 Answers
  •  遥遥无期
    2021-01-31 06:11

    The default number of breaks is around log2(N), where N is 6 million in your case, so it should be about 22. If you're only seeing 4 breaks, that could be because you have xlim in your call. xlim doesn't change the underlying histogram; it only affects which part of it is plotted. If you do

    h <- hist(data[,1], freq=FALSE, breaks=800)
    sum(h$density * diff(h$breaks))
    

    you should get a result of 1.
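To try the check above without the original data, here is a minimal sketch that uses simulated values as a stand-in for data[,1] (the vector x and its distribution are assumptions, not the asker's data). It also computes per-bin relative frequencies, which is one way to get bar heights that sum to 1 and so can never exceed 1, as opposed to densities, whose area sums to 1.

```r
# Simulated stand-in for data[,1]; the real data are not shown in the question.
set.seed(1)
x <- rnorm(10000, mean = 50, sd = 10)

# plot = FALSE computes the histogram object without drawing it.
# breaks = 800 is only a suggestion; hist() picks "pretty" break points near it.
h <- hist(x, breaks = 800, plot = FALSE)

# Total area: bar height (density) times bar width, summed over all bins.
area <- sum(h$density * diff(h$breaks))

# Relative frequency per bin: the share of observations falling in each bin.
rel_freq <- h$counts / sum(h$counts)

print(area)
print(sum(rel_freq))
print(max(rel_freq))
```

If relative frequencies are what you want on the y-axis, one option is to overwrite h$counts with rel_freq and call plot(h, freq = TRUE, ylab = "Relative frequency"); that is a workaround rather than a built-in mode of hist().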


    The density of your data is related to its units of measurement; therefore you want to make sure that "no bin height should be above 1.0" is actually meaningful. For example, suppose we have a bunch of measurements in feet. We plot the histogram of the measurements as a density. We then convert all the measurements to inches (by multiplying by 12) and do another density-histogram. The height of the density will be 1/12th of the original even though the data is essentially the same. Similarly, you could make your bin heights all less than 1 by multiplying all your numbers by 15.
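The feet-versus-inches point can be checked numerically. The sketch below uses made-up uniform measurements (the vectors feet and inches and the break points are assumptions chosen for illustration) and bins both versions with matching breaks, so the only difference between the two histograms is the unit.

```r
# Hypothetical measurements in feet, then the same data expressed in inches.
set.seed(42)
feet <- runif(1000, min = 0, max = 10)
inches <- feet * 12

# Matching breaks: each bin in inches covers the same observations as in feet.
breaks_feet <- seq(0, 10, by = 0.5)
h_feet <- hist(feet, breaks = breaks_feet, plot = FALSE)
h_inches <- hist(inches, breaks = breaks_feet * 12, plot = FALSE)

# Bin-by-bin ratio of density heights; each bin is 12 times wider in inches.
ratio <- h_inches$density / h_feet$density

# Both histograms still enclose a total area of 1.
area_feet <- sum(h_feet$density * diff(h_feet$breaks))
area_inches <- sum(h_inches$density * diff(h_inches$breaks))

print(range(ratio))
print(c(area_feet, area_inches))
```

The counts in each bin are the same in both histograms, so the heights differ only because density divides by bin width. That is why a density above 1.0 is not an error in itself: it just means the bins are narrow in the chosen units.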

    Does the value 1.0 have some kind of significance?
