geom_density doesn't fill correctly with scale_y_log10

Submitted by 限于喜欢 on 2019-12-24 14:08:41

Question


Code:

require(ggplot2)
set.seed(0)
xvar <- rnorm(100)
ggplot(data.frame(xvar), aes(xvar)) + geom_density(fill="lightblue") + scale_y_log10()

The graph is something like this:

How can I make the fill appear on the correct side of (i.e. below) the density estimate?


Answer 1:


The problem is that stat_density by default fills between the density curve and the y=0 line of the transformed data. Any transformation that moves the y=0 line will therefore run into problems of this sort. I personally think this is a bug in ggplot2, although graphical grammar experts would probably argue that y-transformed densities are meaningless, so the bug may not get a lot of attention.
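To make that concrete, here is a minimal base-R sketch (not part of the original answer) that estimates the same density outside ggplot2 and checks where the curve sits relative to zero on the log10 scale. It assumes the defaults of stats::density(), which may differ slightly from what stat_density uses.

```r
set.seed(0)
xvar <- rnorm(100)

# Kernel density estimate with base R defaults
# (assumption: close enough to stat_density's defaults for illustration)
d <- density(xvar)

# Largest estimated density value for this sample
peak <- max(d$y)

# On a log10 axis, the zero line of the transformed data is log10(y) = 0,
# which corresponds to y = 1 in density units
baseline_in_data_units <- 10^0

# log10 of every density value, to compare against that zero line
log_y <- log10(d$y)

print(peak)
print(all(log_y < 0))
```

If every log10(density) value is negative, the whole curve lies below the transformed zero line, which would be consistent with the fill landing above the curve instead of below it.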

A very kludgy workaround is to manually add an offset to ..density.., which you will have to explicitly invoke, and then change the breaks to make it look like you didn't do anything weird.

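library(ggplot2)
library(scales)
set.seed(0)
xvar <- rnorm(100000)
# quartz() opens a macOS-only graphics device; use another device elsewhere
quartz(height=4, width=6)
# Plot log10(density) shifted up by 4, so the fill baseline (y = 0) sits at density 1e-4
# (..density.. is the older spelling of after_stat(density))
ggplot(data.frame(xvar), aes(x=xvar, y=log10(..density..)+4)) + 
    geom_density(fill='lightblue') +
    # Relabel the shifted axis so it reads as density on a log scale
    scale_y_continuous(breaks=c(0,1,2,3,4), 
        labels=c('0.0001', '0.001', '0.01', '0.1','1'), limits=c(0,4),
        name='density')
quartz.save('StackOverflow_29111741_v2.png')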

That code produces this graph:
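A different way to get at the same idea, sketched here as an assumption rather than as the answerer's method, is to compute the density outside ggplot2 and draw the fill with geom_ribbon, so the lower edge of the fill is an explicit positive value instead of the transformed zero line. The column names and the choice of baseline (one tenth of the smallest density value) are hypothetical and only for illustration.

```r
library(ggplot2)

set.seed(0)
xvar <- rnorm(100)

# Estimate the density with base R so the fill baseline is under our control
d <- density(xvar)
dens <- data.frame(x = d$x, y = d$y)

# Drop any zero values, which cannot be shown on a log axis
dens <- dens[dens$y > 0, ]

# Hypothetical baseline: a positive value below the smallest density
dens$ymin <- min(dens$y) / 10

p <- ggplot(dens, aes(x = x)) +
  geom_ribbon(aes(ymin = ymin, ymax = y), fill = "lightblue") +
  geom_line(aes(y = y)) +
  scale_y_log10(name = "density")
print(p)
```

Because ymin and ymax are both positive, the log transformation applies to each edge of the ribbon and the axis labels need no manual relabeling.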




Answer 2:


This isn't a ggplot2 or even an R issue; it is simply that the tails of a probability distribution are undersampled at your sample size. The log axis can go down forever, taking infinitely long to "reach" zero, but no finite sample can ever hope to cover the increasingly improbable regions of the distribution.

So, to make the plot pretty, you need to both (a) increase the number of points from 100 to 10,000 or higher, while (b) keeping the plot's y limits the same. (Otherwise the extra data you draw in your rnorm call will sparsely populate the tails of the Gaussian even farther away from the mean, convincing ggplot2 to set the automatic y axis limits even lower, in the range of the poorly sampled tails, and the noisiness that you don't like will return.)

require(ggplot2)
require(scales)
set.seed(0)
xvar <- rnorm(100000)
ggplot(data.frame(xvar), aes(xvar)) + 
    geom_density(fill="lightblue") + 
    scale_y_continuous(trans=log10_trans(), limits = c(0.01, 1))

This generates this plot, which I think is what you want.
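The effect of sample size on the tails can be checked with a small base-R sketch (an illustration, not part of the original answer). It assumes stats::density() with its defaults as a stand-in for the estimate ggplot2 draws, and it reuses the first 100 draws of the large sample as the small sample.

```r
set.seed(0)
x_big <- rnorm(100000)
x_small <- x_big[1:100]   # first 100 draws of the same stream

d_small <- density(x_small)
d_big <- density(x_big)

# How far each sample reaches into the tails
reach_small <- max(abs(x_small))
reach_big <- max(abs(x_big))

# Smallest estimated density value for each sample
min_small <- min(d_small$y)
min_big <- min(d_big$y)

print(c(reach_small = reach_small, reach_big = reach_big))
print(c(min_small = min_small, min_big = min_big))
```

With automatically chosen y limits, the lower limit would follow these minima, which is why the code above pins limits = c(0.01, 1).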



Source: https://stackoverflow.com/questions/29111741/geom-density-doesnt-fill-correctly-with-scale-y-log10
