Question
In psychology, it's common to display histograms with an overlaid normal curve. Also showing the density of the observed values with geom_line would make comparison to the normal curve easier, so I wrote another histogram function that does this (powerHist in the userfriendlyscience package). However, it is very slow for large vectors (I'm currently working with 16.7 million data points), so I'm trying to make it faster. I used to call density to compute the density estimates manually, and then multiply them by the maximum number of data points in a bin to scale the curve to match the histogram.
But this is very slow, and I figured ggplot2 should be able to do this itself. One of the variables computed by stat_density is ..scaled.., the density estimate scaled to a maximum of 1. Now I just need to multiply this by my scaling factor. But ggplot2 won't find the variable I use. Multiplying by a constant works fine, but whether or not I place the variable in the data frame I pass to ggplot2 doesn't seem to matter: ggplot2 can't find it.
library(ggplot2)

scalingFactor <- max(table(cut(mtcars$mpg, breaks = 20)))
dat <- data.frame(mpg = mtcars$mpg,
                  scalingFactor = scalingFactor)
ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(bins = 20) +
  geom_line(aes(y = ..scaled.. * scalingFactor),
            stat = 'density', color = 'red')
This yields:
Error in eval(expr, envir, enclos) : object 'scalingFactor' not found
When I replace scalingFactor with a literal number, it works:
ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(bins = 20) +
  geom_line(aes(y = ..scaled.. * 10),
            stat = 'density', color = 'red')
Using scalingFactor on its own also works:
ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(bins = 20) +
  geom_line(aes(y = scalingFactor),
            stat = 'density', color = 'red')
So scalingFactor seems to be available, multiplication is available, and clearly ..scaled.. is available. Still, combining them fails. What am I missing here? I can't find anything on "computation with variables generated by a stat" or the like.
Has anybody run into this before? Is this known ggplot2 behavior that I just missed?
Answer 1:
Try aes_q(y = bquote(..scaled.. * .(scalingFactor))).
(I suspect there is a bug somewhere, though, since the environment argument described in ?ggplot suggests this shouldn't be needed, and in fact it isn't needed for variables that don't come from a stat.)
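Applied to the example from the question, the suggestion might look like the sketch below. The name yExpr is introduced here only for illustration. The idea is that bquote() replaces .(scalingFactor) with its current value before ggplot2 ever sees the expression, so the stat only has to resolve ..scaled.. plus a plain number. This assumes a ggplot2 version contemporary with the question; in more recent releases aes_q() and the ..name.. notation are deprecated (after_stat() is the newer way to refer to computed variables), so expect warnings there.

```r
library(ggplot2)

# Largest bin count, computed the same way as in the question.
scalingFactor <- max(table(cut(mtcars$mpg, breaks = 20)))

# bquote() substitutes the *value* of scalingFactor for .(scalingFactor),
# leaving ..scaled.. as an unevaluated symbol for the stat to fill in.
# yExpr is a hypothetical helper name, not part of the original answer.
yExpr <- bquote(..scaled.. * .(scalingFactor))
print(yExpr)

# aes_q() accepts the already-quoted expression as the y aesthetic.
p <- ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(bins = 20) +
  geom_line(aes_q(y = yExpr), stat = "density", color = "red")
print(p)
```

Because the number is baked into the expression at the time bquote() runs, changing scalingFactor afterwards has no effect on the plot unless yExpr is rebuilt.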
Source: https://stackoverflow.com/questions/39544577/how-can-i-transform-aesthetics-on-the-fly-in-ggplot-using-variables-inside-or