Question
I would like to increase the speed of plotting, and I am happy with (and have lots of code requiring) the R graphics and ggplot packages - so I am only interested in knowing how I can configure my system to speed up plotting.
Specifically:
- Is the speed of plotting in R limited by the processor, memory, or graphics card?
- Are there particular hardware components or configurations that would increase plotting speed?
Update: Answers to questions in comments:
Specs: Ubuntu 11.04, Intel Core Duo, 8GB RAM, but I am more generally interested in whether the graphical computation or the graphical rendering is limiting, and if so, how I can use this information.
My plots have lots of objects, but I have no idea what the computational costs of plotting are. I don't do any specific analyses while plotting (I am plotting after completing any required analyses), although I understand that some computation is done 'on the fly', as when plotting a smoothed line or even translating data into locations.
Answer 1:
Unless you have computationally intensive single plots, a great way to speed up multiple plotting is with parallel processing. For example, suppose you have a dataframe and you want to break it down by a certain variable (or variables) and do plots for each partition.
There are many ways to register a parallel backend, so I won't go into that. See, for example, this vignette: http://cran.r-project.org/web/packages/doSMP/vignettes/gettingstartedSMP.pdf
Then check out the function ddply in Hadley's plyr package and use the .parallel = TRUE option. That's basically it. Then just do plotting normally.
Here's a self-contained example:
# This is the particular library I chose to register a parallel backend. There are others. See the new "Parallel R" book for details.
library(doMC)
registerDoMC()
getDoParWorkers() # This lists how many workers you have (hopefully more than 1!)
library(plyr)    # provides ddply
library(ggplot2)
ddply(
  mtcars, .variables = "vs", .fun = function(x) {
    # do your plotting now
    example_plot <- ggplot(x, aes(y = mpg, x = wt)) + geom_point() + geom_smooth(se = FALSE)
    # save your plot, named after the level of vs in this partition
    ggsave(paste(x$vs[1], ".pdf", sep = ""), example_plot)
  },
  .parallel = TRUE
)
This will save two files, 0.pdf and 1.pdf, named after the levels (i.e. the unique values) of the vs variable of the mtcars dataframe. If you broke it down by a variable holding country names, the saved files would be named after the countries.
[Figure: the plots saved as 0.pdf and 1.pdf]
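As a rough sketch of the same split-and-plot idea without plyr, the parallel package that ships with R could be used instead. This is not the approach from the answer above: plot_piece, the vs_ file prefix, writing to tempdir() and the choice of two cores are all assumptions made for illustration, and base graphics stand in for ggplot2 to keep the example free of extra dependencies.

```r
library(parallel)

# Split the data into one data frame per level of vs
pieces <- split(mtcars, mtcars$vs)

# Hypothetical helper: draw one partition to its own PDF and return the path
plot_piece <- function(d, out_dir = tempdir()) {
  path <- file.path(out_dir, paste0("vs_", d$vs[1], ".pdf"))
  pdf(path)
  on.exit(dev.off())  # close the device even if plotting fails
  plot(d$wt, d$mpg, xlab = "wt", ylab = "mpg", main = paste("vs =", d$vs[1]))
  path
}

# mclapply relies on forking, which is unavailable on Windows,
# so fall back to a single core there (assumed: 2 cores elsewhere)
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L
files <- unlist(mclapply(pieces, plot_piece, mc.cores = n_cores))
print(files)
```

Each worker opens and closes its own graphics device, so the partitions do not interfere with each other. Whether this is actually faster depends on how expensive each individual plot is compared with the cost of starting the workers.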
Answer 2:
As @Xu Wang points out, you can use parallelization to draw several plots at once.
So hardware-wise, a fast multi-core machine with plenty of RAM would help a bit.
If you want to plot a single plot with, say, 1 million circles in an x-y plot (scatter plot), then graphics hardware acceleration would be very beneficial.
But a fast graphics card only helps if the graphics devices in R support hardware acceleration. Currently they do not, and as @hadley points out, ggplot uses the standard graphics devices.
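To get a feel for whether computation or rendering dominates for a particular ggplot, the stages can be timed separately. The following is only a sketch under assumptions: the simulated data, the choice of 1e5 points and the pdf device are illustrative, and the timings will differ between machines, devices and ggplot2 versions.

```r
library(ggplot2)
library(grid)

# Simulated data standing in for a plot with many objects
n <- 1e5
d <- data.frame(x = rnorm(n), y = rnorm(n))
p <- ggplot(d, aes(x, y)) + geom_point()

# Stage 1: compute statistics, scales and layout data (no drawing yet)
t_build <- system.time(built <- ggplot_build(p))[["elapsed"]]

# Stage 2: turn the built plot into grid graphical objects
t_gtable <- system.time(gt <- ggplot_gtable(built))[["elapsed"]]

# Stage 3: render the graphical objects on a device (a PDF file here)
path <- tempfile(fileext = ".pdf")
pdf(path)
t_draw <- system.time({
  grid.newpage()
  grid.draw(gt)
})[["elapsed"]]
dev.off()

print(c(build = t_build, gtable = t_gtable, draw = t_draw))
```

If the draw stage takes most of the time, the graphics device is the likely bottleneck and trying another device may be worthwhile. If the build stage dominates, the cost lies in R-level computation, where a faster CPU or plotting less data would help more.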
The rgl package apparently uses OpenGL to do 3D graphics. Haven't tried it though. You might be able to use it to draw some plots more efficiently...
I have some experience creating fast interactive hardware-accelerated plots (2D and 3D), and it can be orders of magnitude faster. The 2D plots are actually harder to accelerate than the 3D ones... Probably not an easy thing to plug into R's current graphics device concept though.
UPDATE: I just tried rgl and its plot3d with 1 million points. It is fully interactive (small fractions of a second to update) on my (rather powerful) laptop.
library(rgl)
x <- sort(rnorm(1e6))
y <- rnorm(1e6)
z <- rnorm(1e6) + atan2(x,y)
plot3d(x, y, z, col=rainbow(1000))
Source: https://stackoverflow.com/questions/8364288/what-hardware-limits-plotting-speed-in-r