The ggplot2 package is easily the best plotting system I ever worked with, except that the performance is not really good for larger datasets (~50k points). I'm lo
Hadley gave a cool talk about his new packages dplyr and ggvis at useR! 2013. But he can probably tell you more about that himself.
I'm not sure what your application design looks like, but I often do in-database pre-processing before feeding the data to R. For example, if you are plotting time series, there is really no need to show every second of the day on the X axis. Instead you might want to aggregate and get the min/max/mean over, e.g., one- or five-minute intervals.
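To make the idea concrete, here is a minimal sketch of the same aggregation done in plain R rather than in the database. The data are simulated and the names `raw`, `agg`, and `width` are made up for illustration; the point is only that a day of per-second readings collapses to a few hundred rows that ggplot2 handles easily.

```r
# Simulated data: one day of per-second readings (made up for illustration).
set.seed(42)
n <- 24 * 60 * 60
raw <- data.frame(
  time  = as.POSIXct("2013-07-01 00:00:00", tz = "UTC") + seq_len(n) - 1,
  value = cumsum(rnorm(n))
)

# Map every observation to the start of its five-minute bucket (epoch seconds).
width  <- 5 * 60
bucket <- floor(as.numeric(raw$time) / width) * width

# min/mean/max per bucket; tapply orders groups by ascending bucket value,
# which matches sort(unique(bucket)).
agg <- data.frame(
  time = as.POSIXct(sort(unique(bucket)), origin = "1970-01-01", tz = "UTC"),
  ymin = as.vector(tapply(raw$value, bucket, min)),
  yavg = as.vector(tapply(raw$value, bucket, mean)),
  ymax = as.vector(tapply(raw$value, bucket, max))
)

# 86400 rows are now 288. If ggplot2 is installed, build a ribbon for the
# min/max envelope with the mean drawn on top; print(p) draws it.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(agg, aes(x = time)) +
    geom_ribbon(aes(ymin = ymin, ymax = ymax), alpha = 0.3) +
    geom_line(aes(y = yavg))
}
```

Doing this in R still means pulling every raw row out of the database first, which is why pushing the aggregation into SQL usually pays off for larger tables.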
Below is an example of a function I wrote years ago that did something like that in SQL. This particular example uses integer division by the interval length because times were stored as epoch milliseconds. If the data in SQL are properly stored as date/datetime types, SQL has more elegant native methods to aggregate by time period.
#' @param table name of the table
#' @param start start time/date
#' @param end end time/date
#' @param aggregate one of "days", "hours", "mins" or "weeks"
#' @param group grouping variable
#' @param column name of the target column (y axis)
#' @export
minmaxdata <- function(table, start, end, aggregate=c("days", "hours", "mins", "weeks"), group=1, column){
  # convert start/end to epoch milliseconds
  start <- round(unclass(as.POSIXct(start))*1000)
  end <- round(unclass(as.POSIXct(end))*1000)
  # validate the aggregation level
  aggregate <- match.arg(aggregate)
  # interval length in milliseconds
  mod <- switch(aggregate,
    "mins" = 1000*60,
    "hours" = 1000*60*60,
    "days" = 1000*60*60*24,
    "weeks" = 1000*60*60*24*7,
    stop("invalid aggregate value")
  )
  # we need to add the time difference between GMT and local time (PST here)
  # to make the division line up with local midnight; units are set explicitly
  # because difftime otherwise chooses them automatically
  offset <- as.numeric(difftime(as.POSIXct(format(Sys.time(), tz="GMT")), Sys.time(), units="hours"))
  delta <- 1000 * 60 * 60 * (24 - offset)
  # form query (table, group and column are pasted in unescaped: trusted input only)
  query <- paste("SELECT", group, "AS grouping, AVG(", column, ") AS yavg, MAX(", column, ") AS ymax, MIN(", column, ") AS ymin, ((CMilliseconds_g +", delta, ") DIV", mod, ") AS timediv FROM", table, "WHERE CMilliseconds_g BETWEEN", start, "AND", end, "GROUP BY", group, ", timediv;")
  # getquery() is my own helper that runs the query and returns a data frame
  mydata <- getquery(query)
  # convert the bucket index back to a timestamp (epoch seconds)
  mydata$time <- structure(mod*mydata[["timediv"]]/1000 - delta/1000, class=c("POSIXct", "POSIXt"))
  mydata$grouping <- as.factor(mydata$grouping)
  # round timestamps
  if(aggregate %in% c("mins", "hours")){
    # round() returns POSIXlt, so convert back for use in a data frame column
    mydata$time <- as.POSIXct(round(mydata$time, aggregate))
  } else {
    mydata$time <- as.Date(mydata$time)
  }
  return(mydata)
}
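
For comparison, here is a rough sketch of what the query could look like when the timestamp is stored as a proper DATETIME column. It assumes MySQL (the function above already uses MySQL's `DIV`) and its `DATE_FORMAT` function. The helper `build_agg_query` and the table and column names `readings`, `temp`, and `created_at` are hypothetical, not part of my original code. It only builds the query string; running it is left to whatever database helper you use.

```r
# Hypothetical helper: builds (but does not run) a MySQL query that buckets
# a DATETIME column by truncating it with DATE_FORMAT.
# Identifiers are interpolated unescaped, so pass trusted input only.
build_agg_query <- function(table, column, timecol,
                            aggregate = c("days", "hours", "mins")) {
  aggregate <- match.arg(aggregate)
  # MySQL format string that zeroes out everything below the chosen unit
  fmt <- switch(aggregate,
    mins  = "%Y-%m-%d %H:%i:00",
    hours = "%Y-%m-%d %H:00:00",
    days  = "%Y-%m-%d"
  )
  sprintf(
    paste(
      "SELECT DATE_FORMAT(%s, '%s') AS bucket,",
      "AVG(%s) AS yavg, MAX(%s) AS ymax, MIN(%s) AS ymin",
      "FROM %s GROUP BY bucket;"
    ),
    timecol, fmt, column, column, column, table
  )
}

# Example with made-up table and column names
q <- build_agg_query("readings", "temp", "created_at", aggregate = "hours")
cat(q, "\n")
```

No time zone arithmetic is needed here because the database truncates the stored wall-clock value directly, which is the main reason this form is easier to get right than the epoch-millisecond version.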