Question
I have some SQL queries that basically parse a dataset by time (POSIXct date format):
library(sqldf)
data_2013 <- sqldf("SELECT * FROM data WHERE strftime('%Y-%m-%d', time,
'unixepoch', 'localtime') >= '2013-01-01' AND strftime('%Y-%m-%d', time,
'unixepoch', 'localtime') <= '2013-12-31'")
data_2012 <- sqldf("SELECT * FROM data WHERE strftime('%Y-%m-%d', time,
'unixepoch', 'localtime') >= '2012-01-01' AND strftime('%Y-%m-%d', time,
'unixepoch', 'localtime') <= '2012-12-31'")
data_2011 <- sqldf("SELECT * FROM data WHERE strftime('%Y-%m-%d', time,
'unixepoch', 'localtime') >= '2011-01-01' AND strftime('%Y-%m-%d', time,
'unixepoch', 'localtime') <= '2011-12-31'")
However, this code seems very clumsy to me. Is there a neat way of wrapping this up into a function or some other way of making it shorter, while still spitting out the same 3 separate datasets?
Answer 1:
With paste0 you can achieve this:
sqlfun <- function(startdate, stopdate) {
  sqldf(paste0(
    "SELECT * FROM data WHERE strftime('%Y-%m-%d', time, 'unixepoch', 'localtime') >= '",
    startdate,
    "' AND strftime('%Y-%m-%d', time, 'unixepoch', 'localtime') <= '",
    stopdate, "'"
  ))
}
sqlfun('2013-01-01', '2013-12-31')
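To get the three separate datasets the question asks for, the helper can be called once per year. The following is a minimal sketch, not part of the original answer: the sample `data` frame, the `years` vector and the `datasets` list are hypothetical names introduced for illustration, and `sqlfun` is repeated so the block runs on its own.

```r
library(sqldf)

# Hypothetical sample data: mid-year timestamps, so the 'localtime'
# conversion cannot move a row into a neighbouring year.
data <- data.frame(
  time  = as.POSIXct(c("2011-06-15 12:00:00", "2012-06-15 12:00:00",
                       "2013-06-15 12:00:00"), tz = "UTC"),
  value = c(10, 20, 30)
)

# Same helper as in the answer above.
sqlfun <- function(startdate, stopdate) {
  sqldf(paste0(
    "SELECT * FROM data WHERE strftime('%Y-%m-%d', time, 'unixepoch', 'localtime') >= '",
    startdate,
    "' AND strftime('%Y-%m-%d', time, 'unixepoch', 'localtime') <= '",
    stopdate, "'"
  ))
}

# Run the query once per year and collect the results in a list.
years <- 2011:2013
datasets <- Map(sqlfun, paste0(years, "-01-01"), paste0(years, "-12-31"))
names(datasets) <- years   # label each data frame by its year

# Individual years can then be pulled out by name.
data_2013 <- datasets[["2013"]]
```

Keeping the results in a named list rather than three loose variables makes it easy to add further years later by extending `years`.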
Answer 2:
between and fn$
Use between and factor out the strftime expression, prefacing sqldf with fn$ to perform string interpolation:
Time <- "strftime('%Y-%m-%d', time, 'unixepoch', 'localtime')"
st <- '2013-01-01'
en <- '2013-12-31'
fn$sqldf("select * from data where $Time between '$st' AND '$en' ")
If desired, this could readily be made into a function, as could the remaining solutions.
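As a rough sketch of what such a function might look like (the name `between_dates` and the sample `data` frame are hypothetical, and `data` is assumed to be visible from the global environment):

```r
library(sqldf)  # also attaches gsubfn, which provides fn$

# Hypothetical sample data with mid-year timestamps.
data <- data.frame(
  time  = as.POSIXct(c("2011-06-15 12:00:00", "2012-06-15 12:00:00",
                       "2013-06-15 12:00:00", "2013-07-01 12:00:00"),
                     tz = "UTC"),
  value = c(10, 20, 30, 40)
)

# Hypothetical wrapper around the between/fn$ query.
# $Time, $st and $en are interpolated from the function's own variables.
between_dates <- function(st, en) {
  Time <- "strftime('%Y-%m-%d', time, 'unixepoch', 'localtime')"
  fn$sqldf("select * from data where $Time between '$st' and '$en'")
}

data_2013 <- between_dates("2013-01-01", "2013-12-31")
```

Because the values are pasted into the SQL text, this is only suitable for trusted inputs such as hard-coded dates.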
Year
In the case of a whole year, it can be simplified like this:
Year <- "strftime('%Y', time, 'unixepoch', 'localtime')"
yr <- '2013'
sql <- "select * from data where $Year = '$yr' "
fn$sqldf(sql)
We could create a list of data frames like this:
Map(function(yr) fn$sqldf(sql), as.character(2011:2013))
R/sqldf
Another possibility is to add a character column in R first:
data$Year <- format(data$time, "%Y")
yr <- '2013'
sql <- "select * from data where Year = '$yr' "
fn$sqldf(sql)
R
Note that it's not that hard to do this directly in R:
yr <- "2013"
subset(data, format(time, "%Y") == yr)
Also, to split it into a list of data frames, one per year:
split(data, format(data$time, "%Y"))
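For illustration, here is the split approach on a small made-up data frame. The sample values and the names `by_year` and `data_2013` are assumptions, not part of the original answer.

```r
# Hypothetical sample data; tz = "UTC" keeps format() independent of the
# machine's local time zone.
data <- data.frame(
  time  = as.POSIXct(c("2011-06-15 12:00:00", "2012-06-15 12:00:00",
                       "2013-06-15 12:00:00", "2013-07-01 12:00:00"),
                     tz = "UTC"),
  value = c(10, 20, 30, 40)
)

# One data frame per year, named by the four-digit year.
by_year <- split(data, format(data$time, "%Y"))

# Pull out a single year by name.
data_2013 <- by_year[["2013"]]
```

The list element names come from the formatted year, so they are character strings such as "2013" rather than numbers.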
H2
sqldf can also work with certain other databases. The problem with SQLite is that it has no date/time type, whereas the H2 database supports date/times directly as a type, which greatly simplifies the query. If sqldf sees that RH2 is loaded, it will use it rather than SQLite:
library(RH2)
library(sqldf)
yr <- 2013
sql <- "select * from data where year(time) = $yr"
fn$sqldf(sql)
Source: https://stackoverflow.com/questions/19636760/wrapping-a-function-around-multiple-sql-queries-in-r