Wrapping a function around multiple SQL queries in R?

Posted by 社会主义新天地 on 2019-12-21 21:42:49

Question


I have some SQL queries that basically parse a dataset by time (POSIXct date format):

library(sqldf)
data_2013 <- sqldf("SELECT * FROM data WHERE strftime('%Y-%m-%d', time,
'unixepoch', 'localtime') >= '2013-01-01' AND strftime('%Y-%m-%d', time,
'unixepoch', 'localtime') <= '2013-12-31'")

data_2012 <- sqldf("SELECT * FROM data WHERE strftime('%Y-%m-%d', time,
'unixepoch', 'localtime') >= '2012-01-01' AND strftime('%Y-%m-%d', time,
'unixepoch', 'localtime') <= '2012-12-31'")

data_2011 <- sqldf("SELECT * FROM data WHERE strftime('%Y-%m-%d', time,
'unixepoch', 'localtime') >= '2011-01-01' AND strftime('%Y-%m-%d', time, 
'unixepoch', 'localtime') <= '2011-12-31'")

However, this code seems very clumsy to me. Is there a neat way of wrapping this up into a function or some other way of making it shorter, while still spitting out the same 3 separate datasets?


Answer 1:


With paste0 you can achieve this:

sqlfun <- function(startdate, stopdate) {
  sqldf(paste0("SELECT * FROM data WHERE strftime('%Y-%m-%d', time,
    'unixepoch', 'localtime') >= '", startdate, "' AND strftime('%Y-%m-%d', time,
    'unixepoch', 'localtime') <= '", stopdate, "'"))
}

sqlfun('2013-01-01','2013-12-31')
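
To get all three datasets from the question in one go, the same idea can be applied per year. The following is only a sketch: the helper name year_query and the result names are made up for illustration, and it assumes a data frame called data with a POSIXct column time, as in the question. The queries are only run if sqldf is installed and such a data frame exists.

```r
# Hypothetical helper (not from the original answer): builds the query text
# for one calendar year, reusing the strftime expression from the question.
year_query <- function(year) {
  day <- "strftime('%Y-%m-%d', time, 'unixepoch', 'localtime')"
  paste0("SELECT * FROM data WHERE ", day, " >= '", year, "-01-01'",
         " AND ", day, " <= '", year, "-12-31'")
}

years <- 2011:2013
queries <- vapply(years, year_query, character(1))
names(queries) <- paste0("data_", years)

# Run the queries only when sqldf is available and `data` is a data frame
# (assumption: `data` lives in the workspace, as in the question).
if (requireNamespace("sqldf", quietly = TRUE) && is.data.frame(get0("data"))) {
  library(sqldf)
  results <- lapply(queries, function(q) sqldf(q))
  # results is a named list: results$data_2011, results$data_2012, results$data_2013
}
```

Keeping the results in a named list rather than three separate variables makes it easy to add or drop years later.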



Answer 2:


between and fn$: Use between and factor out the strftime expression into a variable, prefacing sqldf with fn$ to perform string interpolation:

Time <- "strftime('%Y-%m-%d', time, 'unixepoch', 'localtime')"
st <- '2013-01-01'
en <- '2013-12-31'
fn$sqldf("select * from data where $Time between '$st' AND '$en' ")

If desired, this could readily be made into a function, as could the remaining solutions.

Year: When selecting a whole year, it can be simplified like this:

Year <- "strftime('%Y', time, 'unixepoch', 'localtime')"
yr <- '2013'    
sql <- "select * from data where $Year = '$yr' "  
fn$sqldf(sql)

We could create a list of data frames like this:

Map(function(yr) fn$sqldf(sql), as.character(2011:2013))

R/sqldf: Another possibility is to add a character column in R first:

data$Year <- format(data$time, "%Y")
yr <- '2013'    
sql <- "select * from data where Year = '$yr' "
fn$sqldf(sql)

R: Note that it's not that hard to do this directly in R:

yr <- "2013"
subset(data, format(time, "%Y") == yr)

Also to split it into a list of data frames, one per year:

split(data, format(data$time, "%Y"))
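
As a rough illustration of how the split result maps back onto the three datasets asked for in the question, here is a small self-contained sketch. The sample data and the data_ naming are assumptions made up for the example, not part of the original answer.

```r
# Made-up sample data standing in for the question's `data` (an assumption).
data <- data.frame(
  value = 1:6,
  time = as.POSIXct(c("2011-03-01 12:00:00", "2011-09-15 12:00:00",
                      "2012-02-10 12:00:00", "2012-12-31 12:00:00",
                      "2013-01-01 12:00:00", "2013-07-04 12:00:00"),
                    tz = "UTC")
)

# One data frame per year, keyed by the four-digit year as a string.
by_year <- split(data, format(data$time, "%Y"))

# Rename the list elements to match the names used in the question.
names(by_year) <- paste0("data_", names(by_year))

# Work with a single year straight from the list...
nrow(by_year$data_2012)

# ...or, if separate variables are really wanted, copy the list elements
# into the current environment as data_2011, data_2012 and data_2013.
invisible(list2env(by_year, envir = environment()))
```

Working from the list is usually the more convenient option, since functions like lapply can then be applied to every year at once.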

H2: sqldf can also work with certain other databases. The problem with SQLite is that it has no date/time type, whereas the H2 database supports date/times directly as a type, which greatly simplifies the query. If sqldf sees that RH2 is loaded, it will use it rather than SQLite:

library(RH2)
library(sqldf) 
yr <- 2013
sql <- "select * from data where year(time) = $yr"
fn$sqldf(sql)


Source: https://stackoverflow.com/questions/19636760/wrapping-a-function-around-multiple-sql-queries-in-r
