Question
I have spent a couple of hours researching this without getting anywhere. I am trying to get a subset of data by using sqldf to query a data frame named result.
This is the structure of result:
> str(result)
'data.frame': 316125 obs. of 6 variables:
$ ID : int 1 2 3 4 5 6 7 8 9 10 ...
$ dt : Date, format: "1999-12-31" "1999-12-31" "1999-12-31" "1999-12-31" ...
$ Ticker: chr "0111145D US" "0113357D US" "0202445Q US" "0203524D US" ...
$ px : num 32.5 20.6 34.2 21.4 11 ...
$ High : num 34.9 23.5 35.4 25.9 11 ...
$ Low : num 31.19 18 28.85 20.28 9.97 ...
And also:
> head(result)
  ID         dt      Ticker      px    High     Low
1  1 1999-12-31 0111145D US 32.5000 34.9375 31.1875
2  2 1999-12-31 0113357D US 20.5625 23.5000 18.0000
3  3 1999-12-31 0202445Q US 34.1542 35.3700 28.8487
4  4 1999-12-31 0203524D US 21.4063 25.9375 20.2813
5  5 1999-12-31 0226226D US 11.0019 11.0297  9.9713
6  6 1999-12-31 0352887Q US 31.1048 32.9863 29.4584
I am able to use sqldf to do a non-date query (to my relief):
> q1 <- sqldf("SELECT * FROM result WHERE Ticker IN ('0111145D US','0113357D US')")
> head(q1)
    ID         dt      Ticker      px    High     Low
1    1 1999-12-31 0111145D US 32.5000 34.9375 31.1875
2    2 1999-12-31 0113357D US 20.5625 23.5000 18.0000
3 1741 2000-01-31 0111145D US 34.2500 36.3750 31.3125
4 1742 2000-01-31 0113357D US 18.6875 21.1875 18.3125
5 3485 2000-02-29 0111145D US 30.3750 35.6875 29.6875
6 3486 2000-02-29 0113357D US 17.0625 19.7500 16.1250
However, when I try including a date condition, things fall apart:
> result[1, "dt"] == '1999-12-31'
[1] TRUE
> sqldf(paste("SELECT * FROM result WHERE dt=", as.Date("1999-12-31"), "", sep=""))
[1] ID dt Ticker px High Low
<0 rows> (or 0-length row.names)
> sqldf("SELECT * FROM result WHERE dt='1999-12-31'")
[1] ID dt Ticker px High Low
<0 rows> (or 0-length row.names)
Can someone provide a pointer on how to include a date condition in sqldf?
Additionally, how would I write the query if dt were stored as POSIXct instead of Date?
For example:
> str(result)
'data.frame': 316125 obs. of 6 variables:
$ ID : int 1 2 3 4 5 6 7 8 9 10 ...
$ dt : POSIXct, format: "1999-12-31" "1999-12-31" "1999-12-31" "1999-12-31" ...
$ Ticker: chr "0111145D US" "0113357D US" "0202445Q US" "0203524D US" ...
$ px : num 32.5 20.6 34.2 21.4 11 ...
$ High : num 34.9 23.5 35.4 25.9 11 ...
$ Low : num 31.19 18 28.85 20.28 9.97 ...
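Answer 1:
My solution to this problem is to convert the date variable to character and query on the new character variable. With sqldf's default SQLite backend, which has no date type, a Date column arrives in the database as a number, so comparing dt with the string '1999-12-31' matches nothing: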
## create new character version of dt variable
result$chardt <- as.character(result$dt)
## query in sqldf on chardt instead
sqldf("SELECT * FROM result WHERE chardt='1999-12-31'")
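An alternative that avoids the extra column, sketched here under the assumption that sqldf is using its default SQLite backend: Date values are passed to SQLite as days since 1970-01-01 and POSIXct values as seconds since 1970-01-01, so the condition can compare dt with the matching number. The small data frames below are hypothetical stand-ins for result, and result_ct, target and target_ct are names introduced only for this illustration.

```r
library(sqldf)

# Hypothetical miniature of the `result` data frame from the question
result <- data.frame(
  ID     = 1:4,
  dt     = as.Date(c("1999-12-31", "1999-12-31", "2000-01-31", "2000-01-31")),
  Ticker = c("0111145D US", "0113357D US", "0111145D US", "0113357D US"),
  px     = c(32.5, 20.5625, 34.25, 18.6875),
  stringsAsFactors = FALSE
)

# Date case: compare against the number of days since 1970-01-01
target <- as.integer(as.Date("1999-12-31"))
q_date <- sqldf(sprintf("SELECT * FROM result WHERE dt = %d", target))

# POSIXct case (assumption: timestamps are midnight UTC):
# compare against the number of seconds since 1970-01-01
result_ct <- transform(result, dt = as.POSIXct(format(dt), tz = "UTC"))
target_ct <- as.numeric(as.POSIXct("1999-12-31", tz = "UTC"))
q_ct <- sqldf(sprintf("SELECT * FROM result_ct WHERE dt = %.0f", target_ct))
```

For POSIXct the time zone matters: the value built in R has to use the same zone as the column, otherwise the two numbers differ by the zone offset and the equality test finds no rows. A range condition (dt >= start AND dt < end) is usually safer than equality when the timestamps carry a time of day.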
Source: https://stackoverflow.com/questions/28996390/sqldf-how-to-query-based-on-a-date-condition