Question
I am using read.csv.sql from the sqldf package to read in a subset of rows, where the subset matches any of several values stored in a separate vector.
I have hacked my way to a form that works, but I would like to see the correct way to pass the SQL statement.
The code below gives a minimal example.
library(sqldf)
# some data
write.csv(mtcars, "mtcars.csv", quote = FALSE, row.names = FALSE)
# values to select from variable 'carb'
cc <- c(1, 2)
# This only selects last value from 'cc' vector
read.csv.sql("mtcars.csv", sql = paste("select * from file where carb = ", cc ))
# So try using the 'in' operator - this works
read.csv.sql("mtcars.csv", sql = "select * from file where carb in (1,2)" )
# but this doesn't
read.csv.sql("mtcars.csv", sql = paste("select * from file where carb in ", cc ))
# Finally this works
read.csv.sql("mtcars.csv", sql = paste("select * from file where carb in ",
paste("(", paste(cc, collapse=",") ,")")))
The final call above works, but is there a cleaner way to pass this statement? Thanks.
Answer 1:
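1) fn$ Substitution can be done with fn$ from the gsubfn package (which sqldf pulls in automatically). Prefixing a function call with fn$ evaluates backtick-quoted R expressions inside its string arguments before the call is made. See the fn$ examples on the sqldf home page. In this case we have: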
fn$read.csv.sql("mtcars.csv",
sql = "select * from file where carb in ( `toString(cc)` )")
2) join Another approach would be to create a data.frame of the desired carb values and perform a join with it:
Carbs <- data.frame(carb = cc)
read.csv.sql("mtcars.csv", sql = "select * from Carbs join file using (carb)")
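To see what statement approach 1 ends up sending, the same string can be built in base R. This is only a sketch: it uses sprintf instead of fn$, and sql_in_list is a hypothetical helper (not part of sqldf or gsubfn) that assumes character values should be wrapped in single quotes.

```r
cc <- c(1, 2)

# toString() collapses the vector using ", " as the separator
in_list <- toString(cc)

# Build the full statement; %s is replaced by the collapsed values
query <- sprintf("select * from file where carb in (%s)", in_list)
print(query)

# Hypothetical helper: quote character values for SQL,
# doubling any embedded single quotes; numbers pass through unquoted
sql_in_list <- function(x) {
  if (is.character(x)) x <- sprintf("'%s'", gsub("'", "''", x))
  toString(x)
}
print(sql_in_list(c("a", "o'b")))
```

The resulting query string could then be passed as the sql argument of read.csv.sql in place of the nested paste calls.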
Answer 2:
You could use deparse, but I'm not sure it's much cleaner than what you already have:
read.csv.sql("mtcars.csv",
sql = paste("select * from file where carb in ", gsub("c","",deparse(cc)) ))
Note that this is not a general solution, because deparse will not always give you the right character string. It just happens to work in this instance.
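A few cases where the deparse trick breaks, as a rough illustration. strip_c is a hypothetical wrapper around the expression used above, introduced here only to keep the examples short.

```r
# Hypothetical wrapper around the gsub/deparse expression from the answer
strip_c <- function(x) gsub("c", "", deparse(x))

# Works: a short numeric vector deparses as "c(1, 2)"
print(strip_c(c(1, 2)))

# Breaks: an integer range deparses as "1:2", with no parentheses at all
print(strip_c(1:2))

# Breaks: every "c" is removed, including the one inside "cat"
print(strip_c(c("cat", "dog")))

# Breaks: long vectors deparse into several strings, so paste()
# would produce several SQL statements instead of one
print(length(strip_c(seq(0.5, 50, by = 0.5))))
```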
Source: https://stackoverflow.com/questions/26861951/using-read-csv-sql-to-select-multiple-values-from-a-single-column