Selecting every Nth column using sqldf or read.csv.sql

守給你的承諾、 submitted on 2019-12-08 03:30:16

Question


I am rather new to using SQL statements, and am having a little trouble using them to select the desired columns from a large table and pulling them into R.

I want to take a csv file and read selected columns into R, specifically the 9th and 10th column out of every ten (columns 9, 10, 19, 20, and so on). In R, something like:

read.csv.sql("myfile.csv", sql = "select * from file [EVERY 9th and 10th COLUMN]")

My trawl of the internet suggests that selecting every nth row could be done with an SQL statement using MOD something like this (please correct me if I am wrong):

"SELECT *
        FROM   file
        WHERE  (ROWID,0) IN (SELECT ROWID, MOD(ROWNUM,9) OR MOD(ROWNUM,10)"

Is there a way to make this work for columns? Thanks in advance.


Answer 1:


read.csv

Plain read.csv is adequate for this:

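# path to the csv file (the file name used in the question)
myfile <- "myfile.csv"

# determine number of columns from the header and first data row
DF1 <- read.csv(myfile, nrows = 1)
nc <- ncol(DF1)

# create a list nc long where unwanted columns are "NULL" and wanted are NA
# (8 skipped, then 2 kept, recycled across all columns)
colClasses <- rep(rep(list("NULL", NA), c(8, 2)), length.out = nc)

# read in only the wanted columns
DF <- read.csv(myfile, colClasses = colClasses)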
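To see the colClasses pattern in action without a real data file, here is a minimal self-contained sketch. The temporary file and its V1..V20 column names are made up for the demonstration, and it uses a character vector for colClasses (the form documented for read.csv) rather than a list, so treat it as an illustration of the idea rather than the answer's exact code.

```r
# Hypothetical demo file: 2 rows, 20 columns named V1..V20 (not from the original post)
tmp <- tempfile(fileext = ".csv")
demo <- as.data.frame(matrix(1:40, nrow = 2))
write.csv(demo, tmp, row.names = FALSE)

# Read the header plus one data row, just to count the columns
nc <- ncol(read.csv(tmp, nrows = 1))

# "NULL" skips a column, NA lets read.csv guess its type;
# the pattern of 8 skips followed by 2 keeps is recycled to nc entries
colClasses <- rep(rep(c("NULL", NA), c(8, 2)), length.out = nc)

# Only columns 9, 10, 19 and 20 are read
DF <- read.csv(tmp, colClasses = colClasses)
print(names(DF))
```

If the column count is not a multiple of ten, length.out simply truncates the final repetition of the pattern, so a trailing partial block is handled without extra code.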

sqldf

To use sqldf instead, replace the last line with these:

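library(sqldf)  # also loads gsubfn, which provides fn$

nms <- names(DF1)
# comma-separated names of the wanted columns
vars <- toString(nms[is.na(colClasses)])
# fn$ substitutes the value of vars for $vars in the query string
DF <- fn$read.csv.sql(myfile, "select $vars from file")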
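A SQL select list names columns rather than addressing them by position, so the MOD idea from the question has to be applied in R, to the header names, before the query is built. A rough sketch of that step in base R follows; the V1..V25 names are hypothetical stand-ins for names(DF1), and the resulting string is only printed here, not executed.

```r
# Hypothetical header names standing in for names(DF1)
nms <- paste0("V", 1:25)

# Position of each column; within every block of 10,
# positions 9 and 10 leave remainders 9 and 0
pos <- seq_along(nms)
keep <- (pos %% 10) %in% c(9, 0)

# Build the comma-separated select list and the query text
vars <- toString(nms[keep])
sql <- sprintf("select %s from file", vars)
print(sql)
```

The resulting string could then be passed as the sql argument of read.csv.sql, assuming the header names are valid SQL identifiers.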

UPDATE: switched to read.csv.sql

UPDATE 2: correction.



Source: https://stackoverflow.com/questions/15373466/selecting-every-nth-column-in-using-sqldf-or-read-csv-sql
