non-equi-joins in R with data.table - backticked column name trouble

Submitted by 喜你入骨 on 2019-11-27 07:27:45

Question


I can't manage to do a non-equi-join with data.table when (backticked) column names include a space.

I collect such names from our database at work, and our explicit policy is for everyone to use those same names to avoid confusion. I could of course convert and reconvert, but I'd prefer to avoid that.

I wonder, is this a glitch in data.table, and if so, can it be remedied? Or am I missing something? I'm quite new to R, so the latter is entirely possible...

A reproducible example:

The following does work:

a <- data.table(`test name1` = c('A', 'A', 'A', 'B', 'B'),
                 `test_name2` = c(1,2,3,3,4))

b <- data.table(`test_name3` = c(0,1,2),
                 `test name4` = c('A', 'A', 'B'),
                 V2 = c(1,2,3),
                 V3 = c('Low', 'Medium', 'High'))

a[b, on = .(`test name1` = `test name4`, `test_name2` > `test_name3`, `test_name2` <= V2)]

The following does not:

a <- data.table(`test name1` = c('A', 'A', 'A', 'B', 'B'),
                 `test name2` = c(1,2,3,3,4))

b <- data.table(`test name3` = c(0,1,2),
                 `test name4` = c('A', 'A', 'B'),
                 V2 = c(1,2,3),
                 V3 = c('Low', 'Medium', 'High'))

a[b, on = .(`test name1` = `test name4`, `test name2` > `test name3`, `test name2` <= V2)]

The error message is:

Error in [.data.table(a, b, on = .(test name1 = test name4, test name2 > : Column(s) [test name2,test name2] not found in x

sessionInfo():

R version 3.3.2 (2016-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=Norwegian (Bokmål)_Norway.1252  LC_CTYPE=Norwegian (Bokmål)_Norway.1252    LC_MONETARY=Norwegian (Bokmål)_Norway.1252
[4] LC_NUMERIC=C                               LC_TIME=Norwegian (Bokmål)_Norway.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] data.table_1.11.4

Answer 1:


Specifying on= with strings is another option:

a[b, on = c("test name1==test name4", "test name2>test name3", "test name2<=V2")]

I think this works only if there is no whitespace around the equality/inequality operators and == is used instead of =.
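For reference, here is a self-contained sketch of the string form applied to the failing example from the question. It assumes the data.table package is installed and reuses the question's tables unchanged. The name `res` is introduced here only for illustration.

```r
library(data.table)

# Tables from the question, with spaces in the column names.
a <- data.table(`test name1` = c("A", "A", "A", "B", "B"),
                `test name2` = c(1, 2, 3, 3, 4))

b <- data.table(`test name3` = c(0, 1, 2),
                `test name4` = c("A", "A", "B"),
                V2 = c(1, 2, 3),
                V3 = c("Low", "Medium", "High"))

# Join conditions given as strings: no backticks, no whitespace around
# the operators, and == (not =) for the equality condition.
res <- a[b, on = c("test name1==test name4",
                   "test name2>test name3",
                   "test name2<=V2")]
print(res)
```

With this data each row of b matches exactly one row of a, so the result should have three rows.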

I'm not sure if there's a way to write the on= along the lines of the OP's code, though it seems like there should be.
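If the string form is not an option, the convert-and-reconvert route mentioned in the question could look roughly like the sketch below. It assumes the data.table package is installed and that replacing spaces with underscores does not collide with existing column names. The names `a2`, `b2` and `res2` are hypothetical.

```r
library(data.table)

a <- data.table(`test name1` = c("A", "A", "A", "B", "B"),
                `test name2` = c(1, 2, 3, 3, 4))

b <- data.table(`test name3` = c(0, 1, 2),
                `test name4` = c("A", "A", "B"),
                V2 = c(1, 2, 3),
                V3 = c("Low", "Medium", "High"))

# Work on copies so the originals keep the database names
# (setnames() modifies a data.table by reference).
a2 <- copy(a)
b2 <- copy(b)
setnames(a2, gsub(" ", "_", names(a2), fixed = TRUE))
setnames(b2, gsub(" ", "_", names(b2), fixed = TRUE))

# With syntactic names the unquoted on = .() form can be used.
res2 <- a2[b2, on = .(test_name1 == test_name4,
                      test_name2 > test_name3,
                      test_name2 <= V2)]

# Convert back. Note this would also alter any column whose original
# name legitimately contained an underscore.
setnames(res2, gsub("_", " ", names(res2), fixed = TRUE))
print(res2)
```

This costs two copies and two renames, which is the overhead the question hoped to avoid, so the string form is likely the lighter option.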



Source: https://stackoverflow.com/questions/51856443/non-equi-joins-in-r-with-data-table-backticked-column-name-trouble
