Question
I have a data frame as follows:
testtest <- structure(list(`104` = c(NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, "yes", NA, NA, NA, NA), `15` = c(NA,
NA, NA, NA, ">= 4.0", ">= 4.0", NA, "~ 2", "~ 2", "~ 2", "~ 2",
"~ 2", "~ 2", "< 2.2", "~2.75", NA, "~2.75", "~2.75", "~2.75",
"~2.75")), .Names = c("104", "15"), row.names = 45:64, class = "data.frame")
I know that it is not best practice to have numeric column names, but it is necessary in this circumstance. I have been manipulating my data frame by retrieving columns with backticks (`).
Unfortunately, I found something funny in the above data frame.
> table(testtest$`10`)
yes
1
>
However, there is no column named 10, so it looks like it is actually retrieving column 104:
> table(testtest$`104`)
yes
1
>
I am nervous now, and worry that this may happen again without my knowing for other columns, such as 41 and 4100.
Any explanation would be helpful! Thanks
Answer 1:
This is due to partial matching. To avoid it, use [[ to extract the columns:
testtest[["10"]]
#NULL
while the correct column name gives the output
testtest[["104"]]
#[1] NA NA NA NA NA NA NA NA NA NA NA
#[12] NA NA NA NA "yes" NA NA NA NA
According to ?"$":
Both [[ and $ select a single element of the list. The main difference is that $ does not allow computed indices, whereas [[ does. x$name is equivalent to x[["name", exact = FALSE]]. Also, the partial matching behavior of [[ can be controlled using the exact argument.
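The difference described in the documentation can be checked on a small stand-in data frame. This is only a sketch: df and df2 are hypothetical names, not the asker's data, and the second part relies on the usual partial-matching rules (an exact name wins, and an ambiguous prefix matches nothing).

```r
# Hypothetical stand-in with the same column names as in the question
df <- data.frame(`104` = c(NA, "yes"), `15` = c("~ 2", NA),
                 check.names = FALSE, stringsAsFactors = FALSE)

df$`10`                     # partial match: returns column "104"
df[["10"]]                  # exact matching by default: NULL
df[["10", exact = FALSE]]   # opts in to the same partial matching as $
df[["104"]]                 # the full name always works

# The 41 / 4100 case from the question (hypothetical data)
df2 <- data.frame(`41` = 1:2, `4100` = 3:4, check.names = FALSE)
df2$`41`                    # an exact name takes precedence over a partial match
df2$`4`                     # prefix of both columns, so nothing is matched: NULL
```

So a column whose full name you type is always the one returned; the surprise only arises when the name you type does not exist and is a unique prefix of another column name.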
In general, it is better not to have numeric column names or names that start with a number. We can prepend a non-numeric character "X" using the convenient function make.names:
names(testtest) <- make.names(names(testtest))
names(testtest)
#[1] "X104" "X15"
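Note that renaming by itself does not switch off partial matching for $. As a hedged sketch (df is a hypothetical stand-in, and the warning relies on the warnPartialMatchDollar option being honoured for data frames, as it is in reasonably recent versions of R), you can ask R to warn whenever $ falls back to a partial match:

```r
# Hypothetical stand-in with the same column names as in the question
df <- data.frame(`104` = c(NA, "yes"), `15` = c("~ 2", NA),
                 check.names = FALSE, stringsAsFactors = FALSE)
names(df) <- make.names(names(df))   # "X104" "X15"

df$X10   # still partially matches "X104"

# Turn partial matches via $ into a warning, then restore the old setting
old <- options(warnPartialMatchDollar = TRUE)
warned <- tryCatch({ df$X10; FALSE }, warning = function(w) TRUE)
options(old)
warned   # TRUE if the partial match triggered a warning
```

Setting the option at the top of a script is one way to catch accidental partial matches early; using [[ remains the stricter choice.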
Source: https://stackoverflow.com/questions/39299665/numeric-column-names-in-r