I have been using the R which function to remove rows from a data frame. I recently discovered that if the search term is NOT in the data.frame, the result is an e
Because of this:
which(LETTERS == '-1')
## integer(0)
and this:
(1:2)[integer(0)]
## integer(0)
Instead of #4, use this:
LETTERS[LETTERS != "R"]
In example 2, which() returns integer(0) (a zero-length integer vector) because no values are TRUE. A negated zero-length vector (-integer(0)) is still a zero-length vector. So you're essentially asking for no elements of LETTERS, which is why nothing comes back.
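A short sketch of the behaviour described above, applied to rows of a data frame. The data frame df and its columns are made up for illustration and are not from the question:

```r
# Hypothetical data frame for illustration only
df <- data.frame(id = 1:3, name = c("a", "b", "c"))

idx <- which(df$name == "z")   # nothing matches
idx                            # integer(0)
identical(-idx, idx)           # TRUE: negating an empty index changes nothing

# Negative indexing with an empty index selects zero rows
nrow(df[-idx, ])               # 0

# Logical negation keeps every row when nothing matches
nrow(df[df$name != "z", ])     # 3
```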
That is a well-known pitfall. When nothing matches the logical test, which() returns integer(0). Negating that still gives integer(0), so "[" returns nothing instead of returning everything, which is what you would expect. You can use:
LETTERS[ ! LETTERS == "1" ]
LETTERS[ ! LETTERS %in% "1" ]
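If you would rather keep which(), one possible guard is to check the length of the index before negating it. This is only a sketch: drop_matches is a hypothetical helper name, not a base R function.

```r
# Hypothetical helper: drop elements equal to `value`, keeping everything
# when there is no match
drop_matches <- function(x, value) {
  idx <- which(x == value)
  if (length(idx) == 0L) x else x[-idx]
}

length(drop_matches(LETTERS, "1"))   # 26: nothing matched, nothing dropped
length(drop_matches(LETTERS, "R"))   # 25: "R" removed
```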
There is another gotcha to be aware of, and it is the one that makes me choose to use which(). When using logical indexing, an NA value used inside "[" will return a row (filled with NAs). I generally do not want that, so I use DFRM[ which(logical), ] although this seems to bother some people who say it is not needed. I just think they are working with small datasets and infrequently encounter the annoyance of seeing tens of thousands of NA-induced useless lines of output on their console. I never use the negated which version though.
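To make the NA gotcha concrete, here is a small sketch. The contents of DFRM are invented for illustration:

```r
# Hypothetical data frame with a missing value in the tested column
DFRM <- data.frame(x = c(1, NA, 3), y = c("a", "b", "c"))

keep <- DFRM$x > 2             # FALSE NA TRUE

# Plain logical indexing: the NA produces an extra all-NA row
nrow(DFRM[keep, ])             # 2

# which() silently drops the NA, so only the real match is returned
nrow(DFRM[which(keep), ])      # 1
```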