Is it okay to use floating-point numbers as indices or when creating factors in R?
I don't mean numbers with decimal parts; that would clearly be odd, but instead n
It's always better to use integer representation when you can, for instance with (1L:3L)*3L or seq(3L,9L,by=3L).
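To see which representation you actually have, typeof() is a quick check. This is only a small sketch; the names x_int and x_dbl are mine, not from the question.

```r
# The L suffix marks an integer literal; a bare 3 is stored as a double.
x_int <- (1L:3L) * 3L   # integer * integer stays integer
x_dbl <- (1:3) * 3      # 1:3 is integer, but multiplying by the double 3 promotes it

typeof(x_int)  # "integer"
typeof(x_dbl)  # "double"

# as.integer() converts back once you know the values are whole numbers
typeof(as.integer(x_dbl))  # "integer"
```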
I can come up with an example where floating-point representation gives an unexpected answer, but it depends on actually doing floating-point arithmetic (that is, on the decimal part of a number). I don't know whether storing a whole number directly as a double and then multiplying it, as in the two examples in the original post, could ever cause a problem.
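On that open point, as far as I know whole numbers stored as IEEE 754 doubles are exact as long as their magnitude stays below 2^53, and so are their products within that range. A minimal sketch, assuming the usual 64-bit double (idx is just an illustrative name):

```r
# Small whole numbers and their products are represented exactly as doubles.
idx <- (1:3) * 3
idx == c(3L, 6L, 9L)                              # TRUE TRUE TRUE
identical(LETTERS[idx], LETTERS[c(3L, 6L, 9L)])   # TRUE

# Past 2^53 neighbouring whole numbers collapse onto the same double.
2^53 + 1 == 2^53                                  # TRUE
```

So for index-sized values the multiplication itself should be harmless; the trouble starts once fractional arithmetic is involved.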
Here's my somewhat forced example to show that floating points can give funny answers. I make two 3's that are different in floating point representation; the first element isn't quite exactly equal to three (on my system with R 2.13.0, anyway).
> (a <- c((0.3*3+0.1)*3,3L))
[1] 3 3
> a[1] == a[2]
[1] FALSE
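If you want to see the difference that the default printing hides, something like the following should work. It is a sketch rather than the one right way to do it, and the exact digits may depend on your platform.

```r
a <- c((0.3 * 3 + 0.1) * 3, 3L)

# Default printing rounds to 7 significant digits, so both look like 3.
print(a)

# 17 significant digits are enough to tell any two doubles apart.
sprintf("%.17g", a)

# The first element sits just below 3.
a[1] < a[2]

# all.equal() compares within a tolerance instead of bit for bit.
isTRUE(all.equal(a[1], a[2]))
```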
Creating a factor directly works as expected because factor calls as.character on the values, which gives the same result for both.
> as.character(a)
[1] "3" "3"
> factor(a, levels=1:3, labels=LETTERS[1:3])
[1] C C
Levels: A B C
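Roughly speaking, the factor codes come from matching those character strings against the levels. The snippet below is a simplified picture of that step, not the actual body of factor(); codes is a name I made up.

```r
a <- c((0.3 * 3 + 0.1) * 3, 3L)

# Both values become the string "3", so both match the third level.
codes <- match(as.character(a), as.character(1:3))
codes                  # 3 3
LETTERS[1:3][codes]    # "C" "C"
```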
But using a as an index doesn't work as expected: when the values are coerced to integer they are truncated, so they become 2 and 3.
> trunc(a)
[1] 2 3
> LETTERS[a]
[1] "B" "C"
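One way around this, assuming the values are meant to be whole numbers, is to round before indexing. A minimal sketch (idx is an illustrative name):

```r
a <- c((0.3 * 3 + 0.1) * 3, 3L)

# round() goes to the nearest whole number; as.integer() then stores it as an integer.
idx <- as.integer(round(a))
idx            # 3 3
LETTERS[idx]   # "C" "C"
```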
Constructs such as 1:3 are really integers:
> class(1:3)
[1] "integer"
Using a float as an index apparently entails some truncation:
> foo <- 1:3
> foo
[1] 1 2 3
> foo[1.0]
[1] 1
> foo[1.5]
[1] 1
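A few more cases suggest the truncation is toward zero, as with as.integer. The helper is_whole below is hypothetical (my own name, with an arbitrary tolerance), meant only as a sketch of how you might guard against fractional indices.

```r
foo <- 1:3

foo[2.9]     # 2, same as foo[2]
foo[0.9]     # integer(0), because 0.9 truncates to index 0
foo[-1.9]    # 2 3, same as foo[-1]

# Hypothetical guard: TRUE where x is a whole number up to a small tolerance.
is_whole <- function(x, tol = sqrt(.Machine$double.eps)) {
  abs(x - round(x)) < tol
}

is_whole(c(1, 1.5, 3))   # TRUE FALSE TRUE
```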