Question
R seems to support an efficient NA value in floating point arrays. How does it represent it internally?
My (perhaps flawed) understanding is that modern CPUs can carry out floating point calculations in hardware, including efficient handling of Inf, -Inf and NaN values. How does NA fit into this, and how is it implemented without compromising performance?
Answer 1:
R uses a NaN value, as defined for IEEE floats, to represent NA_real_. Inf and -Inf are the ordinary IEEE infinities, which share the same all-ones exponent. We can use a simple C++ function to make this explicit:
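Rcpp::cppFunction('void print_hex(double x) {
  // Copy the raw bytes of the double into an integer of the same size
  uint64_t y;
  static_assert(sizeof x == sizeof y, "Size does not match!");
  std::memcpy(&y, &x, sizeof y);
  // Print as hex, then switch the stream back to decimal
  Rcpp::Rcout << std::hex << y << std::dec << std::endl;
}', plugins = "cpp11", includes = c("#include <cstdint>", "#include <cstring>"))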
print_hex(NA_real_)
#> 7ff80000000007a2
print_hex(Inf)
#> 7ff0000000000000
print_hex(-Inf)
#> fff0000000000000
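The exponent (bits 2 through 12, counting from the most significant bit) is all ones in every case. In IEEE 754 this marks a value as either an infinity or a NaN. For Inf and -Inf the mantissa is all zero, whereas a NaN has a nonzero mantissa, and that is the case for NA_real_. Its low 32 bits are 0x7a2 (1954 in decimal), which is the payload R checks to tell NA apart from an ordinary NaN. Since NA_real_ is a NaN as far as the hardware is concerned, arithmetic on it runs at the usual floating point speed. Here are some source code references.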
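To make the bit layout concrete outside of R, here is a minimal standalone C++ sketch. The helper names (to_bits, from_bits, exponent_bits, mantissa_bits, looks_like_r_na, self_check) are made up for illustration. The NA test is an assumption modelled on the check in R's sources (a NaN whose low 32 bits equal 1954), not a copy of R's code, and it assumes IEEE 754 binary64 doubles.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Reinterpret the bytes of a double as a 64-bit unsigned integer.
std::uint64_t to_bits(double x) {
    std::uint64_t y;
    static_assert(sizeof x == sizeof y, "Size does not match!");
    std::memcpy(&y, &x, sizeof y);
    return y;
}

// Inverse of to_bits: build a double from a raw 64-bit pattern.
double from_bits(std::uint64_t y) {
    double x;
    std::memcpy(&x, &y, sizeof x);
    return x;
}

// IEEE 754 binary64 layout: 1 sign bit, 11 exponent bits, 52 mantissa bits.
std::uint64_t exponent_bits(double x) { return (to_bits(x) >> 52) & 0x7ffu; }
std::uint64_t mantissa_bits(double x) { return to_bits(x) & 0x000fffffffffffffULL; }

// Assumption: an R-style NA is a NaN whose low 32 bits hold 1954 (0x7a2).
bool looks_like_r_na(double x) {
    return std::isnan(x) && (to_bits(x) & 0xffffffffULL) == 1954;
}

// Hypothetical self-check of the claims made in the text above.
void self_check() {
    const double inf = std::numeric_limits<double>::infinity();
    const double na  = from_bits(0x7ff80000000007a2ULL); // pattern printed for NA_real_

    // Infinity: exponent all ones, mantissa zero, so it is not a NaN.
    assert(exponent_bits(inf) == 0x7ff && mantissa_bits(inf) == 0);
    assert(!std::isnan(inf));

    // NA pattern: exponent all ones, mantissa nonzero, so it is a NaN.
    assert(exponent_bits(na) == 0x7ff && mantissa_bits(na) != 0);
    assert(std::isnan(na) && looks_like_r_na(na));

    // A plain quiet NaN does not carry the 1954 payload.
    assert(!looks_like_r_na(std::numeric_limits<double>::quiet_NaN()));
}
```

Calling self_check() from a main function should pass silently on a platform with IEEE 754 doubles. The point of the design is that telling NA from NaN only costs an integer comparison on the low word, and only after the value has already been found to be a NaN.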
Source: https://stackoverflow.com/questions/51684861/how-does-r-represent-na-internally