I was trying to develop a program in R to estimate a Spearman correlation with Rcpp. I did it, but it only works with matrices of fewer than roughly 45 000 to 50 000 vectors.
To repeat more succinctly:
You can have more than 2^31-1 elements in a vector.
Matrices are vectors with dim attributes.
You can have more than 2^31-1 elements in a matrix (i.e. n times k).
Your row and column indices are each still limited to 2^31 - 1.
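To make the difference between the per-dimension limit and the total length concrete, here is a minimal sketch in plain C++ (no Rcpp, so it compiles standalone). It assumes R's column-major matrix layout, and uses std::int64_t as a stand-in for a 64-bit R_xlen_t. The names total_elements and column_major_offset are hypothetical helpers, not part of R or Rcpp.

```cpp
#include <cstdint>

// Total number of elements in an nrow x ncol matrix.
// Each dimension fits in a 32-bit int, but the product may not,
// so the multiplication is carried out in 64 bits.
std::int64_t total_elements(std::int32_t nrow, std::int32_t ncol) {
    return static_cast<std::int64_t>(nrow) * static_cast<std::int64_t>(ncol);
}

// Zero-based offset of element (i, j) in the underlying vector,
// assuming column-major storage (the layout R uses for matrices).
std::int64_t column_major_offset(std::int32_t i, std::int32_t j,
                                 std::int32_t nrow) {
    return static_cast<std::int64_t>(i) +
           static_cast<std::int64_t>(j) * static_cast<std::int64_t>(nrow);
}
```

For a 50000 x 50000 matrix, total_elements(50000, 50000) is 2,500,000,000, which is larger than 2^31 - 1 even though both dimensions are far below it. That is why the total length and the linear offset need a wider type than the row and column counts do.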
Example of a big vector:
R> n <- .Machine$integer.max + 100
R> tmpVec <- 1:n
R> length(tmpVec)
[1] 2147483747
R> newVec <- sqrt(tmpVec)
R>
Before we get started, I'm assuming:
R > 3.0.0, since long vector support was added in R 3.0.0.
Rcpp > 0.12.0, which replaces int and size_t with R_xlen_t and R_xlength. See release post for more details...
NumericMatrix
I think you may be running into a memory allocation issue, as the following works on my 32 GB machine:
Rcpp::cppFunction("NumericMatrix make_matrix(){
NumericMatrix m(50000, 50000);
return m;
}")
m = make_matrix()
object.size(m)
## 20000000200 bytes # about 20.0000002 gb
Running:
# Creates an 18.6gb matrix!!!
m = matrix(0, ncol = 50000, nrow = 50000)
Rcpp::cppFunction("void get_length(NumericMatrix m){
Rcout << m.nrow() << ' ' << m.ncol();
}")
get_length(m)
## 50000 50000
object.size(m)
## 20000000200 bytes # about 20.0000002 gb
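The two size figures above (18.6gb in the comment, about 20 gb from object.size) describe the same matrix in different units. The sketch below, in plain C++ with hypothetical helper names, assumes 8 bytes per double and ignores the extra 200 bytes that object.size reports on top of the raw data.

```cpp
#include <cstdint>

// Assumed size of one R numeric (a C double) in bytes.
constexpr std::uint64_t kBytesPerDouble = 8;

// Bytes needed for the data of an nrow x ncol matrix of doubles,
// computed in 64 bits so that the product cannot overflow here.
std::uint64_t matrix_data_bytes(std::uint64_t nrow, std::uint64_t ncol) {
    return nrow * ncol * kBytesPerDouble;
}

// Convert bytes to decimal gigabytes (10^9 bytes).
double to_gb(std::uint64_t bytes) {
    return static_cast<double>(bytes) / 1e9;
}

// Convert bytes to binary gibibytes (2^30 bytes).
double to_gib(std::uint64_t bytes) {
    return static_cast<double>(bytes) / 1073741824.0;
}
```

With nrow = ncol = 50000 the data take 20,000,000,000 bytes, which is 20 in decimal gigabytes and roughly 18.6 in units of 2^30 bytes. Either way, the allocation has to fit in available memory in one piece.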
In theory, you are bounded by the total number of elements in the matrix being less than (2^31 - 1)^2 = 4,611,686,014,132,420,609 per:
Arrays (including matrices) can be based on long vectors provided each of their dimensions is at most 2^31 - 1: thus there are no 1-dimensional long arrays.
See the R help page on long vectors (?"Long vectors").
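The bound quoted above can be checked with a few lines of arithmetic. This is a small sketch in plain C++ (the constant and function names are hypothetical); it only confirms the number and shows that it still fits in an unsigned 64-bit integer.

```cpp
#include <cstdint>

// Largest value allowed for a single matrix dimension: 2^31 - 1.
constexpr std::uint64_t kMaxDim = (std::uint64_t{1} << 31) - 1;

// Theoretical upper bound on the number of matrix elements when
// both dimensions are at their maximum: (2^31 - 1)^2.
constexpr std::uint64_t max_matrix_elements() {
    return kMaxDim * kMaxDim;
}
```

Because (2^31 - 1)^2 is just under 2^62, the product does not overflow a 64-bit unsigned type, whereas it would be far out of range for a 32-bit int.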
Now, fitting into a matrix:
m = matrix(nrow=2^31, ncol=1)
Error in matrix(nrow = 2^31, ncol = 1) : invalid 'nrow' value (too large or NA)
In addition: Warning message: In matrix(nrow = 2^31, ncol = 1) :
NAs introduced by coercion to integer range
The limit both R and Rcpp adhere to regarding the column/row is:
.Machine$integer.max
## 2147483647
Note that 2^31 exceeds this limit by exactly 1:
2^31 = 2,147,483,648 > 2,147,483,647 = .Machine$integer.max
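The error above comes down to a range check on each dimension. Here is a hedged sketch of such a check in plain C++; it is not R's actual implementation, and is_valid_dim is a hypothetical helper used only for illustration.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Returns true if x could be used as a row or column count, i.e. it is
// a finite, non-negative value no larger than 2^31 - 1
// (the value R reports as .Machine$integer.max).
bool is_valid_dim(double x) {
    const double max_int =
        static_cast<double>(std::numeric_limits<std::int32_t>::max());
    return std::isfinite(x) && x >= 0.0 && x <= max_int;
}
```

Under this check, 2147483647 passes and 2147483648 (that is, 2^31) fails, matching the behaviour of matrix(nrow = 2^31, ncol = 1) shown earlier.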
However, the limit for a plain atomic vector is given as 2^52 elements (even though one might expect something in the ballpark of 2^64 - 1). The following example shows that a vector of 2^32 elements is reachable by concatenating two vectors of 2^31 elements each:
v = numeric(2^31)
length(v)
## [1] 2147483648
object.size(v)
## 17179869224 bytes # about 17.179869224 gb
v2 = c(v,v)
length(v2)
## [1] 4294967296
object.size(v2)
## 34359738408 bytes # about 34.359738408 gb