Computing the null space of a bigmatrix in R

会有一股神秘感。 提交于 2020-01-13 07:48:07

问题


I can not find any function or package to calculate the null space or (QR decomposition) of a bigmatrix (from library(bigmemory)) in R. For example:

library(bigmemory)

a <- big.matrix(1000000, 1000, type='double', init=0)

I tried the following but got the errors shown. How can I find the null space of a bigmemory object?

a.qr <- Matrix::qr(a)
# Error in as.vector(data) : 
#   no method for coercing this S4 class to a vector
q.null <- MASS::Null(a)
# Error in as.vector(data) : 
#   no method for coercing this S4 class to a vector

回答1:


If you want to compute the full SVD of the matrix, you can use package bigstatsr to perform computations by block. A FBM stands for a Filebacked Big Matrix and is an object similar to a filebacked big.matrix object of package bigmemory.

library(bigstatsr)
options(bigstatsr.block.sizeGB = 0.5)

# Initialize FBM with random numbers
a <- FBM(1e6, 1e3)
big_apply(a, a.FUN = function(X, ind) {
  X[, ind] <- rnorm(nrow(X) * length(ind))
  NULL
}, a.combine = 'c')

# Compute t(a) * a
K <- big_crossprodSelf(a, big_scale(center = FALSE, scale = FALSE))

# Get v and d where a = u * d * t(v) the SVD of a
eig <- eigen(K[])
v <- eig$vectors
d <- sqrt(eig$values)

# Get u if you need it. It will be of the same size of u
# so that I store it as a FBM.
u <- FBM(nrow(a), ncol(a))
big_apply(u, a.FUN = function(X, ind, a, v, d) {
  X[ind, ] <- sweep(a[ind, ] %*% v, 2, d, "/")
  NULL
}, a.combine = 'c', block.size = 50e3, ind = rows_along(u),
a = a, v = v, d = d)

# Verification
ind <- sample(nrow(a), 1000)
all.equal(a[ind, ], tcrossprod(sweep(u[ind, ], 2, d, "*"), v))

This takes approximately 10 minutes on my computer.




回答2:


@Mahon @user20650 @F.Privė For clarity I pinged the bigmemory team and asked

Essentially, is there an implementation of the QR function (QR Decomposition) that works with big memory matrixes?

I felt it useful to get clarity on the original question asked. @F.Privė - nice answer. Hopefully your answer, and their response will help guide people in the future. Their response below:

Thanks for the note. There is not currently an implementation of the qr decomposition. Ideally, you would implement this using Householder reflections (if the matrix is dense) or Givens rotations (if it is sparse).

The irlba package is compatible with bigmemory. It provides a truncated singular value decomposition. So, if your matrix is relatively sparse, you could truncate at the rank of the matrix. This is probably your best option. If you don't know the rank then you can use the package to update the truncation iteratively.

Please note that if your matrix is (tall and skinny or short and fat) then the SO solution is OK. However, anytime you resort to calculating the cross-product you lose some numerical stability. This can be an issue if you are planning on inverting the matrix.



来源:https://stackoverflow.com/questions/46253537/computing-the-null-space-of-a-bigmatrix-in-r

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!