There is no LU factorization function in R core. Although such factorization is a step of solve
, it is not made explicitly available as a stand-alone function. Can we write an R function for this? It needs mimic LAPACK routine dgetrf
. Matrix
package has an lu
function which is good, but it would be better if we could write a trackable R function, that can
- factorize the matrix till a certain column / row and return the intermediate result;
- continue the factorization from an intermediate result to another column / row or to the end.
This function would be useful for both educational and debugging purpose. The benefit for education is evident as we can illustrate the factorization / Gaussian elimination column by column. For debugging use, here are two examples.
In Inconsistent results between LU decomposition in R and Python, it is asked why LU factorization in R and Python gives different result. We can clearly see that both software return identical 1st pivot and 2nd pivot, but not the 3rd. So there must be something interesting when the factorization proceeds to the 3rd row / column. It would be good if we could retrieve that temporary result for an investigation.
In Can I stably invert a Vandermonde matrix with many small values in R? LU factorization is unstable for this type of matrix. In my answer, a 3 x 3 matrix is given for an example. I would expect solve
to produce an error complaining U[3, 3] = 0
, but running solve
for a few times I find that solve
is sometimes successful. So for a numerical investigation, I would like to know what happens when the factorization proceeds to the 2nd column / row.
Since the function is to be written in pure R code, it is expected to be slow for a moderate to big matrix. But performance is not an issue, as for education and debugging we only use a small matrix.
A little introduction to dgetrf
LAPACK's dgetrf computes LU factorization with row pivoting: A = PLU
. On exit of the factorization,
L
is a unit lower triangular matrix, stored in the lower triangular part ofA
;U
is an upper triangular matrix, stored in the upper triangular part ofA
;P
is a row permutation matrix stored as a separate permutation index vector.
Unless a pivot is exactly zero (not up to some tolerance), factorization should proceed.
What I start with
It is not challenging to write a LU factorization with neither row pivoting nor "pause / continue" option:
LU <- function (A) {
## check dimension
n <- dim(A)
if (n[1] != n[2]) stop("'A' must be a square matrix")
n <- n[1]
## Gaussian elimination
for (j in 1:(n - 1)) {
ind <- (j + 1):n
## check if the pivot is EXACTLY 0
piv <- A[j, j]
if (piv == 0) stop(sprintf("system is exactly singular: U[%d, %d] = 0", j, j))
l <- A[ind, j] / piv
## update `L` factor
A[ind, j] <- l
## update `U` factor by Gaussian elimination
A[ind, ind] <- A[ind, ind] - tcrossprod(l, A[j, ind])
}
A
}
This is shown to give correct result when pivoting is not required:
A <- structure(c(0.923065107548609, 0.922819485189393, 0.277002309216186,
0.532856695353985, 0.481061384081841, 0.0952619954477996,
0.261916425777599, 0.433514681644738, 0.677919807843864,
0.771985625848174, 0.705952850636095, 0.873727774480358,
0.28782021952793, 0.863347264472395, 0.627262107795104,
0.187472499441355), .Dim = c(4L, 4L))
oo <- LU(A)
oo
# [,1] [,2] [,3] [,4]
#[1,] 0.9230651 0.4810614 0.67791981 0.2878202
#[2,] 0.9997339 -0.3856714 0.09424621 0.5756036
#[3,] 0.3000897 -0.3048058 0.53124291 0.7163376
#[4,] 0.5772688 -0.4040044 0.97970570 -0.4479307
L <- diag(4)
low <- lower.tri(L)
L[low] <- oo[low]
L
# [,1] [,2] [,3] [,4]
#[1,] 1.0000000 0.0000000 0.0000000 0
#[2,] 0.9997339 1.0000000 0.0000000 0
#[3,] 0.3000897 -0.3048058 1.0000000 0
#[4,] 0.5772688 -0.4040044 0.9797057 1
U <- oo
U[low] <- 0
U
# [,1] [,2] [,3] [,4]
#[1,] 0.9230651 0.4810614 0.67791981 0.2878202
#[2,] 0.0000000 -0.3856714 0.09424621 0.5756036
#[3,] 0.0000000 0.0000000 0.53124291 0.7163376
#[4,] 0.0000000 0.0000000 0.00000000 -0.4479307
Comparison with lu
from Matrix
package:
library(Matrix)
rr <- expand(lu(A))
rr
#$L
#4 x 4 Matrix of class "dtrMatrix" (unitriangular)
# [,1] [,2] [,3] [,4]
#[1,] 1.0000000 . . .
#[2,] 0.9997339 1.0000000 . .
#[3,] 0.3000897 -0.3048058 1.0000000 .
#[4,] 0.5772688 -0.4040044 0.9797057 1.0000000
#
#$U
#4 x 4 Matrix of class "dtrMatrix"
# [,1] [,2] [,3] [,4]
#[1,] 0.92306511 0.48106138 0.67791981 0.28782022
#[2,] . -0.38567138 0.09424621 0.57560363
#[3,] . . 0.53124291 0.71633755
#[4,] . . . -0.44793070
#
#$P
#4 x 4 sparse Matrix of class "pMatrix"
#
#[1,] | . . .
#[2,] . | . .
#[3,] . . | .
#[4,] . . . |
Now consider a permuted A
:
B <- A[c(4, 3, 1, 2), ]
LU(B)
# [,1] [,2] [,3] [,4]
#[1,] 0.5328567 0.43351468 0.8737278 0.1874725
#[2,] 0.5198439 0.03655646 0.2517508 0.5298057
#[3,] 1.7322952 -7.38348421 1.0231633 3.8748743
#[4,] 1.7318343 -17.93154011 3.6876940 -4.2504433
The result is different from LU(A)
. However, since Matrix::lu
performs row pivoting, the result of lu(B)
only differs from lu(A)
in the permutation matrix:
expand(lu(B))$P
#4 x 4 sparse Matrix of class "pMatrix"
#
#[1,] . . . |
#[2,] . . | .
#[3,] | . . .
#[4,] . | . .
Let's add those features one by one.
with row pivoting
This is not too difficult.
Suppose A
is n x n
. Initialize a permutation index vector pivot <- 1:n
. At the j
-th column we scan A[j:n, j]
for the maximum absolute value. Suppose it is A[m, j]
. If m > j
we do a row exchange A[m, ] <-> A[j, ]
. Meanwhile we do a permutation pivot[j] <-> pivot[m]
. After pivoting, the elimination is as same as that for a factorization without pivoting, so we could reuse the code of function LU
.
LUP <- function (A) {
## check dimension
n <- dim(A)
if (n[1] != n[2]) stop("'A' must be a square matrix")
n <- n[1]
## LU factorization from the beginning to the end
from <- 1
to <- (n - 1)
pivot <- 1:n
## Gaussian elimination
for (j in from:to) {
## select pivot
m <- which.max(abs(A[j:n, j]))
## A[j - 1 + m, j] is the pivot
if (m > 1L) {
## row exchange
tmp <- A[j, ]; A[j, ] <- A[j - 1 + m, ]; A[j - 1 + m, ] <- tmp
tmp <- pivot[j]; pivot[j] <- pivot[j - 1 + m]; pivot[j - 1 + m] <- tmp
}
ind <- (j + 1):n
## check if the pivot is EXACTLY 0
piv <- A[j, j]
if (piv == 0) {
stop(sprintf("system is exactly singular: U[%d, %d] = 0", j, j))
}
l <- A[ind, j] / piv
## update `L` factor
A[ind, j] <- l
## update `U` factor by Gaussian elimination
A[ind, ind] <- A[ind, ind] - tcrossprod(l, A[j, ind])
}
## add `pivot` as an attribute and return `A`
structure(A, pivot = pivot)
}
Trying matrix B
in the question, LUP(B)
is as same as LU(A)
with an additional permutation index vector.
oo <- LUP(B)
# [,1] [,2] [,3] [,4]
#[1,] 0.9230651 0.4810614 0.67791981 0.2878202
#[2,] 0.9997339 -0.3856714 0.09424621 0.5756036
#[3,] 0.3000897 -0.3048058 0.53124291 0.7163376
#[4,] 0.5772688 -0.4040044 0.97970570 -0.4479307
#attr(,"pivot")
#[1] 3 4 2 1
Here is a utility function to extract L
, U
, P
:
exLUP <- function (LUPftr) {
L <- diag(1, nrow(LUPftr), ncol(LUPftr))
low <- lower.tri(L)
L[low] <- LUPftr[low]
U <- LUPftr[1:nrow(LUPftr), ] ## use "[" to drop attributes
U[low] <- 0
list(L = L, U = U, P = attr(LUPftr, "pivot"))
}
rr <- exLUP(oo)
#$L
# [,1] [,2] [,3] [,4]
#[1,] 1.0000000 0.0000000 0.0000000 0
#[2,] 0.9997339 1.0000000 0.0000000 0
#[3,] 0.3000897 -0.3048058 1.0000000 0
#[4,] 0.5772688 -0.4040044 0.9797057 1
#
#$U
# [,1] [,2] [,3] [,4]
#[1,] 0.9230651 0.4810614 0.67791981 0.2878202
#[2,] 0.0000000 -0.3856714 0.09424621 0.5756036
#[3,] 0.0000000 0.0000000 0.53124291 0.7163376
#[4,] 0.0000000 0.0000000 0.00000000 -0.4479307
#
#$P
#[1] 3 4 2 1
Note that the permutation index returned is really for PA = LU
(probably the most used in textbooks):
all.equal( B[rr$P, ], with(rr, L %*% U) )
#[1] TRUE
To get the permutation index as returned by LAPACK, i.e., the one in A = PLU
, do order(rr$P)
.
all.equal( B, with(rr, (L %*% U)[order(P), ]) )
#[1] TRUE
"pause / continue" option
Adding "pause / continue" feature is a bit tricky, as we need some way to record where an incomplete factorization stops so that we can pick it up from there later.
Suppose we are to enhance function LUP
to a new one LUP2
. Consider adding an argument to
. The factorization will stop when it has done with A[to, to]
and is going to work with A[to + 1, to + 1]
. We can store this to
, as well as the temporary pivot
vector, as attributes to A
and return. Later when we pass this temporary result back to LUP2
, it need first check whether these attributes exist. If so it knows where it should start; otherwise it just starts right from the beginning.
LUP2 <- function (A, to = NULL) {
## check dimension
n <- dim(A)
if (n[1] != n[2]) stop("'A' must be a square matrix")
n <- n[1]
## ensure that "to" has a valid value
## if it is not provided, set it to (n - 1) so that we complete factorization of `A`
## if provided, it can not be larger than (n - 1); otherwise it is reset to (n - 1)
if (is.null(to)) to <- n - 1L
else if (to > n - 1L) {
warning(sprintf("provided 'to' too big; reset to maximum possible value: %d", n - 1L))
to <- n - 1L
}
## is `A` an intermediate result of a previous, unfinished LU factorization?
## if YES, it should have a "to" attribute, telling where the previous factorization stopped
## if NO, a new factorization starting from `A[1, 1]` is performed
from <- attr(A, "to")
if (!is.null(from)) {
## so we continue factorization, but need to make sure there is work to do
from <- from + 1L
if (from >= n) {
warning("LU factorization of is already completed; return input as it is")
return(A)
}
if (from > to) {
stop(sprintf("please provide a bigger 'to' between %d and %d", from, n - 1L))
}
## extract "pivot"
pivot <- attr(A, "pivot")
} else {
## we start a new factorization
from <- 1
pivot <- 1:n
}
## LU factorization from `A[from, from]` to `A[to, to]`
## the following code reuses function `LUP`'s code
for (j in from:to) {
## select pivot
m <- which.max(abs(A[j:n, j]))
## A[j - 1 + m, j] is the pivot
if (m > 1L) {
## row exchange
tmp <- A[j, ]; A[j, ] <- A[j - 1 + m, ]; A[j - 1 + m, ] <- tmp
tmp <- pivot[j]; pivot[j] <- pivot[j - 1 + m]; pivot[j - 1 + m] <- tmp
}
ind <- (j + 1):n
## check if the pivot is EXACTLY 0
piv <- A[j, j]
if (piv == 0) {
stop(sprintf("system is exactly singular: U[%d, %d] = 0", j, j))
}
l <- A[ind, j] / piv
## update `L` factor
A[ind, j] <- l
## update `U` factor by Gaussian elimination
A[ind, ind] <- A[ind, ind] - tcrossprod(l, A[j, ind])
}
## update attributes of `A` and return `A`
structure(A, to = to, pivot = pivot)
}
Try with matrix B
in the question. Let's say we want to stop the factorization after it has processed 2 columns / rows.
oo <- LUP2(B, 2)
# [,1] [,2] [,3] [,4]
#[1,] 0.9230651 0.4810614 0.67791981 0.2878202
#[2,] 0.9997339 -0.3856714 0.09424621 0.5756036
#[3,] 0.5772688 -0.4040044 0.52046170 0.2538693
#[4,] 0.3000897 -0.3048058 0.53124291 0.7163376
#attr(,"to")
#[1] 2
#attr(,"pivot")
#[1] 3 4 1 2
Since factorization is not complete, the U
factor is not an upper triangular. Here is a helper function to extract it.
## usable for all functions: `LU`, `LUP` and `LUP2`
## for `LUP2` the attribute "to" is used;
## for other two we can simply zero the lower triangular of `A`
getU <- function (A) {
attr(A, "pivot") <- NULL
to <- attr(A, "to")
if (is.null(to)) {
A[lower.tri(A)] <- 0
} else {
n <- nrow(A)
len <- (n - 1):(n - to)
zero_ind <- sequence(len)
offset <- seq.int(1L, by = n + 1L, length = to)
zero_ind <- zero_ind + rep.int(offset, len)
A[zero_ind] <- 0
}
A
}
getU(oo)
# [,1] [,2] [,3] [,4]
#[1,] 0.9230651 0.4810614 0.67791981 0.2878202
#[2,] 0.0000000 -0.3856714 0.09424621 0.5756036
#[3,] 0.0000000 0.0000000 0.52046170 0.2538693
#[4,] 0.0000000 0.0000000 0.53124291 0.7163376
#attr(,"to")
#[1] 2
Now we can continue factorization:
LUP2(oo, 1)
#Error in LUP2(oo, 1) : please provide a bigger 'to' between 3 and 3
Oops, we have wrongly passed an infeasible value to = 1
to LUP2
, because the temporary result has already processed 2 columns / rows and it can not undo it. The function tells us that we can only move forward and set to
to any integers between 3 and 3. If we pass in a value larger than 3, a warning will be generated and to
is reset to the maximum possible value.
oo <- LUP2(oo, 10)
#Warning message:
#In LUP2(oo, 10) :
# provided 'to' too big; reset to maximum possible value: 3
And we have the U
factor
getU(oo)
# [,1] [,2] [,3] [,4]
#[1,] 0.9230651 0.4810614 0.67791981 0.2878202
#[2,] 0.0000000 -0.3856714 0.09424621 0.5756036
#[3,] 0.0000000 0.0000000 0.53124291 0.7163376
#[4,] 0.0000000 0.0000000 0.00000000 -0.4479307
#attr(,"to")
#[1] 3
The oo
is now a complete factorization result. What if we still ask LUP2
to update it?
## without providing "to", it defaults to factorize till the end
oo <- LUP2(oo)
#Warning message:
#In LUP2(oo) :
# LU factorization is already completed; return input as it is
It tells you that nothing further can be done and return the input as it is.
Finally let's try a singular square matrix.
## this 4 x 4 matrix has rank 1
S <- tcrossprod(1:4, 2:5)
LUP2(S)
#Error in LUP2(S) : system is exactly singular: U[2, 2] = 0
## traceback
LUP2(S, to = 1)
# [,1] [,2] [,3] [,4]
#[1,] 8.00 12 16 20
#[2,] 0.50 0 0 0
#[3,] 0.75 0 0 0
#[4,] 0.25 0 0 0
#attr(,"to")
#[1] 1
#attr(,"pivot")
#[1] 4 2 3 1
来源:https://stackoverflow.com/questions/51687808/write-a-trackable-r-function-that-mimics-lapacks-dgetrf-for-lu-factorization