how to calculate the Euclidean norm of a vector in R?

后端 未结 10 1094
灰色年华
灰色年华 2021-02-01 13:22

I tried norm, but I think it gives the wrong result. (the norm of c(1, 2, 3) is sqrt(1*1+2*2+3*3), but it returns 6..

相关标签:
10条回答
  • 2021-02-01 14:05

    I was surprised that nobody had tried profiling the results for the above suggested methods, so I did that. I've used a random uniform function to generate a list and used that for repetition (Just a simple back of the envelop type of benchmark):

    > uut <- lapply(1:100000, function(x) {runif(1000, min=-10^10, max=10^10)})
    > norm_vec <- function(x) sqrt(sum(x^2))
    > norm_vec2 <- function(x){sqrt(crossprod(x))}
    > 
    > system.time(lapply(uut, norm_vec))
       user  system elapsed 
       0.58    0.00    0.58 
    > system.time(lapply(uut, norm_vec2))
       user  system elapsed 
       0.35    0.00    0.34 
    > system.time(lapply(uut, norm, type="2"))
       user  system elapsed 
       6.75    0.00    6.78 
    > system.time(lapply(lapply(uut, as.matrix), norm))
       user  system elapsed 
       2.70    0.00    2.73 
    

    It seems that taking the power and then sqrt manually is faster than the builtin norm for real values vectors at least. This is probably because norm internally does an SVD:

    > norm
    function (x, type = c("O", "I", "F", "M", "2")) 
    {
        if (identical("2", type)) {
            svd(x, nu = 0L, nv = 0L)$d[1L]
        }
        else .Internal(La_dlange(x, type))
    }
    

    and the SVD function internally converts the vector into a matrix, and does more complicated stuff:

    > svd
    function (x, nu = min(n, p), nv = min(n, p), LINPACK = FALSE) 
    {
        x <- as.matrix(x)
        ...
    

    EDIT (20 Oct 2019):

    There have been some comments to point out the correctness issue which the above test case doesn't bring out:

    > norm_vec(c(10^155))
    [1] Inf
    > norm(c(10^155), type="2")
    [1] 1e+155
    

    This happens because large numbers are considered as infinity in R:

    > 10^309
    [1] Inf
    

    So, it looks like:

    It seems that taking the power and then sqrt manually is faster than the builtin norm for real values vectors for small numbers.

    How small? So that the sum of squares doesn't overflow.

    0 讨论(0)
  • 2021-02-01 14:08

    Answer for Euclidean length of a vector (k-norm) with scaling to avoid destructive underflow and overflow is

    norm <- function(x, k) { max(abs(x))*(sum((abs(x)/max(abs(x)))^k))^(1/k) }
    

    See below for explanation.

    1. Euclidean length of a vector with no scaling:


    norm() is a vector-valued function which computes the length of the vector. It takes two arguments such as the vector x of class matrix and the type of norm k of class integer.

    norm <- function(x, k) {
      # x = matrix with column vector and with dimensions mx1 or mxn
      # k = type of norm with integer from 1 to +Inf
      stopifnot(k >= 1) # check for the integer value of k greater than 0
      stopifnot(length(k) == 1) # check for length of k to be 1. The variable k is not vectorized.
      if(k == Inf) {
        # infinity norm
        return(apply(x, 2, function(vec) max(abs(vec)) ))
      } else {
        # k-norm
        return(apply(x, 2, function(vec) (sum((abs(vec))^k))^(1/k) ))
      }
    }
    
    x <- matrix(c(1,-2,3,-4)) # column matrix
    sapply(c(1:4, Inf), function(k) norm(x = x, k = k))
    # [1] 10.000000  5.477226  4.641589  4.337613  4.000000
    
    • 1-norm (10.0) converges to infinity-norm (4.0).
    • k-norm is also called as "Euclidean norm in Euclidean n-dimensional space".

    Note: In the norm() function definition, for vectors with real components, the absolute values can be dropped in norm-2k or even indexed norms, where k >= 1.

    If you are confused with the norm function definition, you can read each one individually as given below.

    norm_1 <- function(x) sum(abs(x))
    norm_2 <- function(x) (sum((abs(x))^2))^(1/2)
    norm_3 <- function(x) (sum((abs(x))^3))^(1/3)
    norm_4 <- function(x) (sum((abs(x))^4))^(1/4)
    norm_k <- function(x) (sum((abs(x))^k))^(1/k)
    norm_inf <- max(abs(x))
    

    2. Euclidean length of a vector with scaling to avoid destructive overflow and underflow issues:


    Note-2: The only problem with this solution norm() is that it does not guard against overflow or underflow problems as alluded here and here.

    Fortunately, someone had already solved this problem for 2-norm (euclidean length) in the blas (basic linear algebra subroutines) fortran library. A description of this problem can be found in the textbook of "Numerical Methods and Software by Kahaner, Moler and Nash" - Chapter-1, Section 1.3, page - 7-9.

    The name of the fortran subroutine is dnrm2.f, which handles destructive overflow and underflow issues in the norm() by scaling with the maximum of the vector components. The destructive overflow and underflow problem arise due to radical operation in the norm() function.

    I will show how to implement dnrm2.f in R below.

    #1. find the maximum among components of vector-x
    max_x <- max(x)
    #2. scale or divide the components of vector by max_x
    scaled_x <- x/max_x
    #3. take square of the scaled vector-x
    sq_scaled_x <- (scaled_x)^2
    #4. sum the square of scaled vector-x
    sum_sq_scaled_x <- sum(sq_scaled_x)
    #5. take square root of sum_sq_scaled_x
    rt_sum_sq_scaled_x  <- sqrt(sum_sq_scaled_x)
    #6. multiply the maximum of vector x with rt_sum_sq_scaled_x
    max_x*rt_sum_sq_scaled_x
    

    one-liner of the above 6-steps of dnrm2.f in R is:

    # Euclidean length of vector - 2norm
    max(x)*sqrt(sum((x/max(x))^2))
    

    Lets try example vectors to compute 2-norm (see other solutions in this thread) for this problem.

    x = c(-8e+299, -6e+299, 5e+299, -8e+298, -5e+299)
    max(x)*sqrt(sum((x/max(x))^2))
    # [1] 1.227355e+300
    
    x <- (c(1,-2,3,-4))
    max(x)*sqrt(sum((x/max(x))^2))
    # [1] 5.477226
    

    Therefore, the recommended way to implement a generalized solution for k-norm in R is that single line, which guard against the destructive overflow or underflow problems. To improve this one-liner, you can use a combination of norm() without scaling for a vector containing not-too-small or not-too-large components and knorm() with scaling for a vector with too-small or too-large components. Implementing scaling for all vectors results in too many calculations. I did not implement this improvement in knorm() given below.

    # one-liner for k-norm - generalized form for all norms including infinity-norm:
    max(abs(x))*(sum((abs(x)/max(abs(x)))^k))^(1/k)
    
    # knorm() function using the above one-liner.
    knorm <- function(x, k) { 
      # x = matrix with column vector and with dimensions mx1 or mxn
      # k = type of norm with integer from 1 to +Inf
      stopifnot(k >= 1) # check for the integer value of k greater than 0
      stopifnot(length(k) == 1) # check for length of k to be 1. The variable k is not vectorized.
      # covert elements of matrix to its absolute values
      x <- abs(x)
      if(k == Inf) { # infinity-norm
        return(apply(x, 2, function(vec) max(vec)))
      } else { # k-norm
        return(apply(x, 2, function(vec) {
          max_vec <- max(vec)
          return(max_vec*(sum((vec/max_vec)^k))^(1/k))
        }))
      }
    }
    
    # 2-norm
    x <- matrix(c(-8e+299, -6e+299, 5e+299, -8e+298, -5e+299))
    sapply(2, function(k) knorm(x = x, k = k))
    # [1] 1.227355e+300
    
    # 1-norm, 2-norm, 3-norm, 4-norm, and infinity-norm
    sapply(c(1:4, Inf), function(k) knorm(x = x, k = k))
    # [1] 2.480000e+300 1.227355e+300 9.927854e+299 9.027789e+299 8.000000e+299
    
    x <- matrix(c(1,-2,3,-4))
    sapply(c(1:4, Inf), function(k) knorm(x = x, k = k))
    # [1] 10.000000  5.477226  4.641589  4.337613  4.000000
    
    x <- matrix(c(1,-2,3,-4, 0, -8e+299, -6e+299, 5e+299, -8e+298, -5e+299), nc = 2)
    sapply(c(1:4, Inf), function(k) knorm(x = x, k = k))
    #           [,1]          [,2]          [,3]          [,4]   [,5]
    # [1,]  1.00e+01  5.477226e+00  4.641589e+00  4.337613e+00  4e+00
    # [2,] 2.48e+300 1.227355e+300 9.927854e+299 9.027789e+299 8e+299
    
    0 讨论(0)
  • 2021-02-01 14:09
    norm(x, type = c("O", "I", "F", "M", "2"))
    

    The default is "O".

    "O", "o" or "1" specifies the one norm, (maximum absolute column sum);

    "F" or "f" specifies the Frobenius norm (the Euclidean norm of x treated as if it were a vector);

    norm(as.matrix(x1),"o")
    

    The result is 6, same as norm(as.matrix(x1))

    norm(as.matrix(x1),"f")
    

    The result is sqrt(1*1+2*2+3*3)

    So, norm(as.matrix(x1),"f") is answer.

    0 讨论(0)
  • 2021-02-01 14:20

    Create your matrix as column vise using cbind then the norm function works well with Frobenius norm (the Euclidean norm) as an argument.

    x1<-cbind(1:3)

    norm(x1,"f")

    [1] 3.741657

    sqrt(1*1+2*2+3*3)

    [1] 3.741657

    0 讨论(0)
提交回复
热议问题