R matrix individual shift operations of elements

Submitted by 佐手 on 2019-12-14 03:28:48

Question


I am trying to optimize some code I have written, as it is very slow for large datasets. I am not sure whether the following can be done with matrix operations, and I would appreciate any suggestions to make it faster.

I have a matrix of zeros and integers, and I would like to shift each entry down its column by the absolute value of the integer in that entry.

     [,1] [,2] [,3]
[1,]    0    0    0
[2,]    0   -4    0
[3,]    4    0    0
[4,]   -3   -2    0
[5,]    0    2   -1
[6,]    2   -2    0
[7,]    0    0    0
[8,]   -3   -3    0

The code I am using is the following:

#data
A<-matrix(data=c(0,0,4,-3,0,2,0,-3,0,-4,0,-2,2,-2,0,-3,0,0,0,0,-1,0,0,0),nrow=8,ncol=3)

#shift function
shift<-function(x)
{
  #create the output matrix
  out<-matrix(data=0,nrow=8,ncol=1)

  #for loop to create the shift matrix
  for(i in seq(1,8,by=1))
  {
    if(i+abs(x[i])<=8)
    {
      #find the non zero
      if(x[i]!=0)
      {
        #if there is already a number put zero  
        if(out[i+abs(x[i]),1]!=0)
        {
          out[i+abs(x[i]),1]=0
        } else {
          #shift
          out[i+abs(x[i]),1]=x[i]
        }
      }
    }
  }

  #return object
  return(out)
}

#run the logic
shift_mat<-sapply(1:ncol(A),FUN=function(k) shift(A[,k]))

and the result is:

     [,1] [,2] [,3]
[1,]    0    0    0
[2,]    0    0    0
[3,]    0    0    0
[4,]    0    0    0
[5,]    0    0    0
[6,]    0    0   -1
[7,]    0    2    0
[8,]    2   -2    0

The rules for every column are the following:

  1. starting from the top, find the first entry that is different from zero
  2. shift it down by the absolute value of that entry
  3. if there is already another entry at the target position, put zero there
  4. repeat for the remaining non-zero entries of the column, then for the next column

Thanks,

Nikos


Answer 1:


This is a bit cleaner and about 40% faster using your example on my machine. Maybe the speed improvement will be greater using your larger data?

You should use a matrix of integers. It uses less memory and some operations are faster:

A <- matrix(as.integer(c(0,0,4,-3,0,2,0,-3,0,-4,0,-2,2,
                        -2,0,-3,0,0,0,0,-1,0,0,0)), nrow = 8, ncol = 3)
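To make the memory point concrete, here is a minimal sketch comparing the two storage types for the same data. The names vals, A_dbl and A_int are only illustrative and do not appear in the question.

```r
# The same 8 x 3 data stored as double (the default for c(...)) and as integer
vals  <- c(0,0,4,-3,0,2,0,-3,0,-4,0,-2,2,-2,0,-3,0,0,0,0,-1,0,0,0)
A_dbl <- matrix(vals, nrow = 8, ncol = 3)
A_int <- matrix(as.integer(vals), nrow = 8, ncol = 3)

typeof(A_dbl)   # "double"
typeof(A_int)   # "integer"

# Doubles take 8 bytes per element and integers 4,
# so the integer matrix is the smaller object
object.size(A_int) < object.size(A_dbl)

# An existing matrix can also be converted in place, keeping its dimensions
storage.mode(A_dbl) <- "integer"
identical(A_dbl, A_int)
```

On a matrix this small the saving is negligible; it should matter more for the large datasets mentioned in the question.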

Each column is a vector, and so should your output be. I replaced the matrices with vectors and also made your code more robust by removing the hardcoded number of rows:

shift <- function(x) {
  n <- length(x)
  y <- rep(0L, n)
  for(i in seq_len(n)) {
    if (x[i] == 0L) next
    j <- i + abs(x[i])
    if (j > n) next
    y[j] <- if (y[j] != 0L) 0L else x[i]
  }
  return(y)
}

You can run it using apply:

shift_mat <- apply(A, 2, shift)
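As a quick sanity check, the sketch below puts the pieces of this answer together and compares the output with the result matrix shown in the question. The name expected is only illustrative; its values are copied by hand from the question.

```r
# Integer version of the question's matrix
A <- matrix(as.integer(c(0,0,4,-3,0,2,0,-3,
                         0,-4,0,-2,2,-2,0,-3,
                         0,0,0,0,-1,0,0,0)), nrow = 8, ncol = 3)

# Loop-based shift for a single column, as given in this answer
shift <- function(x) {
  n <- length(x)
  y <- rep(0L, n)
  for (i in seq_len(n)) {
    if (x[i] == 0L) next          # nothing to move
    j <- i + abs(x[i])            # target row
    if (j > n) next               # entry would fall off the column
    y[j] <- if (y[j] != 0L) 0L else x[i]
  }
  y
}

shift_mat <- apply(A, 2, shift)

# Result from the question: only four cells are non-zero
expected <- matrix(0L, nrow = 8, ncol = 3)
expected[8, 1] <- 2L
expected[7, 2] <- 2L
expected[8, 2] <- -2L
expected[6, 3] <- -1L

all(shift_mat == expected)   # TRUE
```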



Answer 2:


The shift operation can be vectorized. Let's just take the first column of your data to see how:

v = c(0,0,4,-3,0,2,0,-3)

# index of the elements that could be non-zero in the final result
index = ifelse (v != 0 & abs(v) + seq_along(v) <= length(v),
                abs(v) + seq_along(v), 0)
# [1] 0 0 7 7 0 8 0 0


# now just need to filter out the duplicated entries
index = ave(index, index, FUN = function(x) {if (length(x) > 1) 0 else x})
# [1] 0 0 0 0 0 8 0 0

# home at last
res = integer(length(v))
res[index] = v[which(index != 0)]
res
# [1] 0 0 0 0 0 0 0 2

You can then put the above into a function and lapply it over your data.frame, or apply it on the columns of your matrix.
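A minimal sketch of that wrapping step follows. The function name shift_vec is an assumption, and the body simply repeats the steps above, with the final assignment written slightly more defensively. One caveat: when three entries target the same row, this version leaves a zero there, whereas the loop in the question would keep the third entry, so the two agree only when at most two entries collide on a row.

```r
# shift_vec is a hypothetical name; the body follows the steps shown above
shift_vec <- function(v) {
  n <- length(v)
  target <- seq_along(v) + abs(v)
  # keep only non-zero entries whose target row is inside the column
  index <- ifelse(v != 0 & target <= n, target, 0)
  # zero out every target row that is claimed by more than one entry
  index <- ave(index, index, FUN = function(x) if (length(x) > 1) 0 else x)
  keep <- index != 0
  res <- numeric(n)
  res[index[keep]] <- v[keep]
  res
}

A <- matrix(c(0,0,4,-3,0,2,0,-3,0,-4,0,-2,2,-2,0,-3,0,0,0,0,-1,0,0,0),
            nrow = 8, ncol = 3)
shift_mat <- apply(A, 2, shift_vec)

shift_mat[, 1]
# [1] 0 0 0 0 0 0 0 2

# Three entries aiming at row 4: every one of them is dropped here
shift_vec(c(3, 2, 1, 0))
# [1] 0 0 0 0
```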

Unsurprisingly the biggest bottleneck above is the ave function, and you can replace that line with the following data.table construct (don't forget to require(data.table) somewhere) to speed it up considerably:

index = data.table(index)[, index := if(.N > 1) 0 else index, by = index][, index]
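If adding a package dependency is not an option, a base-R alternative to the ave line is sketched below. This is a swapped-in technique and not part of the original answer: it marks every value that occurs more than once by calling duplicated() from both ends. Whether it is faster than ave on large data is something to measure rather than assume.

```r
# Target rows for the first column, as computed in the steps above
index <- c(0, 0, 7, 7, 0, 8, 0, 0)

# TRUE for every element whose value appears more than once
# (scanning from the front catches later copies, from the back the first copy)
dup <- duplicated(index) | duplicated(index, fromLast = TRUE)

# Colliding targets are reset to 0; the zeros themselves are also flagged,
# which is harmless because they are already 0
index[dup] <- 0
index
# [1] 0 0 0 0 0 8 0 0
```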


Source: https://stackoverflow.com/questions/18281936/r-matrix-individual-shift-operations-of-elements
