Question
There are some answers on Stack Overflow to this type of question, but they are all inefficient and do not scale well.
To reproduce it, suppose I have data that looks like this:
library(data.table)
tempmat <- matrix(c(1,1,0,4, 1,0,0,4, 0,1,0,4, 0,1,1,4, 0,1,0,5), 5, 4, byrow = TRUE)
tempmat <- rbind(rep(0, 4), tempmat)  # prepend an all-zero row
tempmat <- data.table(tempmat)
names(tempmat) <- paste0("prod1vint", 1:4)
This is what the data look like, although it is MUCH bigger, so the solution cannot be an "apply" or row-wise based approach.
> tempmat
   prod1vint1 prod1vint2 prod1vint3 prod1vint4
1:          0          0          0          0
2:          1          1          0          4
3:          1          0          0          4
4:          0          1          0          4
5:          0          1          1          4
6:          0          1          0          5
I want to identify the column of the first nonzero element, so the output would look like this:
> tempmat
   prod1vint1 prod1vint2 prod1vint3 prod1vint4 firstnonzero
1:          0          0          0          0           NA
2:          1          1          0          4            1
3:          1          0          0          4            1
4:          0          1          0          4            2
5:          0          1          1          4            2
6:          0          1          0          5            2
Answer 1:
One option is to use rowSums with max.col, specifying ties.method = "first":
temp <- tempmat != 0
(NA^(rowSums(temp) == 0)) * max.col(temp, ties.method = "first")
#[1] NA 1 1 2 2 2
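With ties.method = "first", max.col gives the column index of the first maximum value in every row. Because temp is a logical matrix, that is the first TRUE, i.e. the first nonzero element. However, it returns 1 when all the values in a row are 0 (as in the 1st row), since 0 is then the maximum value in the row. To avoid that, we use rowSums to check whether the row has at least one nonzero value and multiply the max.col output by NA^(rowSums(temp) == 0). That factor is NA for all-zero rows (NA^TRUE is NA) and 1 for every other row (NA^FALSE is NA^0, which is 1).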
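To get the result as a column of the table, as in the desired output in the question, the same idea can be sketched roughly as follows. This is only one possible way to write it: the intermediate names nz and first are made up for illustration, and the all-zero rows are set to NA by indexing rather than by the NA^ multiplication, which should give the same result here.

```r
library(data.table)

# Rebuild the example data from the question
tempmat <- data.table(matrix(c(0,0,0,0, 1,1,0,4, 1,0,0,4,
                               0,1,0,4, 0,1,1,4, 0,1,0,5),
                             6, 4, byrow = TRUE))
names(tempmat) <- paste0("prod1vint", 1:4)

# Logical matrix: TRUE where the entry is nonzero (hypothetical name `nz`)
nz <- as.matrix(tempmat) != 0

# Column index of the first TRUE in each row
first <- max.col(nz, ties.method = "first")

# Rows with no nonzero entry have no valid index
first[rowSums(nz) == 0] <- NA_integer_

# Add the result by reference as a new column
tempmat[, firstnonzero := first]

tempmat$firstnonzero
#[1] NA  1  1  2  2  2
```

Everything here operates on whole matrices or vectors, so there is no row-wise apply involved.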
Source: https://stackoverflow.com/questions/55788404/efficiently-finding-first-nonzero-element-corresponding-column-of-a-data-table