How to remove columns and rows that sum to 0 while preserving non-numeric columns

Submitted by 你离开我真会死。 on 2020-04-10 04:58:30

Question


Below is a subset of my data. I am trying to remove columns AND rows that sum to 0. The catch is that I want to preserve the leading identifier columns (columns 1 to 7 in this subset, Site through SznYr) in the resulting output. Any ideas? I've tried quite a few approaches. A tidy solution would be best.

Site   Date     Mon  Day  Yr    Szn     SznYr        A  B  C  D  E  F  G
B0001  7/29/97  7    29   1997  Summer  1997-Summer  0  0  0  0  0  0  0
B0001  7/29/97  7    29   1997  Summer  1997-Summer  0  0  1  0  0  0  0
B0001  7/29/97  7    29   1997  Summer  1997-Summer  0  0  0  3  0  0  0
B0001  7/29/97  7    29   1997  Summer  1997-Summer  0  0  0  0  0  0  10
B0002  7/28/97  7    28   1997  Summer  1997-Summer  0  0  0  0  5  0  0
B0002  7/28/97  7    28   1997  Summer  1997-Summer  0  0  0  0  0  0  0
B0002  7/28/97  7    28   1997  Summer  1997-Summer  0  0  0  0  0  6  0
B0002  7/28/97  7    28   1997  Summer  1997-Summer  0  0  0  0  0  0  0
B0002  7/28/97  7    28   1997  Summer  1997-Summer  0  0  0  0  0  0  0
B0002  7/28/97  7    28   1997  Summer  1997-Summer  0  0  0  0  0  0  8
B0002  6/28/07  6    28   2007  Summer  2007-Summer  0  3  6  1  7  0  1

Answer 1:


Try this:

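# remove rows whose count columns (everything after column 7) sum to 0
df <- df[rowSums(df[-(1:7)]) != 0, ]
# remove count columns that sum to 0, always keeping columns 1 to 7
df <- df[c(1:7, 7 + which(colSums(df[-(1:7)]) != 0))]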
#     Site    Date Mon Day   Yr    Szn       SznYr B C D E F  G
# 2  B0001 7/29/97   7  29 1997 Summer 1997-Summer 0 1 0 0 0  0
# 3  B0001 7/29/97   7  29 1997 Summer 1997-Summer 0 0 3 0 0  0
# 4  B0001 7/29/97   7  29 1997 Summer 1997-Summer 0 0 0 0 0 10
# 5  B0002 7/28/97   7  28 1997 Summer 1997-Summer 0 0 0 5 0  0
# 7  B0002 7/28/97   7  28 1997 Summer 1997-Summer 0 0 0 0 6  0
# 10 B0002 7/28/97   7  28 1997 Summer 1997-Summer 0 0 0 0 0  8
# 11 B0002 6/28/07   6  28 2007 Summer 2007-Summer 3 6 1 7 0  1

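You can also do this in one step, which gives the same output as @dan-y's answer. The result matches the two-step version in this specific case, but it can differ if your real data contain negative values, because here the column sums are computed before the zero-sum rows are dropped: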

    df <- df[rowSums(df[-(1:7)]) !=0,
             c(1:7,7 + which(colSums(df[-(1:7)]) !=0))]
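
To make the caveat about negative values concrete, here is a minimal sketch on a made-up data frame. The names `toy`, `num`, `two_step`, and `one_step` are hypothetical and not part of the original answers. Row `r1` sums to 0 only because its values cancel, so dropping it changes the column sums:

```r
# Hypothetical toy data: one id column plus two numeric columns
toy <- data.frame(
    id = c("r1", "r2", "r3"),
    X  = c(1, -1, 0),
    Y  = c(-1, 0, 2),
    stringsAsFactors = FALSE
)
num <- 2:3  # positions of the numeric columns

# two-step: drop zero-sum rows first, then recompute column sums on what is left
two_step <- toy[rowSums(toy[num]) != 0, ]
two_step <- two_step[c(1, 1 + which(colSums(two_step[num]) != 0))]

# one-step: row sums and column sums are both computed on the original data
one_step <- toy[rowSums(toy[num]) != 0,
                c(1, 1 + which(colSums(toy[num]) != 0))]

names(two_step)  # "id" "X" "Y"
names(one_step)  # "id" "Y"
```

In the original data `X` sums to 0 and is dropped by the one-step version, but after `r1` is removed it sums to -1 and is kept by the two-step version. Which behaviour is right depends on what you want the sums to mean.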



Answer 2:


This isn't fancy, but it's explicit and easily modifiable:

# generate example data
df <- data.frame(
    site = c(rep("B1", 4), rep("B2", 7)),
    szn  = rep("Summer", 11),
    A= c(0,0,0,0,0,0,0,0,0,0,0),
    B= c(0,0,0,0,0,0,0,0,0,0,3),
    C= c(0,1,0,0,0,0,0,0,0,0,6),
    D= c(0,0,3,0,0,0,0,0,0,0,1),
    E= c(0,0,0,0,5,0,0,0,0,0,7),
    F= c(0,0,0,0,0,0,6,0,0,0,0),
    G= c(0,0,0,10,0,0,0,0,0,8,1),
    stringsAsFactors = FALSE
)

# get names of the cols to always keep, and of the numeric cols to check for 0s
other_cols <- names(df)[1:2]
num_cols   <- names(df)[3:9]

# check rowsum and colsum
rows_to_keep <- rowSums(df[ , num_cols]) != 0
cols_to_keep <- colSums(df[ , num_cols]) != 0

# keep (1) rows that don't sum to zero 
#      (2) numeric cols that don't sum to zero, and
#      (3) the "other" cols that are non-numeric
df[rows_to_keep , c(other_cols, num_cols[cols_to_keep])]
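
Since the question asked for a tidy solution, the same idea can probably be expressed with dplyr. The following is only a sketch, assuming dplyr 1.0.0 or later (for `across()`); the data frame `dat` and the names `id_cols`, `num_cols`, `keep_cols`, and `result` are illustrative, not taken from the original answers. As in the code above, the column sums are computed on the original data, before any rows are dropped:

```r
library(dplyr)

# small illustrative data frame (hypothetical values)
dat <- data.frame(
    site = c("B1", "B1", "B2", "B2"),
    yr   = c(1997, 1997, 1997, 2007),
    A    = c(0, 0, 0, 0),
    B    = c(0, 1, 0, 3),
    C    = c(0, 0, 0, 6),
    stringsAsFactors = FALSE
)

id_cols  <- c("site", "yr")               # always kept, even though yr is numeric
num_cols <- setdiff(names(dat), id_cols)  # columns checked for zero sums

# count columns whose sum over the original data is not zero
keep_cols <- num_cols[colSums(dat[num_cols]) != 0]

result <- dat %>%
    filter(rowSums(across(all_of(num_cols))) != 0) %>%  # drop zero-sum rows
    select(all_of(id_cols), all_of(keep_cols))          # drop zero-sum columns

result
```

Listing the identifier columns by name, rather than selecting on `is.numeric`, keeps numeric identifiers such as the year from being treated as counts.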


Source: https://stackoverflow.com/questions/51825436/how-to-remove-columns-and-rows-that-sum-to-0-while-preserving-non-numeric-column
