Question
Below is a subset of my data. I am trying to remove columns AND rows that sum to 0. The catch is that I want to preserve columns 1 to 7 (Site through SznYr) in the resulting output. Any ideas? I've tried quite a few. A tidy solution would be best.
Site Date Mon Day Yr Szn SznYr A B C D E F G
B0001 7/29/97 7 29 1997 Summer 1997-Summer 0 0 0 0 0 0 0
B0001 7/29/97 7 29 1997 Summer 1997-Summer 0 0 1 0 0 0 0
B0001 7/29/97 7 29 1997 Summer 1997-Summer 0 0 0 3 0 0 0
B0001 7/29/97 7 29 1997 Summer 1997-Summer 0 0 0 0 0 0 10
B0002 7/28/97 7 28 1997 Summer 1997-Summer 0 0 0 0 5 0 0
B0002 7/28/97 7 28 1997 Summer 1997-Summer 0 0 0 0 0 0 0
B0002 7/28/97 7 28 1997 Summer 1997-Summer 0 0 0 0 0 6 0
B0002 7/28/97 7 28 1997 Summer 1997-Summer 0 0 0 0 0 0 0
B0002 7/28/97 7 28 1997 Summer 1997-Summer 0 0 0 0 0 0 0
B0002 7/28/97 7 28 1997 Summer 1997-Summer 0 0 0 0 0 0 8
B0002 6/28/07 6 28 2007 Summer 2007-Summer 0 3 6 1 7 0 1
Answer 1:
Try this:
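# remove rows whose count columns (everything after the first 7) sum to 0
df <- df[rowSums(df[-(1:7)]) != 0, ]
# remove count columns that sum to 0, always keeping columns 1 to 7
df <- df[c(1:7, 7 + which(colSums(df[-(1:7)]) != 0))]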
# Site Date Mon Day Yr Szn SznYr B C D E F G
# 2 B0001 7/29/97 7 29 1997 Summer 1997-Summer 0 1 0 0 0 0
# 3 B0001 7/29/97 7 29 1997 Summer 1997-Summer 0 0 3 0 0 0
# 4 B0001 7/29/97 7 29 1997 Summer 1997-Summer 0 0 0 0 0 10
# 5 B0002 7/28/97 7 28 1997 Summer 1997-Summer 0 0 0 5 0 0
# 7 B0002 7/28/97 7 28 1997 Summer 1997-Summer 0 0 0 0 6 0
# 10 B0002 7/28/97 7 28 1997 Summer 1997-Summer 0 0 0 0 0 8
# 11 B0002 6/28/07 6 28 2007 Summer 2007-Summer 3 6 1 7 0 1
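You can also do this in one step, which gives the same output as @dan-y's answer. In this specific case the result matches the two-step version above, but it can differ if your real data contain negative values: here the column sums are computed on the original data, before the zero-sum rows are dropped, and with negative values a dropped row is not necessarily all zeros.
df <- df[rowSums(df[-(1:7)]) != 0,
         c(1:7, 7 + which(colSums(df[-(1:7)]) != 0))]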
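To make the caveat about negative values concrete, here is a minimal sketch on made-up data. The data frame `toy` and its columns `id`, `x`, `y` are hypothetical and not from the question, and only the first column is treated as an identifier. Row `r1` sums to 0 and is dropped by both versions, but column `x` sums to 0 only while `r1` is still present.

```r
# Hypothetical toy data with a negative value (not from the question)
toy <- data.frame(
  id = c("r1", "r2", "r3"),
  x  = c(1, -1, 0),
  y  = c(-1, 0, 2)
)

# Two-step version: drop zero-sum rows first, then check column sums
two <- toy[rowSums(toy[-1]) != 0, ]
two <- two[c(1, 1 + which(colSums(two[-1]) != 0))]

# One-step version: row and column sums are both computed on the original data
one <- toy[rowSums(toy[-1]) != 0,
           c(1, 1 + which(colSums(toy[-1]) != 0))]

names(two)  # "id" "x" "y"
names(one)  # "id" "y"
```

Which behaviour is preferable depends on whether a column should be judged on all rows or only on the rows that survive. With non-negative counts, as in the question, the two versions agree.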
Answer 2:
This isn't fancy, but it's explicit and easily modifiable:
# generate example data
df <- data.frame(
site = c(rep("B1", 4), rep("B2", 7)),
szn = rep("Summer", 11),
A= c(0,0,0,0,0,0,0,0,0,0,0),
B= c(0,0,0,0,0,0,0,0,0,0,3),
C= c(0,1,0,0,0,0,0,0,0,0,6),
D= c(0,0,3,0,0,0,0,0,0,0,1),
E= c(0,0,0,0,5,0,0,0,0,0,7),
F= c(0,0,0,0,0,0,6,0,0,0,0),
G= c(0,0,0,10,0,0,0,0,0,8,1),
stringsAsFactors = FALSE
)
# names of the cols to always keep, and of the numeric cols to check for 0 sums
other_cols <- names(df)[1:2]
num_cols <- names(df)[3:9]
# check rowsum and colsum
rows_to_keep <- rowSums(df[ , num_cols]) != 0
cols_to_keep <- colSums(df[ , num_cols]) != 0
# keep (1) rows that don't sum to zero
# (2) numeric cols that don't sum to zero, and
# (3) the "other" cols that are non-numeric
df[rows_to_keep , c(other_cols, num_cols[cols_to_keep])]
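Since the question asks for a tidy solution, the same logic might be written with dplyr roughly as follows. This is a sketch rather than part of either answer: it assumes dplyr 1.0.0 or later (for `across()` and `where()`), and it rebuilds a small example data frame with the reduced column names `site` and `szn`. The vectors `id_cols` and `num_cols` are illustrative names.

```r
library(dplyr)

# Example data with reduced identifier columns (names are illustrative)
df <- data.frame(
  site = c(rep("B1", 4), rep("B2", 7)),
  szn  = rep("Summer", 11),
  A = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0),
  B = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3),
  C = c(0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 6),
  D = c(0, 0, 3, 0, 0, 0, 0, 0, 0, 0, 1),
  E = c(0, 0, 0, 0, 5, 0, 0, 0, 0, 0, 7),
  F = c(0, 0, 0, 0, 0, 0, 6, 0, 0, 0, 0),
  G = c(0, 0, 0, 10, 0, 0, 0, 0, 0, 8, 1),
  stringsAsFactors = FALSE
)

id_cols  <- c("site", "szn")   # columns to always keep
num_cols <- LETTERS[1:7]       # count columns to check, "A" to "G"

result <- df %>%
  # keep rows whose count columns do not sum to 0
  filter(rowSums(across(all_of(num_cols))) != 0) %>%
  # keep the id columns, plus any numeric column that does not sum to 0
  select(all_of(id_cols), where(~ is.numeric(.x) && sum(.x) != 0))

result
```

Because `select()` runs after `filter()`, column sums here are taken over the surviving rows, as in the two-step base R version. Identifier columns named in `id_cols` are kept even if they are numeric and happen to sum to 0.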
Source: https://stackoverflow.com/questions/51825436/how-to-remove-columns-and-rows-that-sum-to-0-while-preserving-non-numeric-column