Question
I want to produce a new data frame from my existing one, where the columns in this new df are selected based on whether that variable is listed in a separate vector (i.e., as rows). The new df would therefore only contain those columns that were listed in the vector. I want to do this without having to manually indicate those columns, for efficiency's sake.
My intuition is that this is a pretty simple operation, but being very new to R I'm not exactly sure how to approach the problem.
Thanks!
Answer 1:
I just used this today (and in another answer on SO).
If you create a character vector of the column names you want to keep:
matchingList<-c("a","b","b")
and you have a data frame df with some of the same column names, then you can subset it like this:
newDF<- df[ ,which((names(df) %in% matchingList)==TRUE)]
Read left to right in plain English, the code says:
- Create a new data frame named newDF.
- Set newDF equal to a subset of df that keeps all rows (rows are specified in the space between the opening bracket and the comma; leaving it empty keeps every row):
  <- df[ ,
- Take the column names of df:
  which((names(df)
- Compare them against the names in the matching list:
  %in% matchingList)
- Keep the positions where that comparison returns TRUE:
  ==TRUE)
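In other words, names(df) %in% matchingList returns a logical vector with TRUE for each column name of df that also appears in matchingList, and which() turns those TRUE values into column positions, so only the columns present in both are kept. The ==TRUE comparison is redundant, since %in% already returns logical values, but it is harmless.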
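To see what each piece evaluates to, here is a small sketch. The data frame, its column names (a, b, c, d), and the contents of matchingList are made up for illustration and are not from the original question:

```r
# Hypothetical example data; column names and values are illustrative only
df <- data.frame(a = 1:3, b = 4:6, c = 7:9, d = 10:12)
matchingList <- c("a", "b", "e")   # "e" is deliberately not a column of df

# Step 1: logical vector with one entry per column of df
inList <- names(df) %in% matchingList
print(inList)        # TRUE TRUE FALSE FALSE

# Step 2: which() turns the TRUE entries into column positions
cols <- which(inList)
print(cols)          # 1 2

# Step 3: an empty row index keeps all rows; cols selects the columns
newDF <- df[ , cols]
print(names(newDF))  # "a" "b"
```

Names in matchingList that do not exist in df (here "e") are simply ignored, which is why the filter does not need to change when either side changes.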
There are shorter ways to write this, but this version lets you change df and the matching list extensively without having to rework the filter.
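For comparison, two shorter variants might look like the sketch below. These are not from the original answer. The example data is hypothetical, and drop = FALSE is added as a precaution, because matrix-style indexing of a data frame returns a plain vector when only one column is selected:

```r
# Hypothetical example data; names and values are illustrative only
df <- data.frame(a = 1:3, b = 4:6, c = 7:9)
matchingList <- c("a", "b", "e")

# Variant 1: index with the logical vector directly (no which(), no ==TRUE)
short1 <- df[ , names(df) %in% matchingList, drop = FALSE]

# Variant 2: intersect() keeps only the names present in both
short2 <- df[intersect(names(df), matchingList)]

# With a single matching column, drop = FALSE still returns a data frame
one <- df[ , names(df) %in% "a", drop = FALSE]
print(is.data.frame(one))
```

Variant 1 keeps the columns in the order they appear in df, while Variant 2 follows the order returned by intersect().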
Source: https://stackoverflow.com/questions/41863722/r-filter-columns-in-a-data-frame-by-a-list