Question
I have a data frame with two columns, ID and Product, as below:
ID Product
A Clothing, Clothing Food, Furniture, Furniture
B Food,Food,Food, Clothing
C Food, Clothing, Clothing
I need to keep only the unique products for each ID, for example:
ID Product
A Clothing, Food, Furniture
B Food, Clothing
C Food, Clothing
How do I do this using R?
Answer 1:
If the dataset contains multiple delimiters, one way is to split the 'Product' column on all of them, keep the unique values, and then paste them back together (toString), grouped by 'ID'. Here we use data.table methods.
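library(data.table)
# Convert df1 to a data.table by reference, then for each ID split Product on
# commas or whitespace, keep the unique values, and rejoin them with ", "
setDT(df1)[, list(Product = toString(unique(strsplit(Product,
    ',\\s*|\\s+')[[1]]))), by = ID]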
# ID Product
#1: A Clothing, Food, Furniture
#2: B Food, Clothing
#3: C Food, Clothing
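For anyone who wants to try this without data.table, here is a minimal base R sketch of the same split, unique, and paste idea. The sample data is reconstructed from the question, and the helper name dedupe is hypothetical (it is not part of the original answer). It assumes 'Product' is a character column rather than a factor.

```r
# Sample data reconstructed from the question; df1 is the name used in the answer
df1 <- data.frame(
  ID = c("A", "B", "C"),
  Product = c("Clothing, Clothing Food, Furniture, Furniture",
              "Food,Food,Food, Clothing",
              "Food, Clothing, Clothing"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: split one cell on a comma (plus optional spaces) or on
# whitespace, drop duplicates, and collapse back to a ", "-separated string
dedupe <- function(x) {
  parts <- strsplit(x, ",\\s*|\\s+")[[1]]
  toString(unique(parts))
}

# Apply the helper to every row of the Product column
df1$Product <- vapply(df1$Product, dedupe, character(1), USE.NAMES = FALSE)
df1
```

Unlike the data.table version, this works row by row rather than by group, so it only matches the grouped result when each ID appears on a single row, as in the example data.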
Source: https://stackoverflow.com/questions/35286596/how-to-remove-duplicate-comma-separated-character-values-from-each-cell-of-a-col