问题
I'm using Google Sheets and this is way beyond my simple scripting.
I have numerous cells containing comma separated values;
AA, BB, CC, BBB, CCC, CCCCC, AA, BBB, BB
BB, ZZ, ZZ, AA, BB, CC, BBB, CCC, CCCCC, AA, BBB, BB
I'm trying to return:
AA, BB, CC, BBB, CCC, CCCCC etc.
BB, ZZ, AA, CC, BBB, CCC, CCCCC etc.
... remove the duplicates. Per cell.
I can't get my head around a solution. I've tried every online tool that removes duplicates. BUT they all remove duplicates throughout my document.
Part of the problem is, I can't put the cells in 'alphabetical' order (which would make things simple) they have to be kept in the original order they appear.
I also have, at my disposal (but beyond my skill) Open Refine which I believe is a clever tool.
回答1:
Here is how to do that in OpenRefine.
The formula I used is :
value.split(',').uniques().join(',')
It means : split the value in the cells by commas, remove duplicates, join them again using commas.
EDIT :
Another solution in OpenRefine using Python instead of GREL. This one keep better the original order.
Python/Jython Script:
from collections import OrderedDict
dedup = list(OrderedDict.fromkeys(value.replace(' ','').split(',')))
return ",".join(dedup)
来源:https://stackoverflow.com/questions/50937289/removing-duplicate-strings-from-a-comma-separated-list-in-a-cell