Given a list of strings I want to remove the duplicates and original word.
For example:
lst = [\'a\', \'b\', \'c\', \'c\', \'c\', \'d\', \'e\', \'e\']
Use a collections.Counter() object, then keep only those values with a count of 1:
from collections import counter
[k for k, v in Counter(lst).items() if v == 1]
This is a O(N) algorithm; you just need to loop through the list of N items once, then a second loop over fewer items (< N) to extract those values that appear just once.
If order is important and you are using Python < 3.6, separate the steps:
counts = Counter(lst)
[k for k in lst if counts[k] == 1]
Demo:
>>> from collections import Counter
>>> lst = ['a', 'b', 'c', 'c', 'c', 'd', 'e', 'e']
>>> [k for k, v in Counter(lst).items() if v == 1]
['a', 'b', 'd']
>>> counts = Counter(lst)
>>> [k for k in lst if counts[k] == 1]
['a', 'b', 'd']
That the order is the same for both approaches is a coincidence; for Python versions before Python 3.6, other inputs may result in a different order.
In Python 3.6 the implementation for dictionaries changed and input order is now retained.
@Padraic:
If your list is:
lst = ['a', 'b', 'c', 'c', 'c', 'd', 'e', 'e']
then
list(set(lst))
would return the following:
['a', 'c', 'b', 'e', 'd']
which is not the thing adhankar wants..
Filtering all duplicates completely can be easily done with a list comprehension:
[item for item in lst if lst.count(item) == 1]
The output of this would be:
['a', 'b', 'd']
item stands for every item in the list lst, but it is only appended to the new list if lst.count(item) equals 1, which ensures, that the item only exists once in the original list lst.
Look up List Comprehension for more information: Python list comprehension documentation
lst = ['a', 'b', 'c', 'c', 'c', 'd', 'e', 'e']
from collections import Counter
c = Counter(lst)
print([k for k,v in c.items() if v == 1 ])
collections.Counter will count the occurrences of each element, we keep the elements whose count/value is == 1
with if v == 1
t = ['a', 'b', 'c', 'c', 'c', 'd', 'e', 'e']
print [a for a in t if t.count(a) == 1]
You could make a secondary empty list and only append items that aren't already in it.
oldList = ['a', 'b', 'c', 'c', 'c', 'd', 'e', 'e']
newList = []
for item in oldList:
if item not in newList:
newList.append(item)
print newList
I don't have an interpreter with me, but the logic seems sound.