Options to remove duplicates may include the following generic data structures:
- set: unordered, unique elements
- ordered set: ordered, unique elements
Here is a summary of how to quickly get either one in Python.
Given
from collections import OrderedDict
seq = [u"nowplaying", u"PBS", u"PBS", u"nowplaying", u"job", u"debate", u"thenandnow"]
Code
Option 1 - A set (unordered):
list(set(seq))
# e.g. ['thenandnow', 'PBS', 'debate', 'job', 'nowplaying'] (order is arbitrary and may differ between runs)
Option 2 - Python doesn't have ordered sets, but here are some ways to mimic one (insertion ordered):
list(OrderedDict.fromkeys(seq))
# ['nowplaying', 'PBS', 'job', 'debate', 'thenandnow']
list(dict.fromkeys(seq))  # Python 3.7+ (also works on CPython 3.6)
# ['nowplaying', 'PBS', 'job', 'debate', 'thenandnow']
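The last option is recommended on Python 3.7+, where dict insertion order is guaranteed by the language (in CPython 3.6 it is only an implementation detail). See more details in this post.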
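As a minimal sketch, the recommended option can be wrapped in a small helper. The name `dedupe` is hypothetical (not from any library), and the code assumes Python 3.7+ so that plain dicts keep insertion order:

```python
from collections import OrderedDict


def dedupe(items):
    """Return the unique items as a list, keeping first-seen order.

    Hypothetical helper: relies on dict keys being unique and,
    on Python 3.7+, insertion ordered.
    """
    return list(dict.fromkeys(items))


seq = ["nowplaying", "PBS", "PBS", "nowplaying", "job", "debate", "thenandnow"]
result = dedupe(seq)
print(result)
# ['nowplaying', 'PBS', 'job', 'debate', 'thenandnow']

# On Python 3.7+ this gives the same list as the OrderedDict variant.
same_as_ordered_dict = result == list(OrderedDict.fromkeys(seq))
```

Both variants keep the first occurrence of each element and drop later repeats.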
Note: the elements of the list must be hashable. See details on the latter example in this blog post. Furthermore, see R. Hettinger's post on the same technique; the order-preserving dict is extended from one of his early implementations. See also more on total ordering.
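To illustrate the hashability requirement, here is a rough sketch of what happens with unhashable elements such as lists, plus one possible fallback. The helper `dedupe_unhashable` is a hypothetical name; it compares by equality with a linear search, so it is O(n²) and only reasonable for short sequences:

```python
def dedupe_unhashable(items):
    """Order-preserving dedupe for items that may not be hashable.

    Hypothetical fallback: uses `in` on a list (equality comparison),
    which costs O(n) per lookup, so O(n^2) overall.
    """
    unique = []
    for item in items:
        if item not in unique:
            unique.append(item)
    return unique


rows = [[1, 2], [3], [1, 2], [3], [4]]

# Lists are unhashable, so they cannot be used as dict keys.
try:
    dict.fromkeys(rows)
    fromkeys_failed = False
except TypeError:
    fromkeys_failed = True

deduped_rows = dedupe_unhashable(rows)
print(deduped_rows)
# [[1, 2], [3], [4]]
```

If the elements can be converted to a hashable form (for example, lists to tuples), converting first and then using `dict.fromkeys` is usually the faster route.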