Does Python have an ordered set?

予麋鹿 2020-11-21 13:20

Python has an ordered dictionary. What about an ordered set?

14 Answers
  • 2020-11-21 13:45

    I can do you one better than an OrderedSet: boltons has a pure-Python, 2/3-compatible IndexedSet type that is not only an ordered set, but also supports indexing (as with lists).

    Simply pip install boltons (or copy setutils.py into your codebase), import the IndexedSet and:

    >>> from boltons.setutils import IndexedSet
    >>> x = IndexedSet(list(range(4)) + list(range(8)))
    >>> x
    IndexedSet([0, 1, 2, 3, 4, 5, 6, 7])
    >>> x - set(range(2))
    IndexedSet([2, 3, 4, 5, 6, 7])
    >>> x[-1]
    7
    >>> fcr = IndexedSet('freecreditreport.com')
    >>> ''.join(fcr[:fcr.index('.')])
    'frecditpo'
    

    Everything is unique and retained in order. Full disclosure: I wrote the IndexedSet, but that also means you can bug me if there are any issues. :)

  • 2020-11-21 13:52

    Implementations on PyPI

    While others have pointed out that Python has no built-in insertion-order-preserving set (yet), this question is missing an answer that lists what is available on PyPI.

    There are the packages:

    • ordered-set (Python based)
    • orderedset (Cython based)
    • collections-extended
    • boltons (under setutils.IndexedSet, Python-based)
    • oset (last updated in 2012)

    Some of these implementations are based on the recipe posted by Raymond Hettinger to ActiveState which is also mentioned in other answers here.

    Some differences

    • ordered-set (version 1.1)
        • advantage: O(1) for lookups by index (e.g. my_set[5])
    • oset (version 0.1.3)
        • advantage: O(1) for remove(item)
        • disadvantage: apparently O(n) for lookups by index

    Both implementations have O(1) for add(item) and __contains__(item) (item in my_set).
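
    The trade-off listed above can be illustrated with a minimal sketch. This is not the code of either package; the class name ListBackedOrderedSet is hypothetical. It assumes the items live in a list (for O(1) indexing) next to a dict that maps each item to its position (for O(1) membership tests), which makes removal O(n) because every later position has to shift.

```python
class ListBackedOrderedSet:
    """Hypothetical illustration, not the implementation of any PyPI package."""

    def __init__(self, iterable=()):
        self._items = []   # items in insertion order
        self._index = {}   # item -> position in self._items
        for item in iterable:
            self.add(item)

    def add(self, item):
        # O(1): append only if the item is new.
        if item not in self._index:
            self._index[item] = len(self._items)
            self._items.append(item)

    def __contains__(self, item):
        # O(1) dict lookup.
        return item in self._index

    def __getitem__(self, position):
        # O(1) lookup by index.
        return self._items[position]

    def remove(self, item):
        # O(n): every item after the removed one moves down by one position.
        position = self._index.pop(item)  # raises KeyError if absent
        del self._items[position]
        for later in self._items[position:]:
            self._index[later] -= 1

    def __len__(self):
        return len(self._items)

    def __iter__(self):
        return iter(self._items)


s = ListBackedOrderedSet("abracadabra")
print(list(s))         # ['a', 'b', 'r', 'c', 'd']
print(s[2])            # r
s.remove("b")
print(list(s), s[1])   # ['a', 'r', 'c', 'd'] r
```

    Swapping the list for a linked structure would give the opposite profile (cheap removal, linear-time indexing), which matches the difference described for the two packages.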

  • 2020-11-21 13:52

    If you're already using pandas in your code, its Index object behaves much like an ordered set, as shown in this article.

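    Examples adapted from the article:

    import pandas as pd

    indA = pd.Index([1, 3, 5, 7, 9])
    indB = pd.Index([2, 3, 5, 7, 11])

    # In pandas 2.x the &, |, - and ^ operators act element-wise,
    # so use the named set methods instead.
    indA.intersection(indB)          # intersection
    indA.union(indB)                 # union
    indA.difference(indB)            # difference
    indA.symmetric_difference(indB)  # symmetric difference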
    
  • 2020-11-21 13:52

    The ParallelRegression package provides a setList( ) ordered set class that is more method-complete than the options based on the ActiveState recipe. It supports all methods available for lists and most if not all methods available for sets.

  • 2020-11-21 13:54

    There's no OrderedSet in the standard library. I made a cheat sheet of the standard data structures for your reference.

    DataStructure = {
        'Collections': {
            'Map': [
                ('dict', 'collections.OrderedDict', 'collections.defaultdict'),
                ('collections.ChainMap', 'types.MappingProxyType')
            ],
            'Set': [('set', 'frozenset'), {'multiset': 'collections.Counter'}]
        },
        'Sequence': {
            'Basic': ['list', 'tuple', 'iterator']
        },
        'Algorithm': {
            'Priority': ['heapq', 'queue.PriorityQueue'],
            'Queue': ['queue.Queue', 'multiprocessing.Queue'],
            'Stack': ['collections.deque', 'queue.LifoQueue']
        },
        'text_sequence': ['str', 'bytes', 'bytearray']
    }
    
  • 2020-11-21 13:55

    As other answers mention, as of Python 3.7 the dict preserves insertion order by definition. Instead of subclassing OrderedDict, we can subclass collections.abc.MutableSet or typing.MutableSet and use the dict's keys to store our values.

    import typing as t

    T = t.TypeVar("T")


    class OrderedSet(t.MutableSet[T]):
        """A set that preserves insertion order by internally using a dict."""

        def __init__(self, iterable: t.Iterable[T] = ()):
            # dict keys keep insertion order (Python 3.7+); the values are unused.
            self._d = dict.fromkeys(iterable)

        def add(self, x: T) -> None:
            self._d[x] = None

        def discard(self, x: T) -> None:
            # discard() must not raise if x is absent, hence the default.
            self._d.pop(x, None)

        def __contains__(self, x: object) -> bool:
            return x in self._d

        def __len__(self) -> int:
            return len(self._d)

        def __iter__(self) -> t.Iterator[T]:
            return iter(self._d)
    

    Then just:

    x = OrderedSet([1, 2, -1, "bar"])
    x.add(0)
    assert list(x) == [1, 2, -1, "bar", 0]
    

    I put this code in a small library, so anyone can just pip install it.
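
    When a dedicated class is more than the task needs, the same dict ordering guarantee (Python 3.7+) can be used directly. The sketch below shows order-preserving de-duplication and a few set-like operations built from plain dicts and list comprehensions; the variable names are illustrative only.

```python
# Requires Python 3.7+, where dict preserves insertion order.
items = [3, 1, 3, 2, 1, 5]
other = [1, 5, 8]

# De-duplicate while keeping first-seen order.
unique = list(dict.fromkeys(items))

# Ordered intersection and difference, keeping the order of `items`.
other_keys = set(other)
both = [x for x in unique if x in other_keys]
only_items = [x for x in unique if x not in other_keys]

# Ordered union: keys of `items` first, then any new keys from `other`.
union = list(dict.fromkeys(items + other))

print(unique)      # [3, 1, 2, 5]
print(both)        # [1, 5]
print(only_items)  # [3, 2]
print(union)       # [3, 1, 2, 5, 8]
```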
