How can I get 2.x-like sorting behaviour in Python 3.x?

后端未结

关注

 10  810

I\'m trying to replicate (and if possible improve on) Python 2.x\'s sorting behaviour in 3.x, so that mutually orderable types like int, float etc.

相关标签:

10条回答

甜味超标

2020-11-27 07:19
One way for Python 3.2+ is to use functools.cmp_to_key(). With this you can quickly implement a solution that tries to compare the values and then falls back on comparing the string representation of the types. You can also avoid an error being raised when comparing unordered types and leave the order as in the original case:
```
from functools import cmp_to_key

def cmp(a,b):
    try:
        return (a > b) - (a < b)
    except TypeError:
        s1, s2 = type(a).__name__, type(b).__name__
        return (s1 > s2) - (s1 < s2)
```
Examples (input lists taken from Martijn Pieters's answer):
```
sorted([0, 'one', 2.3, 'four', -5], key=cmp_to_key(cmp))
# [-5, 0, 2.3, 'four', 'one']
sorted([0, 123.4, 5, -6, 7.89], key=cmp_to_key(cmp))
# [-6, 0, 5, 7.89, 123.4]
sorted([{1:2}, {3:4}], key=cmp_to_key(cmp))
# [{1: 2}, {3: 4}]
sorted([{1:2}, None, {3:4}], key=cmp_to_key(cmp))
# [None, {1: 2}, {3: 4}]
```
This has the disadvantage that the three-way compare is always conducted, increasing the time complexity. However, the solution is low overhead, short, clean and I think cmp_to_key() was developed for this kind of Python 2 emulation use case.
0 讨论(0)
发布评论:

提交评论
- 加载中...

耶瑟儿～

2020-11-27 07:19

I'd like to recommend starting this sort of task (like imitation of another system's behaviour very close to this one) with detailed clarifying of the target system. How should it work with different corner cases. One of the best ways to do it - write a bunch of tests to ensure correct behaviour. Having such tests gives:

Better understandind which elements should precede which
Basic documenting
Makes system robust against some refactoring and adding functionality. For example if one more rule is added - how to get sure previous are not gets broken?

One can write such test cases:

sort2_test.py

import unittest
from sort2 import sorted2


class TestSortNumbers(unittest.TestCase):
    """
    Verifies numbers are get sorted correctly.
    """

    def test_sort_empty(self):
        self.assertEqual(sorted2([]), [])

    def test_sort_one_element_int(self):
        self.assertEqual(sorted2([1]), [1])

    def test_sort_one_element_real(self):
        self.assertEqual(sorted2([1.0]), [1.0])

    def test_ints(self):
        self.assertEqual(sorted2([1, 2]), [1, 2])

    def test_ints_reverse(self):
        self.assertEqual(sorted2([2, 1]), [1, 2])


class TestSortStrings(unittest.TestCase):
    """
    Verifies numbers are get sorted correctly.
    """

    def test_sort_one_element_str(self):
        self.assertEqual(sorted2(["1.0"]), ["1.0"])


class TestSortIntString(unittest.TestCase):
    """
    Verifies numbers and strings are get sorted correctly.
    """

    def test_string_after_int(self):
        self.assertEqual(sorted2([1, "1"]), [1, "1"])
        self.assertEqual(sorted2([0, "1"]), [0, "1"])
        self.assertEqual(sorted2([-1, "1"]), [-1, "1"])
        self.assertEqual(sorted2(["1", 1]), [1, "1"])
        self.assertEqual(sorted2(["0", 1]), [1, "0"])
        self.assertEqual(sorted2(["-1", 1]), [1, "-1"])


class TestSortIntDict(unittest.TestCase):
    """
    Verifies numbers and dict are get sorted correctly.
    """

    def test_string_after_int(self):
        self.assertEqual(sorted2([1, {1: 2}]), [1, {1: 2}])
        self.assertEqual(sorted2([0, {1: 2}]), [0, {1: 2}])
        self.assertEqual(sorted2([-1, {1: 2}]), [-1, {1: 2}])
        self.assertEqual(sorted2([{1: 2}, 1]), [1, {1: 2}])
        self.assertEqual(sorted2([{1: 2}, 1]), [1, {1: 2}])
        self.assertEqual(sorted2([{1: 2}, 1]), [1, {1: 2}])

Next one may have such sorting function:

sort2.py

from numbers import Real
from decimal import Decimal
from itertools import tee, filterfalse


def sorted2(iterable):
    """

    :param iterable: An iterable (array or alike)
        entity which elements should be sorted.
    :return: List with sorted elements.
    """
    def predicate(x):
        return isinstance(x, (Real, Decimal))

    t1, t2 = tee(iterable)
    numbers = filter(predicate, t1)
    non_numbers = filterfalse(predicate, t2)
    sorted_numbers = sorted(numbers)
    sorted_non_numbers = sorted(non_numbers, key=str)
    return sorted_numbers + sorted_non_numbers

Usage is quite simple and is documented in tests:

>>> from sort2 import sorted2
>>> sorted2([1,2,3, "aaa", {3:5}, [1,2,34], {-8:15}])
[1, 2, 3, [1, 2, 34], 'aaa', {-8: 15}, {3: 5}]

0 讨论(0)

孤独总比滥情好

2020-11-27 07:20
Stupid idea: make a first pass to divide all the different items in groups that can be compared between each other, sort the individual groups and finally concatenate them. I assume that an item is comparable to all members of a group, if it is comparable with the first member of a group. Something like this (Python3):
```
import itertools

def python2sort(x):
    it = iter(x)
    groups = [[next(it)]]
    for item in it:
        for group in groups:
            try:
                item < group[0]  # exception if not comparable
                group.append(item)
                break
            except TypeError:
                continue
        else:  # did not break, make new group
            groups.append([item])
    print(groups)  # for debugging
    return itertools.chain.from_iterable(sorted(group) for group in groups)
```
This will have quadratic running time in the pathetic case that none of the items are comparable, but I guess the only way to know that for sure is to check all possible combinations. See the quadratic behavior as a deserved punishment for anyone trying to sort a long list of unsortable items, like complex numbers. In a more common case of a mix of some strings and some integers, the speed should be similar to the speed of a normal sort. Quick test:
```
In [19]: x = [0, 'one', 2.3, 'four', -5, 1j, 2j,  -5.5, 13 , 15.3, 'aa', 'zz']

In [20]: list(python2sort(x))
[[0, 2.3, -5, -5.5, 13, 15.3], ['one', 'four', 'aa', 'zz'], [1j], [2j]]
Out[20]: [-5.5, -5, 0, 2.3, 13, 15.3, 'aa', 'four', 'one', 'zz', 1j, 2j]
```
It seems to be a 'stable sort' as well, since the groups are formed in the order the incomparable items are encountered.
0 讨论(0)
发布评论:

提交评论
- 加载中...

一向

2020-11-27 07:21

We can solve this problem in the following way.

Group by type.
Find which types are comparable by attempting to compare a single representative of each type.
Merge groups of comparable types.
Sort merged groups, if possible.
yield from (sorted) merged groups

We can get a deterministic and orderable key function from types by using repr(type(x)). Note that the 'type hierarchy' here is determined by the repr of the types themselves. A flaw in this method is that if two types have identical __repr__ (the types themselves, not the instances), you will 'confuse' types. This can be solved by using a key function that returns a tuple (repr(type), id(type)), but I have not implemented that in this solution.

The advantage of my method over Bas Swinkel's is a cleaner handling of a group of un-orderable elements. We do not have quadratic behavior; instead, the function gives up after the first attempted ordering during sorted()).

My method functions worst in the scenario where there are an extremely large number of different types in the iterable. This is a rare scenario, but I suppose it could come up.

def py2sort(iterable):
        by_type_repr = lambda x: repr(type(x))
        iterable = sorted(iterable, key = by_type_repr)
        types = {type_: list(group) for type_, group in groupby(iterable, by_type_repr)}

        def merge_compatible_types(types):
            representatives = [(type_, items[0]) for (type_, items) in types.items()]

            def mergable_types():
                for i, (type_0, elem_0) in enumerate(representatives, 1):
                    for type_1, elem_1 in representatives[i:]:
                         if _comparable(elem_0, elem_1):
                             yield type_0, type_1

            def merge_types(a, b):
                try:
                    types[a].extend(types[b])
                    del types[b]
                except KeyError:
                    pass # already merged

            for a, b in mergable_types():
                merge_types(a, b)
            return types

        def gen_from_sorted_comparable_groups(types):
            for _, items in types.items():
                try:
                    items = sorted(items)
                except TypeError:
                    pass #unorderable type
                yield from items
        types = merge_compatible_types(types)
        return list(gen_from_sorted_comparable_groups(types))

    def _comparable(x, y):
        try:
            x < y
        except TypeError:
            return False
        else:
            return True

    if __name__ == '__main__':    
        print('before py2sort:')
        test = [2, -11.6, 3, 5.0, (1, '5', 3), (object, object()), complex(2, 3), [list, tuple], Fraction(11, 2), '2', type, str, 'foo', object(), 'bar']    
        print(test)
        print('after py2sort:')
        print(py2sort(test))

0 讨论(0)

攒了一身酷

2020-11-27 07:24

Here is one method of accomplishing this:

lst = [0, 'one', 2.3, 'four', -5]
a=[x for x in lst if type(x) == type(1) or type(x) == type(1.1)] 
b=[y for y in lst if type(y) == type('string')]
a.sort()
b.sort()
c = a+b
print(c)

0 讨论(0)

时光说笑

2020-11-27 07:27

@martijn-pieters I don't know if list in python2 also has a __cmp__ to handle comparing list objects or how it was handled in python2.

Anyway, in addition to the @martijn-pieters's answer, I used the following list comparator, so at least it doesn't give different sorted output based on different order of elements in the same input set.

@per_type_cmp(list) def list_cmp(a, b): for a_item, b_item in zip(a, b): if a_item == b_item: continue return python2_sort_key(a_item) < python2_sort_key(b_item) return len(a) < len(b)

So, joining it with original answer by Martijn:

from numbers import Number


# decorator for type to function mapping special cases
def per_type_cmp(type_):
    try:
        mapping = per_type_cmp.mapping
    except AttributeError:
        mapping = per_type_cmp.mapping = {}
    def decorator(cmpfunc):
        mapping[type_] = cmpfunc
        return cmpfunc
    return decorator


class python2_sort_key(object):
    _unhandled_types = {complex}

    def __init__(self, ob):
       self._ob = ob

    def __lt__(self, other):
        _unhandled_types = self._unhandled_types
        self, other = self._ob, other._ob  # we don't care about the wrapper

        # default_3way_compare is used only if direct comparison failed
        try:
            return self < other
        except TypeError:
            pass

        # hooks to implement special casing for types, dict in Py2 has
        # a dedicated __cmp__ method that is gone in Py3 for example.
        for type_, special_cmp in per_type_cmp.mapping.items():
            if isinstance(self, type_) and isinstance(other, type_):
                return special_cmp(self, other)

        # explicitly raise again for types that won't sort in Python 2 either
        if type(self) in _unhandled_types:
            raise TypeError('no ordering relation is defined for {}'.format(
                type(self).__name__))
        if type(other) in _unhandled_types:
            raise TypeError('no ordering relation is defined for {}'.format(
                type(other).__name__))

        # default_3way_compare from Python 2 as Python code
        # same type but no ordering defined, go by id
        if type(self) is type(other):
            return id(self) < id(other)

        # None always comes first
        if self is None:
            return True
        if other is None:
            return False

        # Sort by typename, but numbers are sorted before other types
        self_tname = '' if isinstance(self, Number) else type(self).__name__
        other_tname = '' if isinstance(other, Number) else type(other).__name__

        if self_tname != other_tname:
            return self_tname < other_tname

        # same typename, or both numbers, but different type objects, order
        # by the id of the type object
        return id(type(self)) < id(type(other))


@per_type_cmp(dict)
def dict_cmp(a, b, _s=object()):
    if len(a) != len(b):
        return len(a) < len(b)
    adiff = min((k for k in a if a[k] != b.get(k, _s)), key=python2_sort_key, default=_s)
    if adiff is _s:
        # All keys in a have a matching value in b, so the dicts are equal
        return False
    bdiff = min((k for k in b if b[k] != a.get(k, _s)), key=python2_sort_key)
    if adiff != bdiff:
        return python2_sort_key(adiff) < python2_sort_key(bdiff)
    return python2_sort_key(a[adiff]) < python2_sort_key(b[bdiff])

@per_type_cmp(list)
def list_cmp(a, b):
    for a_item, b_item in zip(a, b):
        if a_item == b_item:
            continue
        return python2_sort_key(a_item) < python2_sort_key(b_item)
    return len(a) < len(b)

PS: It makes more sense to create it as an comment but I didn't have enough reputation to make a comment. So, I'm creating it as an answer instead.

0 讨论(0)

1 2 下一页