Python: Why does the int class not have rich comparison operators like `__lt__()`?

放肆的年华 submitted on 2019-12-04 22:29:38
Dave Webb

If we look at PEP 207 (Rich Comparisons), there is this interesting sentence right at the end:

The inlining already present which deals with integer comparisons would still apply, resulting in no performance cost for the most common cases.

So it seems that in 2.x there is an optimisation for integer comparison. If we take a look at the source code we can find this:

case COMPARE_OP:
    w = POP();
    v = TOP();
    if (PyInt_CheckExact(w) && PyInt_CheckExact(v)) {
        /* INLINE: cmp(int, int) */
        register long a, b;
        register int res;
        a = PyInt_AS_LONG(v);
        b = PyInt_AS_LONG(w);
        switch (oparg) {
        case PyCmp_LT: res = a <  b; break;
        case PyCmp_LE: res = a <= b; break;
        case PyCmp_EQ: res = a == b; break;
        case PyCmp_NE: res = a != b; break;
        case PyCmp_GT: res = a >  b; break;
        case PyCmp_GE: res = a >= b; break;
        case PyCmp_IS: res = v == w; break;
        case PyCmp_IS_NOT: res = v != w; break;
        default: goto slow_compare;
        }
        x = res ? Py_True : Py_False;
        Py_INCREF(x);
    }
    else {
      slow_compare:
        x = cmp_outcome(oparg, v, w);
    }

So it seems that in 2.x there was an existing performance optimisation - by allowing the C code to compare integers directly - which would not have been preserved if the rich comparison operators had been implemented.

In Python 3, __cmp__ is no longer supported, so the rich comparison operators must be there. As far as I can tell, this does not cause a performance hit. For example, compare:

Python 2.7.1 (r271:86832, Jun 16 2011, 16:59:05) 
[GCC 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2335.15.00)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import timeit
>>> timeit.timeit("2 < 1")
0.06980299949645996

to:

Python 3.2.3 (v3.2.3:3d0686d90f55, Apr 10 2012, 11:25:50) 
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import timeit
>>> timeit.timeit("2 < 1")
0.06682920455932617

So it seems that similar optimisations are there, but my guess is that the judgement call was that putting them all into the 2.x branch would have been too great a change when backwards compatibility was a consideration.
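If you want to repeat the measurement on your own interpreters, something like the sketch below should do. The helper name time_comparison is made up for illustration, and it times "a < b" with the operands bound in the setup string rather than the literal "2 < 1" from the transcripts, on the assumption that this avoids any constant folding the compiler might apply. The absolute numbers will differ from the ones above.

```python
import sys
import timeit


def time_comparison(stmt="a < b", setup="a, b = 2, 1", number=1000000):
    """Hypothetical helper: seconds taken to run `stmt` `number` times."""
    return timeit.timeit(stmt, setup=setup, number=number)


# Run the same script under each interpreter and compare the figures.
elapsed = time_comparison()
print(sys.version_info[:2], elapsed)
```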

In 2.x if you want something like the rich comparison methods you can get at them via the operator module:

>>> import operator
>>> operator.gt(2,1)
True
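For completeness, here is a short sketch of the full set of operator-module equivalents. The has_rich check at the end is only there to show the difference the question is about: it should be True on Python 3, whereas on Python 2 int does not define __lt__.

```python
import operator

# Function-call spellings of the six comparison operators.
results = [
    operator.lt(1, 2),  # 1 <  2
    operator.le(2, 2),  # 2 <= 2
    operator.eq(2, 1),  # 2 == 1
    operator.ne(2, 1),  # 2 != 1
    operator.gt(2, 1),  # 2 >  1
    operator.ge(1, 2),  # 1 >= 2
]
print(results)  # [True, True, False, True, True, False]

# On Python 3, int exposes the rich comparison methods directly.
has_rich = hasattr(int, "__lt__")
print(has_rich)
```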

__cmp__() is the old-fashioned way of doing comparisons, and is deprecated in favor of the rich operators (__lt__, __le__ etc.) which were only introduced in Python 2.1. Likely the transition was not complete as of 2.7.x -- whereas in Python 3.x __cmp__ is completely removed.

Haskell has the most elegant implementation I've seen: to make a type an instance of Ord (ordered), you only need to define == (via Eq) and either <= or compare, and the typeclass itself supplies default implementations of the remaining operators in terms of those (which you're more than welcome to define yourself if you want). You can write such a class yourself in Python; I'm not sure why that's not the default, probably for performance reasons.
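In Python, the standard library's functools.total_ordering class decorator (available since 2.7 and 3.2) gives roughly the same convenience: define __eq__ plus one ordering method and it fills in the rest. A minimal sketch, where the Version class is a made-up example and not something from the question:

```python
from functools import total_ordering


@total_ordering
class Version(object):
    """Hypothetical example class, ordered by its (major, minor) pair."""

    def __init__(self, major, minor):
        self.key = (major, minor)

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self.key == other.key

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self.key < other.key

    def __hash__(self):
        # Defining __eq__ drops the default hash on Python 3, so restore one.
        return hash(self.key)


# <=, > and >= are supplied by the decorator from __eq__ and __lt__.
print(Version(1, 2) <= Version(1, 3))  # True
print(Version(2, 0) > Version(1, 9))   # True
print(Version(1, 2) >= Version(1, 2))  # True
```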

As hircus said, the __cmp__ style comparisons are deprecated in favor of the rich operators (__lt__, …) and are gone entirely in Python 3. Originally, comparisons were implemented using __cmp__, but there are some types/situations where a simple __cmp__ operator isn't enough (e.g. instances of a Color class could support == and !=, but not < or >). So the rich comparison operators were added, leaving __cmp__ in place for backwards compatibility. Following the Python philosophy of "There should be one-- and preferably only one --obvious way to do it" [1], the legacy support was removed in Python 3, when backwards compatibility could be sacrificed.
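To make the Color case concrete, here is a rough sketch of such a class. The attribute names are invented for illustration. It defines only equality, so on Python 3 an ordering comparison raises TypeError; on Python 2 the same expression would silently fall back to the default ordering instead.

```python
class Color(object):
    """Hypothetical class that supports == and != but has no ordering."""

    def __init__(self, r, g, b):
        self.rgb = (r, g, b)

    def __eq__(self, other):
        if not isinstance(other, Color):
            return NotImplemented
        return self.rgb == other.rgb

    def __ne__(self, other):
        # Needed on Python 2; Python 3 would derive this from __eq__.
        result = self.__eq__(other)
        return result if result is NotImplemented else not result

    def __hash__(self):
        return hash(self.rgb)


red = Color(255, 0, 0)
blue = Color(0, 0, 255)
print(red == Color(255, 0, 0))  # True
print(red != blue)              # True

try:
    red < blue
    ordered = True
except TypeError:
    # Python 3 takes this branch: no __lt__/__gt__ is defined for Color.
    ordered = False
print(ordered)
```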

In Python 2, int still uses __cmp__ so as not to break backwards compatibility. float, however, needs the newer rich comparison operators, because not every floating point number is less than, greater than, or equal to every other one: (float('nan') < 0.0, float('nan') == 0.0, float('nan') > 0.0) evaluates to (False, False, False), so what should float('nan').__cmp__(0.0) return?
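The NaN problem is easy to see in code. The three_way function below is a made-up stand-in for a __cmp__-style comparison, not anything from the standard library. Because it can only answer -1, 0 or 1, it ends up reporting NaN as "equal" to 0.0, which is exactly the answer the rich operators refuse to give.

```python
nan = float("nan")

# All three rich comparisons against NaN are False.
triple = (nan < 0.0, nan == 0.0, nan > 0.0)
print(triple)  # (False, False, False)


def three_way(a, b):
    """Hypothetical cmp-style helper: returns -1, 0 or 1."""
    return (a > b) - (a < b)


print(three_way(1.0, 2.0))  # -1
print(three_way(nan, 0.0))  # 0, which wrongly suggests nan == 0.0
```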

1: Try typing "import this" into a Python shell.
