In Python, why is a module implemented in C faster than a pure Python module, and how do I write one?

野性不改 2021-02-14 09:54

The Python documentation states that the reason cPickle is faster than pickle is that the former is implemented in C. What does that mean exactly?

I am making a module

4 Answers
  • 2021-02-14 10:02

    When you write a function in Python, a new function object is created, and the function's code is parsed and compiled to bytecode, which is saved in the "func_code" attribute ("__code__" in Python 3). When you call that function, the interpreter reads its bytecode and executes it.

    If you write the same function in C, following the Python/C API to make it available in Python, the interpreter still creates a function object, but that object has no bytecode. When the interpreter encounters a call to that function, it calls the compiled C function directly, so it executes at "machine" speed and not at "Python virtual machine" speed.

    You can verify this by inspecting functions written in C (a Python 2 session):

    >>> map.func_code
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: 'builtin_function_or_method' object has no attribute 'func_code'
    >>> def mymap():pass
    ... 
    >>> mymap.func_code
    <code object mymap at 0xcfb5b0, file "<stdin>", line 1>
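
    The session above is from Python 2. A minimal sketch of the same check for Python 3 follows, assuming CPython (where the built-in len() is implemented in C). The helper py_len is a hypothetical name invented for this illustration:

```python
import dis
import types


def py_len(seq):
    """Pure-Python stand-in for len(); a hypothetical helper for this demo."""
    count = 0
    for _ in seq:
        count += 1
    return count


# A function defined with `def` carries a code object that holds its bytecode.
assert isinstance(py_len, types.FunctionType)
assert isinstance(py_len.__code__, types.CodeType)

# In CPython the built-in len() is implemented in C, so it has no code object.
assert isinstance(len, types.BuiltinFunctionType)
assert not hasattr(len, "__code__")

# dis can show the bytecode of the Python version; there is nothing
# comparable to show for len(), because it is compiled machine code.
dis.dis(py_len)
```

    The disassembly lists the individual bytecode instructions the interpreter has to dispatch on every loop iteration of py_len, which is the overhead a C implementation avoids.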
    

    To learn how to write C code for use from Python, follow the guides on the official site.

    Anyway, if you are simply doing N-dimensional array calculations, numpy ought to be sufficient.

  • 2021-02-14 10:12

    As mentioned, numpy is excellent for vector computations. (Could be better still, but the comment that it's better than anything you could write without actually doing work is definitely true.)

    Not everything can be easily vectorized, though, so if you do have tight inner loops with lots of function calls (say, a heavily recursive algorithm), you still have a couple of options. Probably the most popular is Cython, which allows you to write modules and functions in a kind of annotated Python and get C-like speed when you need it.

    Or maybe your time is all dominated by library calls to compute eigenvalues or invert matrices or evaluate special functions or divide really large integers -- many of which the Sage project handles very well, by the way, if what you're doing is more mathematical than pure crunching -- in which case the time spent in Python might not even matter. It all depends on the details of the kind of numerics you're doing.

  • 2021-02-14 10:14

    Besides Pyrex/Cython, already mentioned, you have other alternatives:

    Shed Skin: Translates (a restricted subset of) Python to C++. It can automatically generate an extension module for you. You would build an extension like this (assuming Linux):

    wget http://shedskin.googlecode.com/files/shedskin-0.7.tgz
    tar -xzf shedskin-0.7.tgz
    # On your code folder:
    PYTHONPATH=/path/to/shedskin-0.7 python shedskin -e yourmodule.py
    # The above generates a Makefile and a yourmodule.h/.cpp pair
    make
    # Now you can "import yourmodule" from Python and check it's from the .so by "print yourmodule.__file__"
    

    PyPy: A faster Python, with a JIT compiler. You can simply run your code on it instead of CPython. At the time of writing (PyPy 1.4.1) it only supports Python 2.5, with 2.7 support coming soon. It can give huge speedups on math-heavy code. To install and run it (assuming 32-bit Linux):

    wget http://pypy.org/download/pypy-1.4.1-linux.tar.bz2
    tar -xjf pypy-1.4.1-linux.tar.bz2
    sudo ln -s /path/to/pypy-1.4.1-linux/bin/pypy /usr/local/bin
    # Then, instead of "python yourprogram.py" you'll just run "pypy yourprogram.py"
    

    Weave: Allows you to write C inline, then compiles it.

    Edit: If you want us to run these tools for you and benchmark, just post your code ;)

  • 2021-02-14 10:18

    You can write fast C code and then use it from your Python scripts, so your program runs faster [1]: http://docs.python.org/extending/index.html#extending-index

    An example is NumPy, which is largely written in C ( https://numpy.org/ )

    The typical approach is to implement the bottleneck in C for speed (or to use a library written in C, of course ;) ) and to use Python for the remaining code.
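
    Before moving anything to C, it helps to confirm where the bottleneck actually is. The sketch below is one possible way to do that with the standard-library cProfile module (itself a C implementation of the profile module). The functions slow_sum_of_squares and main are hypothetical stand-ins for real code:

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    """Deliberately slow pure-Python loop: the hypothetical bottleneck."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def main():
    """Cheap glue code around one expensive call."""
    label = "sum of squares"
    return label, slow_sum_of_squares(200_000)


# Profile a single run of main().
profiler = cProfile.Profile()
profiler.enable()
result = main()
profiler.disable()

# Sort by cumulative time and print the most expensive entries.
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(5)
print(report.getvalue())
```

    In a report like this, the function that dominates the cumulative time is the candidate for a C rewrite, while the glue code around it can stay in Python.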

    [1] By the way, this is why cPickle is faster than pickle.
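
    The footnote can be checked on Python 3, where cPickle was folded into pickle as the C accelerator module _pickle. A rough sketch, assuming CPython: pickle.dumps uses the C implementation when it is available, while pickle._dumps is the private pure-Python fallback inside pickle.py (an implementation detail, not a public API). The test data is made up for the illustration:

```python
import pickle
import timeit

# Made-up sample data, just large enough to make the difference visible.
data = [
    {"id": i, "name": f"item-{i}", "values": list(range(10))}
    for i in range(1000)
]

# pickle.dumps: C accelerator (_pickle) when available.
c_time = timeit.timeit(lambda: pickle.dumps(data), number=20)

# pickle._dumps: private pure-Python implementation (CPython detail).
py_time = timeit.timeit(lambda: pickle._dumps(data), number=20)

# Both implementations produce pickles that load back to the same data.
assert pickle.loads(pickle.dumps(data)) == data
assert pickle.loads(pickle._dumps(data)) == data

print(f"C implementation:     {c_time:.4f} s")
print(f"pure-Python fallback: {py_time:.4f} s")
```

    The exact ratio depends on the machine and the Python version, so no timings are quoted here; the point is only that the same work is done either by compiled C code or by interpreted bytecode.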

    Edit:

    Take a look at Pyrex: http://www.cosc.canterbury.ac.nz/greg.ewing/python/Pyrex/version/Doc/About.html

    'Pyrex is a language specially designed for writing Python extension modules. It's designed to bridge the gap between the nice, high-level, easy-to-use world of Python and the messy, low-level world of C. '

    It's not the 'official' way, but it may be useful.
