Original source: http://www.behnel.de/cython200910/talk.html (the original text follows)
About myself
Passionate Python developer since 2002
after Basic, Logo, Pascal, Prolog, Scheme, Java, C, ...
CS studies in Germany, Ireland, France
PhD in distributed systems in 2007
Language design for self-organising systems
Darmstadt University of Technology, Germany
Current occupations:
IT transformations, SOA design, Java-Development, ...
Employed by Senacor Technologies AG, Germany
»lxml« OpenSource XML toolkit for Python
»Cython«
Part 1: Intro to Cython
Part 1: Intro to Cython
Part 2: Building Cython modules
Part 3: Writing fast code
Part 4: Talking to other extensions
What is Cython?
Cython is the missing link
between the simplicity of Python
and the speed of C / C++ / Fortran.
What is Cython?
Cython is
an Open-Source project
a Python compiler (almost)
an enhanced, optimising fork of Pyrex
an extended Python language for
writing fast Python extension modules
interfacing Python with C libraries
Major Cython Core Developers
Robert Bradshaw, Stefan Behnel, Dag Sverre Seljebotn
lead developers
Lisandro Dalcín
C/C++ portability and various feature patches
Kurt Smith, Danilo Freitas
Google Summer of Code 2009: Fortran/C++ integration
Greg Ewing
main developer and maintainer of Pyrex
many, many others - see
the mailing list archives of Cython and Pyrex
How to use Cython
you write Python code
Cython translates it into C code
your C compiler builds a shared library for CPython
you import your module into CPython
Cython has support for
distutils
  optionally compile Python code from setup.py!
  Cython does that for its own modules :-)
embedding the CPython runtime in an executable
Example: compiling Python code
# file: worker.py
class HardWorker(object):
    u"Almost Sisyphos"
    def __init__(self, task):
        self.task = task
    def work_hard(self, repeat=100):
        for i in range(repeat):
            self.task()

def add_simple_stuff():
    x = 1+1

HardWorker(add_simple_stuff).work_hard()
Example: compiling Python code
compile with
$ cython worker.py
translates to ~1500 line .c file (Cython 0.11.3)
lots of portability #define's
  different C compilers, Python versions, ...
tons of helpful C comments with Python code snippets
  helps tracing your own code in generated sources
a lot of code that you don't want to write yourself
Portable Code
Cython compiler generates C code that compiles
with all major compilers (C and C++)
on all major platforms
in Python 2.3 through 3.1
Cython language syntax follows Python 2.6
optional Python 3 syntax support is on TODO list
  get involved to get it quicker!
... the fastest way to port Python 2 code to Py3 ;-)
Python language feature support
most of Python 2 syntax is supported
top-level classes and functions
control structures: loops, with, try-except/finally, ...
object operations, arithmetic, ...
plus many Py3 features:
list/set/dict comprehensions
keyword-only arguments
extended iterable unpacking (a,b,*c,d = some_list)
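The three Python 3 features listed above can be seen together in a short plain-Python sketch. It uses Python 3 syntax, and the names `scale`, `factor` and `some_list` are illustrative only, not taken from the talk:

```python
# Illustrative names only; runs under plain Python 3.
some_list = [1, 2, 3, 4, 5]

# list / set / dict comprehensions
squares = [x * x for x in some_list]
parities = {x % 2 for x in some_list}
index_of = {x: i for i, x in enumerate(some_list)}

# keyword-only argument: 'factor' must be passed by name
def scale(value, *, factor=2):
    return value * factor

# extended iterable unpacking: 'c' collects the middle items
a, b, *c, d = some_list
```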
Python features in progress
Inner functions with closures
def factory(a,b):
    def closure_function(c):
        return a+b+c
    return closure_function
status: (hopefully) to be merged for 0.12
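For reference, this is how the closure above behaves in plain Python, which is the behaviour the compiled version is meant to reproduce. The names `add_three` and `result` are illustrative:

```python
def factory(a, b):
    def closure_function(c):
        # a and b are captured from the enclosing factory() call
        return a + b + c
    return closure_function

add_three = factory(1, 2)   # captures a=1, b=2
result = add_three(3)
```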
Planned Cython features
improved C++ integration (GSoC 2009)
e.g. function/operator overloading support
status: mostly there, to be finished and integrated
improved Fortran integration (GSoC 2009)
talking to Fortran code directly
status: mostly there, to be finished and integrated
native array data type with SIMD behaviour
status: large interest, implementation pending
... as usual: great ideas, little time
Currently unsupported
local/inner classes (~open)
lambda expressions (~easy)
generators (~needs work)
generator expressions (~easy)
with obvious optimisations, e.g.
set( x.a for x in some_list ) == { x.a for x in some_list }
... all certainly on the TODO list for 1.0.
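The equivalence behind that "obvious optimisation" can be checked in plain Python. This is a small sketch with a hypothetical `Item` class standing in for whatever objects `some_list` holds:

```python
class Item:
    # hypothetical element type with an attribute 'a'
    def __init__(self, a):
        self.a = a

some_list = [Item(1), Item(2), Item(2), Item(3)]

# generator expression passed to set() ...
via_genexpr = set(x.a for x in some_list)
# ... builds the same set as a set comprehension
via_setcomp = {x.a for x in some_list}
```

Because both forms produce the same result, a compiler is free to translate the first into the second without creating a generator at all.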
Speed
Cython generates very efficient C code:
PyBench: most benchmarks run 20-80% faster
conditions and loops run 5-8x faster than in Py2.6.2
overall about 30% faster for plain Python benchmark
obviously, real applications are different
PyPy's richards.py benchmark:
heavily class based scheduler
20% faster than CPython 2.6.2
Type declarations
Cython supports optional type declarations that
can be employed exactly where performance matters
let Cython generate plain C instead of C-API calls
make richards.py benchmark 5x faster than CPython
without Python code modifications :)
can make code 100 - 1000x faster than CPython
expect several 100 times in calculation loops
Part 2: Building Cython modules
Ways to build Cython code
To compile Python code (.py) or Cython code (.pyx)
you need:
Cython, Python and a C compiler
you can use:
distutils
  setup.py script (likely required anyway)
pyximport
  on-the-fly build + import (for experiments)
Sage notebook
  web app that supports writing and running Cython code
cython source.pyx + manual C compilation
Example: distutils
A minimal setup.py script:
from distutils.core import setup
from distutils.extension import Extension
from Cython.Distutils import build_ext

ext_modules = [Extension("worker", ["worker.py"])]

setup(
    name = 'stupid little app',
    cmdclass = {'build_ext': build_ext},
    ext_modules = ext_modules
)
Run with
$ python setup.py build_ext --inplace
Example: pyximport
Build and import Cython code files (.pyx) on the fly
$ ls
worker.pyx
$ PYTHONPATH=. python
Python 2.6.2 (r262:71600, Apr 17 2009, 11:29:30)
[GCC 4.3.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyximport
>>> pyximport.install()
>>> import worker
>>> worker
<module 'worker' from '~/.pyxbld/.../worker.so'>
>>> worker.HardWorker
<class 'worker.HardWorker'>
>>> worker.HardWorker(worker.add_simple_stuff).work_hard()
pyximporting Python modules
pyximport can also compile Python modules:
>>> import pyximport
>>> pyximport.install(pyimport = True)
>>> import shlex
[lots of compiler errors from different modules ...]
>>> help(shlex)
currently works for a few stdlib modules
falls back to normal Python import automatically
not production ready, but nice for testing :)
Writing executable programs
# file: hw.py
def hello_world():
    import sys
    print "Welcome to Python %d.%d!" % sys.version_info[:2]

if __name__ == '__main__':
    hello_world()
Compile, link and run:
$ cython --embed hw.py   # <- embed a main() function
$ gcc $CFLAGS -I/usr/include/python2.6 \
      -o hw hw.c -lpython2.6 -lpthread -lm -lutil -ldl
$ ./hw
Welcome to Python 2.6!
Part 3: Writing fast code
A simple example
Plain Python code:
# integrate_py.py
from math import sin

def f(x):
    return sin(x**2)

def integrate_f(a, b, N):
    dx = (b-a)/N
    s = 0
    for i in range(N):
        s += f(a+i*dx)
    return s * dx
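A usage sketch for the function above, written for Python 3. The call arguments and the name `approx` are illustrative. Note that under Python 2, `(b-a)/N` with integer arguments would be truncated to an integer, so floats are passed here to keep the example unambiguous:

```python
from math import sin

def f(x):
    return sin(x**2)

def integrate_f(a, b, N):
    # left Riemann sum of f over [a, b] with N equal steps
    dx = (b - a) / N
    s = 0
    for i in range(N):
        s += f(a + i * dx)
    return s * dx

# integrate sin(x**2) from 0 to 1 using 1000 steps
approx = integrate_f(0.0, 1.0, 1000)
```

The result should land close to 0.31, the value of the integral of sin(x²) over [0, 1], with the usual first-order error of a left Riemann sum.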
Type declarations in Cython
Function arguments are easy
Python:
def f(x): return sin(x**2)
Cython:
def f(double x): return sin(x**2)
Type declarations in Cython
»cdef« keyword declares
variables with C or builtin types
cdef double dx, s
functions with C signatures
cdef double f(double x): return sin(x**2)
classes as 'builtin' extension types
cdef class MyType:
    cdef int field
Functions: def vs. cdef vs. cpdef
def func(int x):
part of the Python module API
Python call semantics
cdef int func(int x):
C signature
C call semantics
cpdef int func(int x):
Python wrapper around cdef function
C calls cdef function, Python calls wrapper
note: modified C signature!
Typed arguments and return values
def func(int x):
caller passes Python objects for x
function converts to int on entry
implicit return type always object
cdef int func(int x):
caller converts arguments as required
function receives C int for x
arbitrary return type, defaults to object
cpdef int func(int x):
C callers convert arguments as required
Python callers pass and receive objects
  wrapper converts
A simple example: Python
# integrate_py.py
from math import sin

def f(x):
    return sin(x**2)

def integrate_f(a, b, N):
    dx = (b-a)/N
    s = 0
    for i in range(N):
        s += f(a+i*dx)
    return s * dx
A simple example: Cython
# integrate_cy.pyx
cdef extern from "math.h":
    double sin(double x)

cdef double f(double x):
    return sin(x**2)

cpdef double integrate_f(double a, double b, int N):
    cdef double dx, s
    cdef int i
    dx = (b-a)/N
    s = 0
    for i in range(N):
        s += f(a+i*dx)
    return s * dx
Overriding declarations in .pxd
Python: integrate_py.py, Cython: integrate_py.pxd
The .pxd file used
# integrate_py.pxd
cimport cython

cpdef double f(double x)

cpdef double integrate_f(double a, double b, int N)
Overriding declarations in .pxd
advantage:
plain Python code
  runs unchanged in Python interpreter
complete Python tool-chain available
  Eclipse, pylint, 2to3, ...
drawback:
no access to C functions
  cannot override from math import sin
Typing in Python syntax
from math import sin
import cython

@cython.locals(x=cython.double)
def f(x):
    return sin(x**2)

@cython.locals(a=cython.double, b=cython.double,
               N=cython.Py_ssize_t, dx=cython.double,
               s=cython.double, i=cython.Py_ssize_t)
def integrate_f(a, b, N):
    dx = (b-a)/N
    s = 0
    for i in range(N):
        s += f(a+i*dx)
    return s * dx
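Code decorated this way still imports the `cython` module at run time. One possible way to keep such a file runnable on machines without Cython is a small fallback, sketched below. The `_CythonShim` class is a hypothetical stand-in written for this example and is not part of Cython; when Cython is installed, its own `cython` module is used instead:

```python
from math import sin

try:
    import cython              # Cython's own module, if installed
except ImportError:
    class _CythonShim:         # hypothetical stand-in, not part of Cython
        double = float
        Py_ssize_t = int

        @staticmethod
        def locals(**declared_types):
            # ignore the type declarations when running uncompiled
            def decorator(func):
                return func
            return decorator

    cython = _CythonShim()

@cython.locals(x=cython.double)
def f(x):
    return sin(x**2)

value = f(0.0)
```

Either way, the decorated function behaves like ordinary Python when it is not compiled; the declarations only take effect once Cython compiles the module.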
Declaring Python types
Access to Python's builtins is heavily optimised
for ... in range()/list/tuple/dict
list.append(), list.reverse()
set([...]), tuple([...])
Further improvements in Cython 0.12
replacements for enumerate(), type()
dict([...]), unicode.encode(), list.sort()
Declaring Python types is often worth it!
Easy to add new optimisations
don't write prematurely optimised code, fix Cython!
Declaring Python types: dict
example: dict iteration
def filter_a(d):
    return { key : value
             for key, value in d.iteritems()
             if 'a' not in value }

import string
d = { s:s for s in string.ascii_letters }
print filter_a(d)
Declaring Python types: dict
simple change, ~30% faster:
def filter_a(dict d):    # <====
    return { key : value
             for key, value in d.iteritems()
             if 'a' not in value }

import string
d = { s:s for s in string.ascii_letters }
print filter_a(d)
drawback:
non-dict mapping arguments raise a TypeError
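The drawback can be illustrated in plain Python 3. The sketch below only emulates the effect of the `dict d` declaration with an `isinstance` check; it is not how Cython implements the check, and `filter_a_dict_only` and `ReadOnlyMapping` are hypothetical names invented for the illustration:

```python
from collections.abc import Mapping

def filter_a(d):
    # untyped: any mapping that provides .items() works (duck typing)
    return {key: value for key, value in d.items() if 'a' not in value}

def filter_a_dict_only(d):
    # rough emulation of a 'dict d' declaration, not Cython's actual check
    if not isinstance(d, dict):
        raise TypeError("expected dict, got %s" % type(d).__name__)
    return filter_a(d)

class ReadOnlyMapping(Mapping):
    # hypothetical non-dict mapping used only for this illustration
    def __init__(self, data):
        self._data = dict(data)
    def __getitem__(self, key):
        return self._data[key]
    def __iter__(self):
        return iter(self._data)
    def __len__(self):
        return len(self._data)

m = ReadOnlyMapping({'a': 'a', 'b': 'b'})
untyped_result = filter_a(m)      # accepted by the untyped version

try:
    filter_a_dict_only(m)         # rejected by the "typed" version
    rejected = False
except TypeError:
    rejected = True
```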
Think twice before you type
benchmark code before adding static types!
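A minimal way to follow that advice is the standard library's `timeit` module. The sketch below is Python 3 and times the dict example from the earlier slide; the repeat count of 1000 is an arbitrary choice for illustration:

```python
import string
import timeit

def filter_a(d):
    return {key: value for key, value in d.items() if 'a' not in value}

d = {s: s for s in string.ascii_letters}

# total seconds for 1000 calls; record this before and after adding types
elapsed = timeit.timeit(lambda: filter_a(d), number=1000)
```

Running the same measurement against the compiled module, before and after a type declaration is added, shows whether the declaration actually pays off.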
Classes
class MyClass(object):
Python class with __dict__
multiple inheritance
arbitrary Python attributes
Python methods
monkey-patchable etc.
cdef class MyClass(SomeSuperClass):
"builtin" extension type
single inheritance
  only from other extension types!
fixed, typed fields
  C-only access by default, or readonly/public
Python + C methods
cdef classes - when to use them?
Use cdef classes
when C attribute types are used
  e.g. whenever wrapping C structs/pointers/etc.
when the need for speed beats Python's generality
Use Python classes
for bytes/tuple subtypes (PyVarObject)
for exceptions if Py<2.5 compatibility is required
when multiple inheritance is required
when users are allowed to monkey-patch
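The monkey-patching point can be shown in plain Python by comparing a Python class with one of the interpreter's own built-in extension types. The talk describes cdef classes as "builtin" extension types, so they can be expected to behave like `list` does here, although that is an inference rather than something this sketch demonstrates. `Greeter` and `shout` are hypothetical names:

```python
class Greeter(object):
    def greet(self):
        return "hello"

def shout(self):
    return "HELLO"

# Python classes can be patched at runtime
Greeter.greet = shout
patched = Greeter().greet()

# built-in extension types refuse new attributes
try:
    list.greet = shout
    builtin_patchable = True
except TypeError:
    builtin_patchable = False
```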
Part 4: Talking to other extensions
Talking to other extensions
Python 3 buffer protocol (available in Py2.6)
external C-APIs
Python 3 buffer protocol
Native support for new Python buffer protocol
PEP 3118
def inplace_invert_2D_buffer(
        object[unsigned char, 2] image):
    cdef int i, j
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            image[i, j] = 255 - image[i, j]
can be supported for extension types in Py2.x
declared through .pxd files
Cython ships with numpy.pxd
array.pxd available (stdlib's array)
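The same buffer protocol is reachable from plain Python through `memoryview`. The sketch below mirrors the inversion loop above without any Cython typing, so it shows the access pattern rather than the speed. It assumes Python 3.5 or later (for tuple indexing on memoryviews), and the names `inplace_invert_2d`, `pixels` and `view` are illustrative:

```python
def inplace_invert_2d(image):
    # image: a writable 2-D memoryview of unsigned bytes (format 'B')
    rows, cols = image.shape
    for i in range(rows):
        for j in range(cols):
            image[i, j] = 255 - image[i, j]

pixels = bytearray([0, 1, 2, 253, 254, 255])

# view the six bytes as a 2 x 3 image without copying them
view = memoryview(pixels).cast('B', shape=[2, 3])
inplace_invert_2d(view)
```

Since the memoryview shares memory with `pixels`, the bytearray itself is modified in place.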
Conclusion
Cython is a tool for
translating Python code to efficient C
easily interfacing to external C/C++/Fortran code
Use it to
speed up existing Python modules
  concentrate on optimisations, not rewrites!
write C extensions for CPython
  don't change the language just to get fast code!
wrap C libraries in Python
  concentrate on the mapping, not the glue!
... but Cython is also
a great project
a very open playground for great ideas!
Cython
C-Extensions in Python
... use it, and join the project!
Source: oschina
Link: https://my.oschina.net/u/583625/blog/468700