Getting Started With Cython
Quote about Cython:
Andrew Tipton says "I'm honestly never going back to writing C again. Cython gives me all the expressiveness of Python combined with all the performance and close-to-the-metal-godlike-powers of C. I've been using it to implement high-performance graph traversal and routing algorithms and to interface with C/C++ libraries, and it's been an absolute amazing productivity boost." Yep.

Cython has two major use cases:
- Extending the CPython interpreter with fast compiled modules,
- Interfacing Python code with external C/C++ libraries.
Cython supports type declarations for:
- Changing code from having dynamic Python semantics to having static-and-fast (but less generic) C semantics.
- Directly manipulating C data types defined in external libraries.
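To make "less generic" concrete: a Python integer grows without bound, while a fixed-width C integer wraps around. The sketch below uses the standard ctypes module only as a stand-in for a C variable (Cython itself is not involved), and c_int64 is an assumption chosen so that the width is the same on every platform; a C long may be narrower.

```python
import ctypes

# Python semantics: integers have arbitrary precision.
big = 2**63
python_result = big + 1          # exact, no overflow

# C semantics (emulated): a signed 64-bit integer cannot hold 2**63.
# ctypes does no overflow checking, so the value silently wraps around.
c_result = ctypes.c_int64(big).value

print(python_result)
print(c_result)
```

The first number printed is positive and exact; the second is negative, because the value wrapped. This is the trade a cdef long declaration makes: speed in exchange for a fixed range.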
Tutorial: Building Your First Cython Code by Hand
The build happens in two stages:
- A .pyx file is compiled by Cython to a .c or .cpp file.
- The .c or .cpp file is compiled by a C compiler (such as GCC) to a .so file.
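The two stages can be sketched as two command lines. This is only an illustration under several assumptions: a standalone cython executable and gcc are on the PATH (this tutorial uses sage -cython instead), the flags are the Linux ones, and build_commands is a hypothetical helper that builds the commands without running them.

```python
import sysconfig
from pathlib import Path


def build_commands(pyx_file):
    """Return the command lines for the two build stages (not executed)."""
    pyx = Path(pyx_file)
    c_file = pyx.with_suffix(".c")
    so_file = pyx.with_suffix(".so")
    include_dir = sysconfig.get_paths()["include"]  # directory with Python.h

    # Stage 1: Cython translates the .pyx file to C.
    stage1 = ["cython", str(pyx)]
    # Stage 2: the C compiler builds a shared library (Linux-style flags).
    stage2 = ["gcc", "-shared", "-fPIC", "-I" + include_dir,
              str(c_file), "-o", str(so_file)]
    return stage1, stage2


for command in build_commands("sum.pyx"):
    print(" ".join(command))
```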
First, create a file sum.pyx that contains the following code (see this directory for original code files):
    def sum_cython(long n):
        cdef long i, s = 0
        for i in range(n):
            s += i
        return s

Then use Cython to compile it:
Since we're using Sage, you can do
    bash$ sage -cython sum.pyx
    bash$ ls
    sum.c  sum.pyx

Notice the new file sum.c. Compile it with gcc as follows on OS X:

    bash$ sage -sh
    bash$ gcc -I$SAGE_ROOT/local/include/python2.6 -bundle -undefined dynamic_lookup sum.c -o sum.so
On Linux, do:
    bash$ sage -sh
    bash$ gcc -I$SAGE_ROOT/local/include/python2.6 -shared -fPIC sum.c -o sum.so

Finally, try it out.
You must run Sage from the same directory that contains the file sum.so. When you type import sum below, the Python interpreter sees the file sum.so and opens it. The file contains functions and data that define a compiled "Python C-extension module", so Python can load it just as it would load a module such as sum.py.
    bash$ sage
    -------------------------------------------------
    | Sage Version 4.6, Release Date: 2010-10-30
    | Type notebook() for the GUI, and license() for
    -------------------------------------------------
    sage: import sum
    sage: sum.sum_cython(101)
    5050
    sage: timeit('sum.sum_cython(101)')
    625 loops, best of 3: 627 ns per loop
    sage: timeit('sum.sum_cython(101)', number=10^6)   # better quality timing
    1000000 loops, best of 3: 539 ns per loop
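The import works because the interpreter recognizes certain filename suffixes as compiled extension modules. A small sketch of how to inspect them, assuming a Python 3 interpreter (the Python 2.6 shipped with Sage 4.6 does not have this attribute); is_extension_filename is a hypothetical helper, and the result for sum.so depends on the platform:

```python
import importlib.machinery

# Filename suffixes this interpreter accepts for compiled extension modules.
suffixes = importlib.machinery.EXTENSION_SUFFIXES
print(suffixes)


def is_extension_filename(filename):
    """True if filename ends with one of the extension-module suffixes."""
    return any(filename.endswith(suffix) for suffix in suffixes)


print(is_extension_filename("sum.so"))
print(is_extension_filename("sum.py"))
```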
Finally, take a look at the (more than 1000 line) autogenerated C file sum.c:
    bash$ wc -l sum.c
    1178 sum.c
    bash$ less sum.c
    ...
Notice code like this, which illustrates that Cython generates code that supports both Python 2 and Python 3:
    #if PY_MAJOR_VERSION < 3
      #define __Pyx_BUILTIN_MODULE_NAME "__builtin__"
    #else
      #define __Pyx_BUILTIN_MODULE_NAME "builtins"
    #endif
    #if PY_MAJOR_VERSION >= 3
      #define Py_TPFLAGS_CHECKTYPES 0
      #define Py_TPFLAGS_HAVE_INDEX 0
    #endif

The official Python docs say: "If you are writing a new extension module, you might consider Cython. It translates a Python-like language to C. The extension modules it creates are compatible with Python 3.x and 2.x."
If you scroll down further you'll get past the boilerplate and see the actual code:
    ...
    /* "/Users/wstein/edu/2010-2011/581d/notes/2010-11-08/sum.pyx":2
     * def sum_cython(long n):
     *     cdef long i, s = 0             # <<<<<<<<<<<<<<
     *     for i in range(n):
     *         s += i
     */
    __pyx_v_s = 0;

    /* "/Users/wstein/edu/2010-2011/581d/notes/2010-11-08/sum.pyx":3
     * def sum_cython(long n):
     *     cdef long i, s = 0
     *     for i in range(n):             # <<<<<<<<<<<<<<
     *         s += i
     *     return s
     */
    __pyx_t_1 = __pyx_v_n;
    for (__pyx_t_2 = 0; __pyx_t_2 < __pyx_t_1; __pyx_t_2+=1) {
      __pyx_v_i = __pyx_t_2;

      /* "/Users/wstein/edu/2010-2011/581d/notes/2010-11-08/sum.pyx":4
       *     cdef long i, s = 0
       *     for i in range(n):
       *         s += i             # <<<<<<<<<<<<<<
       *     return s
       */
      __pyx_v_s += __pyx_v_i;
    }
    ...

There is a big comment that shows the original Cython code with context and a little arrow pointing at the current line. (These comment blocks with context were, I think, the first thing I personally added to Pyrex; before, it just gave that first line with the .pyx filename and line number, but nothing else.) Below that big comment is the actual C code that Cython generates. For example, the Cython code s += i is turned into the C code __pyx_v_s += __pyx_v_i;.
The Same Extension From Scratch, for Comparison
If you read Extending and Embedding Python, you'll see how you could write a C extension module from scratch that does the same thing as sum.so above. Let's see what this is like, for comparison. Given how simple sum.pyx is, this isn't so hard. For more complicated code (e.g., new extension classes, more complicated type conversions, and memory management), writing C directly quickly becomes unwieldy.

First, create a file sum2.c as follows:
    #include <Python.h>

    static PyObject *
    sum2_sum_c(PyObject *self, PyObject *n_arg)
    {
        long i, s=0, n = PyInt_AsLong(n_arg);

        for (i=0; i<n; i++) {
            s += i;
        }
        PyObject* t = PyInt_FromLong(s);
        return t;
    }

    static PyMethodDef Sum2Methods[] = {
        {"sum_c", sum2_sum_c, METH_O, "Sum the numbers up to n."},
        {NULL, NULL, 0, NULL} /* Sentinel */
    };

    PyMODINIT_FUNC
    initsum2(void)
    {
        PyObject *m;
        m = Py_InitModule("sum2", Sum2Methods);
    }

Now compile and run it as before:
    bash$ sage -sh
    bash$ gcc -I$SAGE_ROOT/local/include/python2.6 -bundle -undefined dynamic_lookup sum2.c -o sum2.so
    bash$ sage
    ...
    sage: import sum2
    sage: sum2.sum_c(101)
    5050
    sage: import sum
    sage: sum.sum_cython(101)
    5050
    sage: timeit('sum.sum_cython(1000000r)')
    125 loops, best of 3: 2.54 ms per loop
    sage: timeit('sum2.sum_c(1000000r)')
    125 loops, best of 3: 2.03 ms per loop

Note that the handwritten C version is a little faster than the corresponding Cython code. This is because the Cython code is more careful, checking various error conditions, etc.
Note that the C code is 5 times as long as the Cython code.
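For a sense of what both compiled versions are being compared against, here is a pure-Python baseline of the same loop, timed with the standard timeit module. The name sum_python is made up for this sketch and does not appear in the original files; timings vary by machine, so none are quoted here.

```python
import timeit


def sum_python(n):
    """Same loop as sum_cython, but with ordinary Python integers."""
    s = 0
    for i in range(n):
        s += i
    return s


print(sum_python(101))  # 5050, matching sum_cython(101)

# Time 5 calls with n = 1,000,000 and report the average per call.
seconds = timeit.timeit(lambda: sum_python(1000000), number=5)
print("%.2f ms per call" % (seconds / 5 * 1000))
```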
Building Extensions using Setuptools Instead
In nontrivial projects, the Cython step of transforming your code from .pyx to .c is typically done by explicitly calling cython somehow (this will change in the newest version of Cython), but the step of running the C compiler is usually done using either distutils or setuptools. To use these tools, you create a file setup.py that defines the extensions in your project; Python itself then runs a C compiler for you, with the proper options, include paths, etc.

Let's create a new setuptools project that includes the sum and sum2 extensions that we defined above. First, create the following file and call it setup.py. This should be in the same directory as sum.c and sum2.c.
    from setuptools import setup, Extension

    ext_modules = [
        Extension("sum", ["sum.c"]),
        Extension("sum2", ["sum2.c"])
    ]

    setup(
        name = 'sum',
        version = '0.1',
        ext_modules = ext_modules)

Then type

    bash$ rm *.so   # make sure something happens
    bash$ sage setup.py develop
    ...
    bash$ ls *.so
    sum.so  sum2.so
Notice that running setup.py develop resulted in Python generating the right gcc command lines for your platform. You don't have to do anything differently on Linux, OS X, etc.
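Python can produce the right command lines because it records how it was itself built. The sketch below, assuming a Python 3 interpreter, prints a few of those recorded settings using the standard sysconfig module. Which settings the build tools actually consult is an implementation detail, so treat the list of names as illustrative.

```python
import sysconfig

# Directory containing Python.h (the -I directory in the gcc commands).
include_dir = sysconfig.get_paths()["include"]
print("include dir:", include_dir)

# Build settings recorded when this Python was compiled. Some of these
# may be None on platforms that do not use a Unix-style build.
for name in ("CC", "CCSHARED", "LDSHARED", "EXT_SUFFIX"):
    print(name, "=", sysconfig.get_config_var(name))
```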
If you change sum2.c and want to rebuild it, just type sage setup.py develop again to rebuild sum2.so. If you change sum.pyx, you have to manually run Cython:
sage -cython sum.pyx
then again do sage setup.py develop to rebuild sum.so. Try this now. In sum.pyx, change
for i in range(n):
to
for i in range(1,n+1):
then rebuild:
    bash$ sage -cython sum.pyx
    ...
    bash$ sage setup.py develop
    ...
    bash$ sage
    ...
    sage: import sum
    sage: sum.sum_cython(100)
    5050
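The reason sum_cython(100) now returns the value that sum_cython(101) returned before can be checked in plain Python. The two helper names below are made up for this check:

```python
def sum_before(n):
    # Original loop: i runs over 0, 1, ..., n-1.
    return sum(range(n))


def sum_after(n):
    # Modified loop: i runs over 1, 2, ..., n.
    return sum(range(1, n + 1))


n = 100
print(sum_before(n + 1))   # 5050
print(sum_after(n))        # 5050
print(n * (n + 1) // 2)    # 5050, the closed form for 1 + 2 + ... + n
```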
There are ways to make setup.py automatically notice when sum.pyx changes, and run Cython. A nice implementation of this will be in the next Cython release. See the setup.py and build_system.py files of Purple Sage for an example of how to write a little build system right now (before the new version of Cython).
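One simple way to notice a changed .pyx file is to compare modification times. The sketch below is a guess at the simplest possible approach, not the Purple Sage or Cython implementation; needs_cython and stale_sources are hypothetical names.

```python
import os


def needs_cython(pyx_path, c_path):
    """Return True if c_path is missing or older than pyx_path."""
    if not os.path.exists(c_path):
        return True
    return os.path.getmtime(pyx_path) > os.path.getmtime(c_path)


def stale_sources(pyx_paths):
    """List the .pyx files whose generated .c file is out of date."""
    stale = []
    for pyx_path in pyx_paths:
        c_path = os.path.splitext(pyx_path)[0] + ".c"
        if needs_cython(pyx_path, c_path):
            stale.append(pyx_path)
    return stale
```

A setup.py could call stale_sources(["sum.pyx"]) and run sage -cython on each result before calling setup().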
An Automated Way to Experiment
Given any single Cython file such as sum.pyx, in Sage you can do

    sage: load sum.pyx
    Compiling sum.pyx...
    sage: sum_cython(100)
    5050

Behind the scenes, Sage created a setup.py file, ran Cython, made a new module, compiled it, and imported everything it defines into the global namespace. If you look in the spyx subdirectory of the directory listed below, before you exit Sage (!), then you'll see all this.

    sage: SAGE_TMP
    '/Users/wstein/.sage//temp/deep.local/14837/'
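The last of those steps, importing everything a module defines into another namespace, can be sketched in plain Python. This is only an illustration: load_into is a hypothetical helper, the standard math module stands in for the freshly compiled module, and Sage's actual mechanism may differ.

```python
import importlib


def load_into(namespace, module_name):
    """Import module_name and copy its public names into namespace."""
    module = importlib.import_module(module_name)
    names = getattr(module, "__all__", None)
    if names is None:
        # No explicit export list: take every name without a leading underscore.
        names = [name for name in vars(module) if not name.startswith("_")]
    for name in names:
        namespace[name] = getattr(module, name)
    return module


scope = {}
load_into(scope, "math")     # stand-in for the freshly compiled module
print(scope["sqrt"](16.0))   # 4.0
```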
You can also do
sage: attach sum.pyx
Then every time sum.pyx changes, Sage will notice this and reload it. This can be useful for development of small chunks of Cython code.
You can also use the Sage notebook, and put %cython as the first line of a notebook cell. The rest of the cell will be compiled exactly as if it were written to a .pyx file and loaded as above. In fact, that is almost exactly what happens behind the scenes.
Next Time
Now that we understand at a reasonably deep level what Cython really is and does, it is time to learn about the various constructs of the language:
- How to create extension classes using Cython.
- How to call external C/C++ library code.
We will rewrite our sum.pyx file first to use a class. Then we'll rewrite it again to make use of the MPIR (or GMP) C library for arithmetic, and again to make use of the C++ NTL library.