I want to run a CPU-intensive program in Python across multiple cores and am trying to figure out how to write C extensions to do this. Are there any code samples or tutorials?
Have you considered using one of the Python MPI libraries, such as mpi4py? Although MPI is normally used to distribute work across a cluster, it works quite well on a single multicore machine. The downside is that you'll have to refactor your code to use MPI's communication calls, though that may turn out to be easy.
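To give a rough idea of what that refactoring looks like, here is a minimal sketch with mpi4py. The workload (a sum of squares), the helper names `chunk_bounds` and `partial_sum_of_squares`, and the script name in the comment are all made up for illustration; your real computation would replace them. Each process works on its own slice of the range and the partial results are combined on rank 0 with `comm.reduce`. If mpi4py isn't installed, the sketch just runs as a single process.

```python
# Sketch: split a CPU-bound loop across MPI processes with mpi4py.
# Launch with something like: mpiexec -n 4 python sum_squares.py
# (the file name is hypothetical)

def chunk_bounds(n, rank, size):
    """Return the [start, stop) part of range(n) owned by this rank."""
    base, extra = divmod(n, size)
    # The first `extra` ranks each take one additional item.
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop

def partial_sum_of_squares(start, stop):
    """Placeholder for the real CPU-intensive work."""
    return sum(i * i for i in range(start, stop))

def main(n=1_000_000):
    try:
        from mpi4py import MPI
        comm = MPI.COMM_WORLD
        rank, size = comm.Get_rank(), comm.Get_size()
    except ImportError:
        # Assumption for this sketch: fall back to one process without MPI.
        comm, rank, size = None, 0, 1

    start, stop = chunk_bounds(n, rank, size)
    local = partial_sum_of_squares(start, stop)

    if comm is not None:
        # Combine the per-process results; only rank 0 receives the total.
        total = comm.reduce(local, op=MPI.SUM, root=0)
    else:
        total = local

    if rank == 0:
        print(f"sum of squares below {n}: {total}")

if __name__ == "__main__":
    main()
```

Because every rank is a separate Python process, the GIL doesn't get in the way, so this kind of split can use all cores even without writing a C extension.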