I found the bottleneck in my python code, played around with psycho etc. Then decided to write a c/c++ extension for performance.
With the help of swig you almost do
An observation: Based on the benchmarking conducted by the pybindgen developers, there is no significant difference between boost.python and swig. I haven't done my own benchmarking to verify how much of this depends on the proper use of the boost.python functionality.
Note also that there may be a reason that pybindgen seems to be in general quite a bit faster than swig and boost.python: it may not produce as versatile a binding as the other two. For instance, exception propagation, call argument type checking, etc. I haven't had a chance to use pybindgen yet but I intend to.
Boost is in general quite big package to install, and last I saw you can't just install boost python you pretty much need the whole Boost library. As others have mentioned compilation will be slow due to heavy use of template programming, which also means typically rather cryptic error messages at compile time.
Summary: given how easy SWIG is to install and use, that it generates decent binding that is robust and versatile, and that one interface file allows your C++ DLL to be available from several other languages like LUA, C#, and Java, I would favor it over boost.python. But unless you really need multi-language support I would take a close look at PyBindGen because of its purported speed, and pay close attention to robustness and versatility of binding it generates.
Before giving up on your python code, have a look at ShedSkin. They claim better performance than Psyco on some code (and also state that it is still experimental).
Else, there are several choices for binding C/C++ code to python.
Boost is lengthy to compile but is really the most flexible and easy to use solution.
I have never used SWIG but compared to boost, it's not as flexible as it's generic binding framework, not a framework dedicated to python.
Next choice is Pyrex. It allows to write pseudo python code that gets compiled as a C extension.
There be dragons here. Don't swig, don't boost. For any complicated project the code you have to fill in yourself to make them work becomes unmanageable quickly. If it's a plain C API to your library (no classes), you can just use ctypes. It will be easy and painless, and you won't have to spend hours trawling through the documentation for these labyrinthine wrapper projects trying to find the one tiny note about the feature you need.
For sure you will always have a performance gain doing this by hand, but the gain will be very small compared to the effort required to do this. I don't have any figure to give you but I don't recommend this, because you will need to maintain the interface by hand, and this is not an option if your module is large!
You did the right thing to chose to use a scripting language because you wanted rapid development. This way you've avoided the early optimization syndrome, and now you want to optimize bottleneck parts, great! But if you do the C/python interface by hand you will fall in the early optimization syndrome for sure.
If you want something with less interface code, you can think about creating a dll from your C code, and use that library directly from python with cstruct.
Consider also Cython if you want to use only python code in your program.