What is the difference between scipy.special.binom and scipy.misc.comb?
In IPython I can see that they return different types and also differ in accuracy.
Whenever you come across code whose behavior you don't understand, and it isn't straightforward to simply import the parent module in an interpreter and inspect the execution, the docstrings, etc., you have a few options.
Let me describe two options that did not turn out to be very helpful this time, but which are a great way to start on a question like this in the future (both so that you can include the output in your post to show what you tried before asking here, and to look fancy for knowing what these tools are):
You can use the dis module to disassemble Python code into the op-codes that get executed, like so:
python -c "import dis; from scipy import misc; dis.dis('misc.comb(4,3)')"
python -c "import dis; from scipy import special; dis.dis('special.binom(4,3)')"
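If the one-liners feel cramped, the same idea works in a script. A minimal sketch (using a scipy-free stand-in expression, since the op-codes for a name lookup and call are the same whatever is being called):

```python
import dis

# Disassemble a source string: dis compiles it and lists the op-codes
# the interpreter will execute. "comb(4, 3)" stands in for the scipy calls.
bytecode = dis.Bytecode("comb(4, 3)")
ops = [instr.opname for instr in bytecode]
print(ops)
```

You'll see the name lookup (LOAD_NAME) followed by a call op-code; the exact op-code names vary between Python versions.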
If you're on a *nix OS, you can also (almost always) use strace to launch a program and inspect the system calls it makes. In this case you could look at the output of
strace -qc python -c "from scipy import special; special.binom(4,3)"
versus
strace -qc python -c "from scipy import misc; misc.comb(4,3)"
(the -qc option makes the output less chatty and aggregates the time spent in each system call, which is easier to read as a first pass. If you omit -qc, you'll get a big screen dump of every system call... something you'll want to open in Emacs and search through, or pipe to a system tool for searching).
In this case, strace doesn't help too much: there's a lot of noise from system calls related just to importing the modules.
What worked for me this time is cProfile:
python -c "import cProfile; cProfile.run('from scipy import special; special.binom(4,3)')"
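The same thing can be driven programmatically, which makes the report easier to search. A sketch using math.comb as a scipy-free stand-in (the point is just that the profile names the module each call lands in):

```python
import cProfile
import io
import pstats
from math import comb

profiler = cProfile.Profile()
profiler.enable()
comb(4, 3)                      # the call we want to profile
profiler.disable()

# Render the stats to a string instead of stdout so we can search it
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).print_stats()
report = stream.getvalue()
print(report)
```

For the scipy case, the line naming orthogonal_eval would show up in this report.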
In this case, I was able to see that the execution boils down to a call into scipy.special.orthogonal_eval, and Googling this module revealed that it is a shared library, built as the file orthogonal_eval.so, and I found this nice page with the source code.
You can see the full function definition of binom there. It uses a standard calculation of the factorials in the formula for small values, and an approximation via other special functions for the rest (I see calls to "Gamma", "beta" and "lbeta" defined in a file called cephes.h).
Looks pretty standard -- if I needed to, I could go dig around that .h file, Google some more, and probably find a long-standing C library of special functions at the bottom of all of that.
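The gamma-function route for larger arguments can be sketched in pure Python. This is just an illustration of the formula, not the actual cephes algorithm:

```python
import math

def binom_via_gamma(n, k):
    # binom(n, k) = Gamma(n+1) / (Gamma(k+1) * Gamma(n-k+1)).
    # Working with lgamma (log of Gamma) avoids overflowing the
    # intermediate factorials for moderately large n.
    return math.exp(math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1))

print(binom_via_gamma(4, 3))    # close to 4.0, up to floating-point round-off
```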
Meanwhile, for misc.comb, first consider that its regular documentation is available (whereas, for me, it is not available for binom).
The docstring says there is a possible third argument, exact, which can be set to something other than 0 if you don't want a float returned. In that case a long is returned, which you could cast to int if you'd like.
This explains the precision question. You can read in orthogonal_eval how, if k is relatively small and the arguments are integers (so the answer is an integer), a method with less round-off error is used. comb, by contrast, will only do this if you say exact=1 (or anything else such that exact != 0).
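You can see the difference the exact path makes with a pure-Python sketch of the two strategies (my own illustration, not scipy's code): a float evaluation of the factorial formula versus an all-integer multiplicative scheme.

```python
import math

def comb_float(n, k):
    # Factorial formula with a float division: the division of two
    # huge integers rounds the result to the nearest double.
    return math.factorial(n) / (math.factorial(k) * math.factorial(n - k))

def comb_exact(n, k):
    # Multiplicative scheme in integer arithmetic (the exact=1 idea):
    # after step i, result == C(n - k + i, i), so each division is exact.
    result = 1
    for i in range(1, k + 1):
        result = result * (n - k + i) // i
    return result

n, k = 100, 50
print(comb_float(n, k))   # a float approximation, ~1.0089e+29
print(comb_exact(n, k))   # the exact integer
```

For small inputs the two agree, but at n=100 the exact answer needs about 97 bits, far more than a double's 53-bit mantissa, so the float version has already lost the low digits.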
As for the code that comb is executing: we can view the source that's linked from the SciPy documentation page. This function also makes use of some helper calls from scipy.special, but the mix between what is called in the Python layer vs. the C layer is different, and the implementation of the approximation part is slightly different as well.
I'm not sure what level of specificity you need, but for most purposes this level of detail should suffice: binom is implemented directly in the C extension layer in orthogonal_eval and makes some tweaks to reduce precision issues for small inputs; misc.comb implements the small-input case in Python directly and makes use of special calls without passing through binom itself -- so there was some amount of reinventing the wheel by whoever programmed these. Since they mix and match function calls between the Python layer and the C layer, it's not surprising that there would be some precision differences.
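As a last illustration of that point, here is a scipy-free sketch of how two mathematically equivalent floating-point routes to the same binomial coefficient can drift apart in the last digits (hypothetical stand-ins, not the actual scipy code paths):

```python
import math

def via_lgamma(n, k):
    # One route: exponentiate a difference of log-gammas
    return math.exp(math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1))

def via_product(n, k):
    # Another route: accumulate the ratio term by term in floats
    r = 1.0
    for i in range(1, k + 1):
        r *= (n - k + i) / i
    return r

n, k = 1000, 500
a, b = via_lgamma(n, k), via_product(n, k)
print(a, b)
print(abs(a - b) / b)   # tiny, and typically nonzero: the routes round differently
```

Both agree with the true value to many digits, but each accumulates round-off in its own way, which is exactly the kind of discrepancy you see between binom and comb.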