I have two 1 dimensional numpy vectors `va` and `vb` which are being used to populate a matrix by passing all pair combinations to a function.
`cdist` is fast because it is written in highly optimized C code (as you already pointed out), and because it only supports a small predefined set of metrics.
Since you want to apply the operation generically, to any given `foo` function, you have no choice but to call that function `na * nb` times. That part is not likely to be further optimizable.
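For reference, here is a minimal sketch of what that generic approach looks like. It assumes `foo` takes two scalars and returns a number; the `foo` body and the name `pairwise_generic` are made up for illustration and are not taken from the question.

```python
import numpy as np

def foo(a, b):
    # Hypothetical stand-in for the arbitrary pairwise function.
    return abs(a - b)

def pairwise_generic(va, vb, func):
    """Fill an (na, nb) matrix by calling func once per pair."""
    na, nb = len(va), len(vb)
    out = np.empty((na, nb), dtype=float)
    for i in range(na):
        for j in range(nb):
            # One Python-level call per pair: na * nb calls in total.
            out[i, j] = func(va[i], vb[j])
    return out

va = np.array([1.0, 2.0, 3.0])
vb = np.array([0.0, 5.0])
result = pairwise_generic(va, vb, foo)
```

The calls to `func` dominate the cost here, which is why only the surrounding loop and indexing overhead can be trimmed.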
What's left to optimize are the loops and the indexing. Some suggestions to try out:
- `xrange` instead of `range` (if you are on Python 2.x; in Python 3, `range` is already lazy, like a generator)
- `enumerate`, instead of `range` plus explicit indexing
- `cython` or `numba`, to speed up the looping process

If you can make further assumptions about `foo`, it might be possible to speed it up further.
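As a rough sketch of two of these ideas: the first function below uses `enumerate` instead of explicit indexing, and the second shows what a "further assumption" could buy you. It assumes `foo` is built only from operations that work elementwise on NumPy arrays, in which case broadcasting can replace the loops entirely. The `foo` body and the helper names are hypothetical.

```python
import numpy as np

def foo(a, b):
    # Hypothetical example; works on scalars and on arrays alike.
    return np.abs(a - b)

def pairwise_enumerate(va, vb, func):
    """Same na * nb calls, but without indexing va and vb by hand."""
    out = np.empty((len(va), len(vb)), dtype=float)
    for i, a in enumerate(va):
        for j, b in enumerate(vb):
            out[i, j] = func(a, b)
    return out

def pairwise_broadcast(va, vb, func):
    """Single call; only valid if func operates elementwise on arrays."""
    # Shapes (na, 1) and (1, nb) broadcast to an (na, nb) result.
    return func(va[:, None], vb[None, :])

va = np.array([1.0, 2.0, 3.0])
vb = np.array([0.0, 5.0])
looped = pairwise_enumerate(va, vb, foo)
vectorized = pairwise_broadcast(va, vb, foo)
```

If `foo` contains Python-level branching or other logic that does not accept arrays, the broadcast version will not work as written, and you are back to the loop (possibly compiled with `cython` or `numba`).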