I am trying to calculate the pairwise distances between multiple time-series contained in a numpy array. Please see the code below
print(type(sales))
print(sales)
To be honest, fastdtw is not fast at all:
import numpy as np
from cdtw import pydtw
from dtaidistance import dtw
from fastdtw import fastdtw
from scipy.spatial.distance import euclidean
s1 = np.array([1, 2, 3, 4], dtype=np.double)
s2 = np.array([4, 3, 2, 1], dtype=np.double)
%timeit dtw.distance_fast(s1, s2)
4.1 µs ± 28.6 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit d2 = pydtw.dtw(s1,s2,pydtw.Settings(step = 'p0sym', window = 'palival', param = 2.0, norm = False, compute_path = True)).get_dist()
45.6 µs ± 3.39 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit d3,_=fastdtw(s1, s2, dist=euclidean)
901 µs ± 9.95 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
fastdtw is about 219 times slower than dtaidistance and about 20 times slower than cdtw.
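The %timeit lines above are IPython magics, so they do not work in a plain script. Below is a minimal sketch of a similar measurement using the standard timeit module. The names dtw_reference and time_call are hypothetical helpers, not part of any of the libraries above, and dtw_reference is only a plain-Python stand-in (absolute difference as local cost, no windowing) so that the snippet runs without extra packages. Note that it reports the best of several runs, whereas %timeit reports mean and standard deviation.

```python
import timeit


def dtw_reference(a, b):
    """Plain dynamic-programming DTW (hypothetical stand-in, not a library function).

    Uses the absolute difference as local cost and no window constraint.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # acc[i][j] holds the cost of the best warping path aligning a[:i] with b[:j]
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            acc[i][j] = cost + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m]


def time_call(func, *args, repeat=5, number=1000):
    """Return the best per-call time in seconds over `repeat` runs of `number` calls."""
    timer = timeit.Timer(lambda: func(*args))
    return min(timer.repeat(repeat=repeat, number=number)) / number


if __name__ == "__main__":
    s1 = [1.0, 2.0, 3.0, 4.0]
    s2 = [4.0, 3.0, 2.0, 1.0]
    per_call = time_call(dtw_reference, s1, s2)
    print(f"dtw_reference: {per_call * 1e6:.1f} µs per call")
```

To compare the real libraries the same way, pass their functions to time_call instead, for example time_call(dtw.distance_fast, s1, s2) with the numpy arrays from above.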
Consider switching. Here is the dtaidistance repository:
https://github.com/wannesm/dtaidistance
To install, just run:
pip install dtaidistance
TL;DR
Your fastdtw installation failed to install the fast C++ version and silently falls back to a pure-Python version, which is slow.
You need to fix the installation of the fastdtw package.
The whole calculation is done inside fastdtw, so you cannot really speed it up from the outside. And parallelization in Python is not such an easy thing (yet?).
The fastdtw documentation says it needs about O(n) operations for a comparison, so for your whole test set it will need on the order of 10^9 operations, which should finish within a few seconds if programmed in, for example, C. The performance you see is nowhere near that.
If we look at the code of fastdtw, we see that there are two versions: the Cython/C++ version, which is fast and imported via Cython, and a slow pure-Python fallback. If the fast version isn't present, the slow Python version is silently used.
So run your calculation, interrupt it with Ctrl+C, and you will see that you are somewhere in Python code. You can also go to your lib folder and see that there is only the pure-Python version inside.
So your installation of the fast fastdtw version failed. Actually, I think the wheel package is botched; at least for my version, only the pure-Python code is present.
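Instead of interrupting a running calculation, the check can also be sketched in code. The snippet below assumes the compiled module is named fastdtw._fastdtw, which is the extension name used in the project's setup.py; the function name compiled_fastdtw_available is a hypothetical helper, not part of fastdtw.

```python
import importlib.util


def compiled_fastdtw_available():
    """Check whether the compiled fastdtw extension can be found.

    Returns True if the fastdtw._fastdtw module is found, False if fastdtw
    is importable but only the pure-Python code is installed, and None if
    fastdtw itself cannot be imported.
    """
    try:
        # find_spec imports the parent package first, so a missing or broken
        # fastdtw installation raises ImportError here.
        spec = importlib.util.find_spec("fastdtw._fastdtw")
    except ImportError:
        return None
    return spec is not None


if __name__ == "__main__":
    print("compiled fastdtw available:", compiled_fastdtw_available())
```

If this prints False, the installation is using the slow pure-Python fallback described above.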
What to do?
Clone the repository: git clone https://github.com/slaypni/fastdtw
Go into the fastdtw folder and run python setup.py build
You might get an error such as:
fatal error: numpy/npy_math.h: No such file or directory
For me, the fix was to change the following lines in setup.py:
import numpy  # THIS ADDED
extensions = [Extension(
    'fastdtw._fastdtw',
    [os.path.join('fastdtw', '_fastdtw' + ext)],
    language="c++",
    include_dirs=[numpy.get_include()],  # AND ADDED numpy.get_include()
    libraries=["stdc++"]
)]
Then run python setup.py install
Now your program should be about 100 times faster.
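As for why the setup.py change helps: the compiler could not find numpy/npy_math.h, and numpy.get_include() returns the directory that contains numpy's C headers. A small sketch to confirm that on your machine is shown below; numpy_header_found is a hypothetical helper name used only for illustration.

```python
import os


def numpy_header_found(header="npy_math.h"):
    """Check whether a numpy C header exists under numpy.get_include().

    Returns True or False depending on whether the header file is present,
    or None if numpy itself is not installed.
    """
    try:
        import numpy
    except ImportError:
        return None
    # Headers live in <include dir>/numpy/, matching '#include <numpy/npy_math.h>'
    path = os.path.join(numpy.get_include(), "numpy", header)
    return os.path.isfile(path)


if __name__ == "__main__":
    print("numpy/npy_math.h found:", numpy_header_found())
```

If this prints True, passing numpy.get_include() in include_dirs should let the build find the header.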