I have a data set where each sample has a structure similar to this:
X = [[[], [], [], []], [[], []], [[], [], []], [[], []]]
for example:
You can also bypass the need for itertools.product
by directly doing the dot product on inner matrices:
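import numpy as np

def calc_matrix(l1, l2):
    # Sum of all pairwise dot products between the vectors in l1 and l2.
    return np.array(l1).dot(np.array(l2).T).sum()

def kernel(x1, x2):
    # Pair inner lists by position and add up their contributions.
    return sum(
        calc_matrix(l1, l2)
        for l1, l2 in zip(x1, x2)
    )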
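
As a quick sanity check, here is a minimal sketch comparing the matrix version against an explicit itertools.product loop. The helper calc_product and all of the sample data are made up for illustration, and I am assuming each inner list holds equal-length numeric vectors. Note that zip stops at the shorter of x1 and x2, so extra inner lists in the longer sample are ignored.

```python
import itertools
import numpy as np

def calc_matrix(l1, l2):
    # Sum of every pairwise dot product between rows of l1 and rows of l2.
    return np.array(l1).dot(np.array(l2).T).sum()

def calc_product(l1, l2):
    # Hypothetical baseline: explicit loop over all (vector, vector) pairs.
    return sum(np.dot(a, b) for a, b in itertools.product(l1, l2))

def kernel(x1, x2):
    return sum(calc_matrix(l1, l2) for l1, l2 in zip(x1, x2))

# Made-up inner lists: 2 and 3 vectors of length 2.
l1 = [[1.0, 2.0], [3.0, 4.0]]
l2 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

matrix_result = calc_matrix(l1, l2)
product_result = calc_product(l1, l2)
print(np.isclose(matrix_result, product_result))

# Two made-up samples, each a list of groups of vectors.
x1 = [l1, [[1.0, 1.0]]]
x2 = [l2, [[2.0, 3.0]]]
kernel_result = kernel(x1, x2)
print(kernel_result)
```

Both functions compute the same number; the matrix version just lets NumPy do the pairing in a single call instead of looping in Python.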
Edit:
On short lists (fewer than a few thousand elements) this will be faster than Claudiu's (awesome) answer. His scales better beyond that size.
Using Claudiu's benchmarks:
# len(l1) == 500
In [9]: %timeit calc_matrix(l1, l2)
10 loops, best of 3: 8.11 ms per loop
In [10]: %timeit calc_fast(l1, l2)
10 loops, best of 3: 14.2 ms per loop
# len(l2) == 5000
In [19]: %timeit calc_matrix(l1, l2)
10 loops, best of 3: 61.2 ms per loop
In [20]: %timeit calc_fast(l1, l2)
10 loops, best of 3: 56.7 ms per loop