Theano row/column wise subtraction

Question

X is an n-by-d matrix and W is an m-by-d matrix. For every row in X, I want to compute the squared Euclidean distance to every row in W, so the result will be an n-by-m matrix.
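To pin down the target output, here is a minimal pure-Python sketch of the computation described above. The function name `pairwise_sq_dists` and the sample data are made up for illustration; this is only a reference to check other versions against, not a suggested implementation.

```python
def pairwise_sq_dists(X, W):
    """Hypothetical helper: n-by-m nested list of squared Euclidean distances.

    X holds n rows and W holds m rows, each of length d.
    """
    return [
        [sum((xk - wk) ** 2 for xk, wk in zip(x_row, w_row)) for w_row in W]
        for x_row in X
    ]


# Made-up example data: n = 2, m = 3, d = 3.
X = [[1, 2, 3], [2, 2, 2]]
W = [[2, 2, 2], [1, 2, 3], [0, 0, 0]]

D = pairwise_sq_dists(X, W)
print(D)  # [[2, 0, 14], [0, 2, 12]]
```

Entry D[i][j] is the squared distance between row i of X and row j of W.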

If there's only one row in W, this is easy:

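import theano
from theano import tensor

x = tensor.TensorType("float64", [False, False])()  # matrix, shape (n, d)
w = tensor.TensorType("float64", [False])()         # vector, shape (d,)
z = tensor.sum((x-w)**2, axis=1)                    # w broadcasts across the rows of x
fn = theano.function([x, w], z)
print(fn([[1,2,3], [2,2,2]], [2,2,2]))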
# [ 2.  0.]

What do I do when W is a matrix (in Theano)?

Answer 1:

Short answer: use scipy.spatial.distance.cdist (pass metric='sqeuclidean' to get squared distances rather than plain Euclidean ones).

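Long answer, if you don't have scipy: broadcast the subtraction and then take the norm along the last axis. With points stored as rows, as in the question (X is n by d, W is m by d), that is

np.linalg.norm(X[:,None,:]-W[None,:,:], axis=2)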

Really long answer, if you have an ancient version of numpy without a vectorizable linalg.norm (i.e. you're using Abaqus), is

np.sum((X[:,None,:]-W[None,:,:])**2, axis=2) ** 0.5

Drop the final ** 0.5 to get the squared distances the question asks for.
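The broadcasting step is easier to follow with the shapes written out. Below is a small NumPy sketch that assumes the question's layout (points stored as rows, so X is n by d and W is m by d). The sample data is made up for illustration.

```python
import numpy as np

# Made-up example data: n = 2 rows in X, m = 3 rows in W, d = 3.
X = np.array([[1.0, 2.0, 3.0], [2.0, 2.0, 2.0]])
W = np.array([[2.0, 2.0, 2.0], [1.0, 2.0, 3.0], [0.0, 0.0, 0.0]])

# X[:, None, :] has shape (n, 1, d); W[None, :, :] has shape (1, m, d).
diff = X[:, None, :] - W[None, :, :]   # broadcasts to (n, m, d)
sq_dists = np.sum(diff ** 2, axis=2)   # (n, m) squared Euclidean distances
dists = np.sqrt(sq_dists)              # (n, m) plain Euclidean distances

print(sq_dists.shape)  # (2, 3)
print(sq_dists)
```

Note that the intermediate `diff` array holds n * m * d values, which can get large for big inputs.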

Edit by OP
In Theano we can make X and W both 3-D tensors and mark the corresponding axes as broadcastable, like this:

x = tensor.TensorType("float64", [False, True, False])()  # shape (n, 1, d)
w = tensor.TensorType("float64", [True, False, False])()  # shape (1, m, d)

z = tensor.sum((x-w)**2, axis=2)  # difference broadcasts to (n, m, d); sum over d

fn = theano.function([x, w], z)
print(fn([[[0,1,2]], [[1,2,3]]], [[[1,1,1], [2,2,2]]]))
# [[ 2.  5.]
#  [ 5.  2.]]
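The same broadcast pattern can be checked in NumPy, where a length-1 axis plays the role of a broadcastable axis in Theano. This sketch reuses the inputs from the Theano call above. The second half assumes you start from ordinary 2-D arrays and add the singleton axes by indexing. In Theano itself, something like x.dimshuffle(0, 'x', 1) on a 2-D variable should insert the broadcastable axis in the same way, though that variant was not run here.

```python
import numpy as np

# Same inputs as the Theano call above.
x = np.array([[[0, 1, 2]], [[1, 2, 3]]], dtype="float64")  # shape (n=2, 1, d=3)
w = np.array([[[1, 1, 1], [2, 2, 2]]], dtype="float64")    # shape (1, m=2, d=3)

# The length-1 axes broadcast against each other, giving an (n, m, d) difference.
z = np.sum((x - w) ** 2, axis=2)                           # shape (n, m)
print(z)
# [[2. 5.]
#  [5. 2.]]

# Starting from plain 2-D inputs, add the singleton axes by indexing instead.
X2 = x[:, 0, :]   # (n, d)
W2 = w[0, :, :]   # (m, d)
z2 = np.sum((X2[:, None, :] - W2[None, :, :]) ** 2, axis=2)
```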

Answer 2:

Luckily, the number of rows in W is known in advance, so for now I'm doing this:

x = tensor.TensorType("float64", [False, False])()
m = 2
w = tensor.as_tensor([[2,2,2],[1,2,3]])
res_list = []
for i in range(m):
    # squared distance from every row of x to row i of w
    res_list.append(tensor.sum((x-w[i,:])**2, axis=1))

z = tensor.stack(res_list)  # shape (m, n): one row per row of w

fn = theano.function([x], z)
print(fn([[1,2,3], [2,2,2], [2,3,4]]))

# [[ 2.  0.  5.]
#  [ 0.  2.  3.]]
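If the per-row loop or the 3-D intermediate becomes a bottleneck, one alternative (a different technique from the answers above, shown here in NumPy rather than Theano) is to expand the square: ||x - w||^2 = ||x||^2 + ||w||^2 - 2 x.w. This needs only a single matrix product. The helper name `sq_dists_via_gram` is hypothetical, and the sketch assumes row-wise points as in the question.

```python
import numpy as np


def sq_dists_via_gram(X, W):
    """Hypothetical helper: (n, m) squared distances via ||x||^2 + ||w||^2 - 2 x.w."""
    x_sq = np.sum(X ** 2, axis=1)[:, None]   # (n, 1) squared row norms of X
    w_sq = np.sum(W ** 2, axis=1)[None, :]   # (1, m) squared row norms of W
    cross = np.dot(X, W.T)                   # (n, m) inner products
    # Rounding can leave tiny negative values, so clip them to zero.
    return np.maximum(x_sq + w_sq - 2.0 * cross, 0.0)


# Same data as the loop version above.
X = np.array([[1.0, 2.0, 3.0], [2.0, 2.0, 2.0], [2.0, 3.0, 4.0]])
W = np.array([[2.0, 2.0, 2.0], [1.0, 2.0, 3.0]])

D = sq_dists_via_gram(X, W)   # shape (n, m) = (3, 2)
print(D.T)                    # transposed to the (m, n) layout printed above
```

The trade-off is numerical: subtracting nearly equal large numbers loses precision, so the direct difference is safer when distances are very small relative to the norms.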