Theano broadcasting different to numpy's

Submitted by 夏天 on 2019-12-21 09:09:02

Question


Consider the following example of numpy broadcasting:

import numpy as np
import theano
from theano import tensor as T

xval = np.array([[1, 2, 3], [4, 5, 6]])
bval = np.array([[10, 20, 30]])
print(xval + bval)

As expected, the single-row array bval is added to each row of the matrix xval, and the output is:

[[11 22 33]
 [14 25 36]]

Trying to replicate the same behaviour in the git version of theano:

x = T.dmatrix('x')
b = theano.shared(bval)
z = x + b
f = theano.function([x], z)

print(f(xval))

I get the following error:

ValueError: Input dimension mis-match. (input[0].shape[0] = 2, input[1].shape[0] = 1)
Apply node that caused the error: Elemwise{add,no_inplace}(x, <TensorType(int64, matrix)>)
Inputs types: [TensorType(float64, matrix), TensorType(int64, matrix)]
Inputs shapes: [(2, 3), (1, 3)]
Inputs strides: [(24, 8), (24, 8)]
Inputs scalar values: ['not scalar', 'not scalar']

I understand Tensor objects such as x have a broadcastable attribute, but I can't find a way to 1) set this correctly for the shared object or 2) have it correctly inferred. How can I re-implement numpy's behaviour in theano?


Answer 1:


Theano needs every broadcastable dimension to be declared in the graph before compilation, whereas NumPy decides from the shape information available at run time.

By default, no dimension of a shared variable is broadcastable, because the variable's shape could change later.

To create the shared variable with the broadcastable dimension that you need in your example:

b = theano.shared(bval, broadcastable=(True,False))

I'll add this information to the documentation.
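To make the difference concrete, here is a minimal sketch in plain NumPy of the idea described above. It is a toy model, not Theano's actual implementation: the function name static_add_shape and its arguments are hypothetical and exist only for illustration. It allows a length-1 axis to stretch only when that axis was declared broadcastable up front, which is roughly why the default pattern fails for shapes (2, 3) and (1, 3) while broadcastable=(True, False) succeeds.

```python
import numpy as np


def static_add_shape(pattern_a, pattern_b, shape_a, shape_b):
    """Toy model only, not Theano's actual implementation.

    Return the output shape of an elementwise add when broadcasting is
    decided by patterns declared up front rather than by runtime shapes.
    """
    out = []
    for axis, (bc_a, bc_b, n_a, n_b) in enumerate(
            zip(pattern_a, pattern_b, shape_a, shape_b)):
        if n_a == n_b:
            # Equal lengths never need broadcasting.
            out.append(n_a)
        elif bc_b and n_b == 1:
            # b was declared broadcastable on this axis, so it may stretch.
            out.append(n_a)
        elif bc_a and n_a == 1:
            # a was declared broadcastable on this axis, so it may stretch.
            out.append(n_b)
        else:
            # A length-1 axis that was not declared broadcastable is rejected.
            raise ValueError(
                f"dimension mismatch on axis {axis}: {n_a} vs {n_b}")
    return tuple(out)


xval = np.array([[1, 2, 3], [4, 5, 6]])
bval = np.array([[10, 20, 30]])

# Declared pattern (True, False): the length-1 axis of bval may stretch.
print(static_add_shape((False, False), (True, False),
                       xval.shape, bval.shape))  # prints (2, 3)

# Default pattern (False, False): the same shapes are rejected.
try:
    static_add_shape((False, False), (False, False), xval.shape, bval.shape)
except ValueError as err:
    print("rejected:", err)
```

NumPy itself performs the equivalent check at run time on the actual shapes, so xval + bval needs no declaration there.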



Source: https://stackoverflow.com/questions/26574293/theano-broadcasting-different-to-numpys
