Confused about keras Dot Layer. How is the Dot product computed?

a 夏天 提交于 2021-01-27 10:43:20

问题


I read all posts about the Dot Layer but none explains how this and so the output shape is computed! It seems so standard though! How exactly are the values computed with a along a specific axis?

val = np.random.randint(2, size=(2, 3, 4))
a = K.variable(value=val)
val2 = np.random.randint(2, size=(2, 2, 3))
b = K.variable(value=val)
print("a")
print(val)
print("b")
print(val2)
out = Dot(axes = 2)([a,b])
print(out.shape)
print("DOT")
print(K.eval(out))

I get:

a
[[[0 1 1 1]
  [1 1 0 0]
  [0 0 1 1]]

 [[1 1 1 0]
  [0 0 1 0]
  [0 1 0 0]]]
b
[[[1 0 1]
  [1 0 1]]

 [[1 0 1]
  [1 1 0]]]
(2, 3, 3)
DOT
[[[ 3.  1.  2.]
  [ 1.  2.  0.]
  [ 2.  0.  2.]]

 [[ 3.  1.  1.]
  [ 1.  1.  0.]
  [ 1.  0.  1.]]]

I cannot understand with my mathematical and algebraic matrix know-how how the heck this is computed?


回答1:


Here's how the Dot product works. Internally it is calling K.batch_dot.

First, I think you might have intended to do,

val = np.random.randint(2, size=(2, 3, 4))
a = K.variable(value=val)
val2 = np.random.randint(2, size=(2, 2, 3))
b = K.variable(value=val2) # You have val here

But fortunately, you had (or could have been your initial intention too. Anyway just pointing out)

b = K.variable(value=val)

If you had the intended code, it will throw an error because the dimension you want the dot product on, doesn't match. Moving on,

How dot product is computed

You have

a.shape = (2,3,4)
b.shape = (2,3,4)

First you are only performing element-wise dot over the batch dimension. So that dimension stays that way.

Now you can ignore the first dimension of both a and b and consider the dot product between two matrices (3,4) and (3,4) and do the dot product over the last axis, which results in a (3,3) matrix. Now add the batch dimension you get a,

(2, 3, 3) tensor

Let's now take your example. You got,

a
[[[0 1 1 1]
  [1 1 0 0]
  [0 0 1 1]]

 [[1 1 1 0]
  [0 0 1 0]
  [0 1 0 0]]]

b
[[[0 1 1 1]
  [1 1 0 0]
  [0 0 1 1]]

 [[1 1 1 0]
  [0 0 1 0]
  [0 1 0 0]]]

Then you do the following two dot products.

# 1st sample
[0 1 1 1] . [0 1 1 1]
[1 1 0 0] . [1 1 0 0]
[0 0 1 1] . [0 0 1 1]

# 2nd sample
[1 1 1 0] . [1 1 1 0]
[0 0 1 0] . [0 0 1 0]
[0 1 0 0] . [0 1 0 0]

This gives,

# 1st sample
[3 1 2]
[1 2 0]
[2 0 2]

# 2nd sample
[ 3 1 1]
[ 1 1 0]
[ 1 0 1]

Finally by adding the missing batch dimension you get,

[[[ 3.  1.  2.]
  [ 1.  2.  0.]
  [ 2.  0.  2.]]

 [[ 3.  1.  1.]
  [ 1.  1.  0.]
  [ 1.  0.  1.]]]


来源:https://stackoverflow.com/questions/59502733/confused-about-keras-dot-layer-how-is-the-dot-product-computed

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!