Additive Attention

Introduced by Bahdanau et al. in "Neural Machine Translation by Jointly Learning to Align and Translate".

Additive Attention, also known as Bahdanau Attention, uses a feed-forward network with one hidden layer to calculate the attention alignment score:

$$f_{att}(h_i, s_j) = v_a^\top \tanh(W_a [h_i; s_j])$$

where v_a and W_a are learned attention parameters and [h_i; s_j] denotes the concatenation of the two vectors. Here h_i is a hidden state of the encoder and s_j is a hidden state of the decoder. The function above is thus a type of alignment score function. We can use a matrix of alignment scores to show the correlation between source and target words, as the accompanying figure shows.
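To make the score function concrete, here is a minimal sketch in plain Python. It assumes the score takes the commonly cited form v_a^T tanh(W_a [h_i; s_j]), with [h_i; s_j] meaning concatenation. The function names, the toy dimensions, and the randomly initialised (untrained) parameters are all hypothetical and only illustrate how a matrix of alignment scores could be built.

```python
import math
import random


def additive_score(h_i, s_j, W_a, v_a):
    """Score one encoder state h_i against one decoder state s_j.

    Computes v_a^T tanh(W_a [h_i; s_j]) using plain Python lists.
    """
    x = h_i + s_j  # list concatenation stands in for [h_i; s_j]
    # Hidden layer: one tanh unit per row of W_a.
    hidden = [math.tanh(sum(w * xk for w, xk in zip(row, x))) for row in W_a]
    # Output layer: dot product with v_a yields a single scalar score.
    return sum(v * u for v, u in zip(v_a, hidden))


def score_matrix(H, S, W_a, v_a):
    """Return scores[j][i] = score(h_i, s_j) for every target/source pair."""
    return [[additive_score(h, s, W_a, v_a) for h in H] for s in S]


# Hypothetical toy sizes; the parameters are random, not trained.
rng = random.Random(0)
enc_dim, dec_dim, attn_dim = 4, 3, 5
W_a = [[rng.uniform(-1, 1) for _ in range(enc_dim + dec_dim)]
       for _ in range(attn_dim)]
v_a = [rng.uniform(-1, 1) for _ in range(attn_dim)]
H = [[rng.uniform(-1, 1) for _ in range(enc_dim)] for _ in range(6)]  # 6 source positions
S = [[rng.uniform(-1, 1) for _ in range(dec_dim)] for _ in range(2)]  # 2 target positions

scores = score_matrix(H, S, W_a, v_a)  # 2 rows (targets) x 6 columns (sources)
```

Each row of `scores` corresponds to one target position and each column to one source position, which is the layout typically used when visualising alignments as a heat map.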

Within a neural network, once we have the alignment scores, we calculate the final attention weights by applying a softmax function to them, which ensures that the weights sum to 1.
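The normalisation step can be sketched as follows. This is an illustrative example only: the score values are made up, and the helper names `softmax` and `attention_weights` are hypothetical. It assumes the softmax is taken over the source positions for each target position.

```python
import math


def softmax(scores):
    """Turn a list of raw scores into non-negative weights that sum to 1."""
    m = max(scores)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(x - m) for x in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(score_rows):
    """Apply softmax to each row (one row per target position)."""
    return [softmax(row) for row in score_rows]


# Made-up alignment scores: 2 target positions x 3 source positions.
example_scores = [
    [2.0, 1.0, 0.1],
    [0.0, 0.0, 0.0],
]
weights = attention_weights(example_scores)
```

Higher scores receive larger weights, and a row of equal scores yields a uniform distribution over the source positions.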

Source: oschina

Link: https://my.oschina.net/u/4228078/blog/4523165
