In the PyTorch Seq2Seq sample, attention is calculated with a linear layer followed by a softmax.
embedded = self.embedding(input).view(1, 1, -1)
embedded = self.dropout(embedded)
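For context, here is a minimal sketch of that attention step, assuming a decoder along the lines of the tutorial's AttnDecoderRNN: the concatenated embedding and hidden state go through a linear layer, a softmax turns the scores into attention weights, and a batched matrix multiply applies them to the encoder outputs. The class name AttnDecoderSketch and the hyperparameters (hidden_size, output_size, max_length, dropout_p) are placeholders, not names from the tutorial itself.

import torch
import torch.nn as nn
import torch.nn.functional as F

class AttnDecoderSketch(nn.Module):
    def __init__(self, hidden_size, output_size, max_length, dropout_p=0.1):
        super().__init__()
        self.embedding = nn.Embedding(output_size, hidden_size)
        self.dropout = nn.Dropout(dropout_p)
        # Linear layer that scores each encoder position from the
        # concatenated (embedded input, decoder hidden state) pair.
        self.attn = nn.Linear(hidden_size * 2, max_length)

    def attention(self, input, hidden, encoder_outputs):
        embedded = self.embedding(input).view(1, 1, -1)
        embedded = self.dropout(embedded)
        # Linear layer followed by softmax -> attention weights over
        # the encoder time steps.
        attn_weights = F.softmax(
            self.attn(torch.cat((embedded[0], hidden[0]), 1)), dim=1)
        # Weighted sum of encoder outputs: the context vector fed to
        # the rest of the decoder.
        attn_applied = torch.bmm(attn_weights.unsqueeze(0),
                                 encoder_outputs.unsqueeze(0))
        return attn_weights, attn_applied

So the "attention" here is not a learned dot-product score; it is a feed-forward layer whose output size is the maximum source length, normalized by softmax.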