What are the inputs to the first decoder layer in a Transformer model?

无人共我 2021-02-05 14:07

I am trying to wrap my head around how the Transformer architecture works. I think I have a decent top-level understanding of the encoder part, sort of how the Key, Query, and V
