What is a projection layer in the context of neural networks?

暖寄归人 2021-01-30 02:14

I am currently trying to understand the architecture behind the word2vec neural net learning algorithm, for representing words as vectors based on their context.

3 answers
  •  无人共我
    2021-01-30 03:09

    I find the previous answers here a bit overcomplicated. A projection layer is just a simple matrix multiplication or, in NN terms, a regular/dense/linear layer without the non-linear activation (sigmoid/tanh/ReLU/etc.) at the end. The idea is to project a discrete vector of, e.g., 100K dimensions into a continuous vector of 600 dimensions (I chose these numbers arbitrarily; your mileage may vary). The exact matrix parameters are learned during training.
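To make the "just a matrix multiplication" point concrete, here is a minimal sketch in plain Python. The sizes are tiny stand-ins for the 100K and 600 figures above, the weights are random rather than trained, and the names `project`, `one_hot`, and `W` are made up for illustration. It also assumes one row of the matrix per word, which is only one of the two possible conventions.

```python
import random

VOCAB_SIZE = 10  # stand-in for e.g. 100K words
EMBED_DIM = 4    # stand-in for e.g. 600 dimensions

random.seed(0)
# Projection matrix with one row per word (an assumed convention).
# Random here; in a real model these values are learned during training.
W = [[random.gauss(0.0, 1.0) for _ in range(EMBED_DIM)]
     for _ in range(VOCAB_SIZE)]


def one_hot(index, size):
    """Discrete input: 1.0 at the word's index, 0.0 everywhere else."""
    vec = [0.0] * size
    vec[index] = 1.0
    return vec


def project(x, weights):
    """Compute x @ weights: a linear map with no bias and no activation."""
    n_out = len(weights[0])
    return [sum(x[i] * weights[i][j] for i in range(len(x)))
            for j in range(n_out)]


dense = project(one_hot(3, VOCAB_SIZE), W)
print(len(dense))  # 4
```

There is nothing after the product: no sigmoid, tanh, or ReLU. That absence is what separates a projection layer from an ordinary hidden layer.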

    What happens before or after the projection depends on the model and context, and is not what the OP asks.

    (In practice you wouldn't even bother with the matrix multiplication, since you are multiplying by a one-hot vector that has a 1 at the word's index and 0s everywhere else. You would instead treat the trained matrix as a lookup table: the 6257th word in the vocabulary corresponds to the 6257th row or column, depending on how you define it, of the projection matrix.)
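The equivalence between the one-hot multiplication and a table lookup can be checked directly. The sketch below is again plain Python with random, untrained weights. The sizes and the helper names `project_by_matmul` and `project_by_lookup` are assumptions for illustration, and the matrix is assumed to hold one row per word.

```python
import math
import random

VOCAB_SIZE = 7000  # hypothetical vocabulary size, large enough to hold word 6257
EMBED_DIM = 5      # hypothetical embedding size

random.seed(1)
# Stand-in for a trained projection matrix, one row per word.
W = [[random.gauss(0.0, 1.0) for _ in range(EMBED_DIM)]
     for _ in range(VOCAB_SIZE)]


def project_by_matmul(index, weights):
    """Build the one-hot vector explicitly and multiply it by the matrix."""
    x = [0.0] * len(weights)
    x[index] = 1.0
    n_out = len(weights[0])
    return [sum(x[i] * weights[i][j] for i in range(len(x)))
            for j in range(n_out)]


def project_by_lookup(index, weights):
    """Skip the multiplication and read the row for that word."""
    return weights[index]


word_index = 6256  # zero-based position of the 6257th word

via_matmul = project_by_matmul(word_index, W)
via_lookup = project_by_lookup(word_index, W)

same = all(math.isclose(a, b) for a, b in zip(via_matmul, via_lookup))
print(same)
```

The multiplication touches every row of the matrix only to discard all but one of them, while the lookup reads a single row. That is why embedding layers are usually implemented as indexing.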
