I am reading through the residual learning paper, and I have a question. What is the "linear projection" mentioned in Section 3.2? It probably looks simple once you get it, but I could not figure it out.
A linear projection is one where each new feature is simply a weighted sum of the original features. As in the paper, this can be represented by matrix multiplication. If x is the vector of N input features and W is an M-by-N matrix, then the matrix product Wx yields M new features, each of which is a linear projection of x. Each row of W is a set of weights that defines one of the M linear projections (i.e., each row of W contains the coefficients for one of the weighted sums of x).
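A minimal NumPy sketch of this, with hypothetical sizes N = 3 and M = 2 (in the paper, such a projection W_s is applied on the shortcut to match dimensions):

```python
import numpy as np

N, M = 3, 2  # hypothetical sizes: N input features, M output features
rng = np.random.default_rng(0)

W = rng.standard_normal((M, N))  # M-by-N projection matrix
x = rng.standard_normal(N)       # input feature vector

y = W @ x  # matrix product: M new features

# Each output feature is the weighted sum defined by one row of W
for i in range(M):
    assert np.isclose(y[i], np.dot(W[i], x))

print(y.shape)  # (2,) -- N features projected down to M
```

Note that each output y[i] is just the dot product of row i of W with x, which is exactly the "weighted sum" described above.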