NumPy: zero-mean data and standardization

既然无缘 2021-02-19 19:08

I saw in a tutorial (with no further explanation) that we can shift data to zero mean with x -= np.mean(x, axis=0) and normalize it with x /= np.std(x

4 answers
  •  时光取名叫无心
    2021-02-19 20:03

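    The key here is the augmented assignment operators, which operate on the original variable. a += c gives the same values as a = a + c; for NumPy arrays, the augmented form updates the existing array in place instead of creating a new one.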

    So indeed a (in your case x) has to be defined beforehand.
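    To make the difference between the in-place form and plain reassignment concrete, here is a minimal sketch. The names alias, in_place_shared and rebinding_made_copy are made up for illustration and do not come from the tutorial.

```python
import numpy as np

x = np.array([[1.0, 2.0], [3.0, 4.0]])
alias = x  # a second name bound to the same array object

# Augmented assignment: the existing array is modified in place,
# so the change is visible through every name bound to it.
x -= 1.0
in_place_shared = alias is x and alias[0, 0] == 0.0

# Plain assignment: the right-hand side builds a new array and x is
# rebound to it; alias still refers to the old, unchanged array.
x = x - 1.0
rebinding_made_copy = alias is not x and alias[0, 0] == 0.0 and x[0, 0] == -1.0
```

    One practical consequence: because the result is written back into the existing array, x should hold floats. With an integer array, x -= np.mean(x, axis=0) is expected to fail with a casting error, so converting with x.astype(float) first is a common precaution.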

    np.mean and np.std each take an array or iterable (x) as input and return a single value, or an array when an axis is given for a multidimensional input. That result is what gets used in your assignment operations.
    With axis=0 the mean or std is computed down the rows: for each column, the values from all rows are combined, which gives one result per column. axis=1 would instead combine the values of all columns within each row, giving one result per row.
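    A small sketch of what the axis argument does on a 2-D array. The array values are arbitrary and chosen only so the results are easy to check by hand.

```python
import numpy as np

x = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])  # 2 rows, 3 columns

col_means = np.mean(x, axis=0)  # one value per column, shape (3,)
row_means = np.mean(x, axis=1)  # one value per row, shape (2,)
col_stds = np.std(x, axis=0)    # per-column standard deviation, shape (3,)
```

    Because col_means has one entry per column, NumPy broadcasting subtracts it from every row when you write x -= col_means.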

    With these two operations, you first subtract the mean, so that each column now has a mean of 0. Then, when you divide by the std, you rescale the spread of the data around zero so that each column has a standard deviation of 1. Most values then lie within a few units of 0, but they are not confined to a [-1, +1] interval.

    So now, each of your column values is centered around zero and standardized.
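    The two steps can be checked numerically with a short sketch. The random test data is made up, and since the second expression in the question is cut off, using axis=0 for np.std here is an assumption that mirrors the np.mean call.

```python
import numpy as np

# Hypothetical data: 1000 samples, 4 features, far from zero mean / unit std.
rng = np.random.default_rng(0)
x = rng.normal(loc=10.0, scale=3.0, size=(1000, 4))

x -= np.mean(x, axis=0)  # centre every column on 0
x /= np.std(x, axis=0)   # scale every column to unit standard deviation (axis=0 assumed)

col_means = np.mean(x, axis=0)  # close to 0 for every column
col_stds = np.std(x, axis=0)    # close to 1 for every column

# Share of values whose magnitude exceeds 1: standardized data is not
# bounded by [-1, +1]. For normally distributed data this is about a third.
fraction_outside = np.mean(np.abs(x) > 1.0)
```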

    There are other scaling techniques, such as min-max scaling: subtracting the minimum (or maximum) value and dividing by the range of the values.
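    A minimal sketch of that min-max variant, again per column. The variable names are illustrative, and the code assumes no column is constant, since a constant column would have a range of 0.

```python
import numpy as np

x = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [4.0, 40.0]])

col_min = x.min(axis=0)
col_range = x.max(axis=0) - col_min  # assumed non-zero for every column
scaled = (x - col_min) / col_range   # each column now spans 0 to 1
```

    Unlike standardization, this does bound the values to a fixed interval, at the cost of being sensitive to a single extreme value in a column.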
