Question
Consider the following code:
import numpy as np
import tensorflow as tf
A = np.array([[.8, .6], [.1, 0]])
B1 = tf.keras.utils.normalize(A, axis=0, order=1)
B2 = tf.keras.utils.normalize(A, axis=0, order=2)
print('A:')
print(A)
print('B1:')
print(B1)
print('B2:')
print(B2)
which returns
A:
[[0.8 0.6]
[0.1 0. ]]
B1:
[[0.88888889 1. ]
[0.11111111 0. ]]
B2:
[[0.99227788 1. ]
[0.12403473 0. ]]
I understand how B1 is computed via order=1: each entry in A is divided by the sum of the absolute values of the elements in its column. For example, 0.8 becomes 0.8/(0.8+0.1) = 0.888. However, I just can't figure out how order=2 produces B2, nor can I find any documentation about it.
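For reference, here is a quick manual check of the order=1 result, a minimal NumPy sketch (not part of the original question):
import numpy as np

A = np.array([[.8, .6], [.1, 0]])
# order=1: divide each entry by the sum of absolute values in its column
col_l1 = np.abs(A).sum(axis=0)   # [0.9, 0.6]
print(A / col_l1)                # matches B1 above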
Answer 1:
However, I just can't figure out how order=2 produces B2 nor can I find any documentation about it.
order=1 means the L1 norm while order=2 means the L2 norm. For the L2 norm, you take the square root after summing the individual squares; which elements get squared depends on the axis.
Keras
import numpy as np
import tensorflow as tf

A = np.array([[.8, .6], [.1, 0]])
B2 = tf.keras.utils.normalize(A, axis=0, order=2)
print(B2)
[[0.99227788 1.        ]
 [0.12403473 0.        ]]
Manual
import numpy as np

B2_manual = np.zeros((2, 2))
# Column 0: divide each entry by that column's L2 norm, sqrt(0.8^2 + 0.1^2)
B2_manual[0][0] = 0.8 / np.sqrt(0.8 ** 2 + 0.1 ** 2)
B2_manual[1][0] = 0.1 / np.sqrt(0.8 ** 2 + 0.1 ** 2)
# Column 1: divide each entry by that column's L2 norm, sqrt(0.6^2 + 0^2)
B2_manual[0][1] = 0.6 / np.sqrt(0.6 ** 2 + 0 ** 2)
B2_manual[1][1] = 0.0 / np.sqrt(0.6 ** 2 + 0 ** 2)
print(B2_manual)
[[0.99227788 1.        ]
 [0.12403473 0.        ]]
You can look up the different types of norm here: https://en.wikipedia.org/wiki/Norm_(mathematics). Worked examples: https://docs.scipy.org/doc/numpy/reference/generated/numpy.linalg.norm.html
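The same computation can also be vectorized with np.linalg.norm, a minimal sketch of the equivalent one-liner (the name B2_vectorized is introduced here for illustration):
import numpy as np

A = np.array([[.8, .6], [.1, 0]])
# L2 norm of each column; broadcasting then divides every column by its norm
B2_vectorized = A / np.linalg.norm(A, ord=2, axis=0)
print(B2_vectorized)   # same values as B2 above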
Answer 2:
With order=1, the input is normalized such that the sum of the absolute values of its elements (along the given axis) is 1, i.e. the L1 norm of the input equals 1. With order=2, the input is normalized such that the sum of the squared values of its elements is 1, i.e. the L2 norm of the input equals 1.
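A quick check of that property on the arrays from the question, a minimal sketch assuming the same A as above:
import numpy as np
import tensorflow as tf

A = np.array([[.8, .6], [.1, 0]])
B1 = tf.keras.utils.normalize(A, axis=0, order=1)
B2 = tf.keras.utils.normalize(A, axis=0, order=2)

# Each column of B1 has L1 norm 1; each column of B2 has L2 norm 1
print(np.abs(B1).sum(axis=0))    # [1. 1.]
print((B2 ** 2).sum(axis=0))     # [1. 1.]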
Answer 3:
Passing 2 in the order parameter means you will be applying Tikhonov regularization, commonly known as L2 or Ridge. L1 and L2 are different regularization techniques, both with pros and cons, which you can read about in detail here on Wikipedia and here on Kaggle. The approach for L2 is to solve the standard regression equation while adding an extra term λβᵀβ to the residual sum of squares; βᵀβ is the squared L2 norm of the coefficient vector β (that's why it is called L2, because of the square).
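Written out in full, the ridge objective this answer refers to is the standard formulation (not quoted verbatim from the answer):

\hat{\beta}_{\text{ridge}} = \arg\min_{\beta} \; (y - X\beta)^{\mathsf T}(y - X\beta) + \lambda\, \beta^{\mathsf T}\beta

where (y - X\beta)^{\mathsf T}(y - X\beta) is the residual sum of squares and \lambda\, \beta^{\mathsf T}\beta is the L2 penalty.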
Source: https://stackoverflow.com/questions/58174470/what-does-the-order-argument-mean-in-tf-keras-utils-normalize