What does the `order` argument mean in `tf.keras.utils.normalize()`?

Posted by 风流意气都作罢 on 2021-01-29 03:50:34

Question


Consider the following code:

import numpy as np
import tensorflow as tf

A = np.array([[.8, .6], [.1, 0]])
B1 = tf.keras.utils.normalize(A, axis=0, order=1)
B2 = tf.keras.utils.normalize(A, axis=0, order=2)

print('A:')
print(A)
print('B1:')
print(B1)
print('B2:')
print(B2)

which returns

A:
[[0.8 0.6]
 [0.1 0. ]]
B1:
[[0.88888889 1.        ]
 [0.11111111 0.        ]]
B2:
[[0.99227788 1.        ]
 [0.12403473 0.        ]]

I understand how B1 is computed with order=1: each entry in A is divided by the sum of the absolute values of the elements in its column. For example, 0.8 becomes 0.8/(0.8+0.1) ≈ 0.889. However, I just can't figure out how order=2 produces B2 nor can I find any documentation about it.


Answer 1:


"However, I just can't figure out how order=2 produces B2 nor can I find any documentation about it."

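order=1 means the L1 norm, while order=2 means the L2 norm. For the L2 norm, you sum the squares of the individual elements and then take the square root; each element is then divided by that value. Which elements are grouped together depends on the axis (here axis=0, so each column is normalized separately).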

Keras

import numpy as np
import tensorflow as tf
A = np.array([[.8, .6], [.1, 0]])
B2 = tf.keras.utils.normalize(A, axis=0, order=2)
print(B2)

array([[0.99227788, 1.        ],
       [0.12403473, 0.        ]])

Manual

B2_manual = np.zeros((2,2))
B2_manual[0][0] = 0.8/np.sqrt(0.8 ** 2 + 0.1 ** 2)
B2_manual[1][0] = 0.1/np.sqrt(0.8 ** 2 + 0.1 ** 2)
B2_manual[0][1] = 0.6/np.sqrt(0.6 ** 2 + 0 ** 2)
B2_manual[1][1] =  0 /np.sqrt(0.6 ** 2 + 0 ** 2)
print(B2_manual)

array([[0.99227788, 1.        ],
       [0.12403473, 0.        ]])

You can look up the different types of norm here: https://en.wikipedia.org/wiki/Norm_(mathematics)
Worked examples: https://docs.scipy.org/doc/numpy/reference/generated/numpy.linalg.norm.html
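The element-by-element calculation above can also be written in vectorized form with `np.linalg.norm`. The following is a minimal sketch, not the Keras source: the helper name `normalize_np` is made up for illustration, and replacing zero norms with 1 is an assumption about how all-zero slices should be handled.

```python
import numpy as np

def normalize_np(x, axis=-1, order=2):
    """Hypothetical NumPy-only helper: divide x by its norm along `axis`."""
    # keepdims=True keeps the reduced axis so the division broadcasts correctly
    norms = np.linalg.norm(x, ord=order, axis=axis, keepdims=True)
    # Assumption: leave all-zero slices unchanged instead of dividing by zero
    norms[norms == 0] = 1.0
    return x / norms

A = np.array([[.8, .6], [.1, 0]])
B1 = normalize_np(A, axis=0, order=1)  # each column divided by its L1 norm
B2 = normalize_np(A, axis=0, order=2)  # each column divided by its L2 norm
print(B1)
print(B2)
```

For the matrix A from the question, this should reproduce the B1 and B2 values shown there.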




Answer 2:


Order 1 normalizes the input so that the absolute values of the elements along the given axis sum to 1 (the L1 norm along that axis equals 1). Order 2 normalizes the input so that the squared values of the elements along the given axis sum to 1 (the L2 norm along that axis equals 1).
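This property is easy to check numerically. The sketch below uses plain NumPy rather than the Keras function and reuses the matrix A from the question, with axis=0 (column-wise) as in the original code; the variable names are illustrative.

```python
import numpy as np

A = np.array([[.8, .6], [.1, 0]])

# Normalize each column (axis=0) by its L1 norm and by its L2 norm
B1 = A / np.abs(A).sum(axis=0, keepdims=True)
B2 = A / np.sqrt((A ** 2).sum(axis=0, keepdims=True))

# After order=1, the absolute values in each column sum to 1
l1_sums = np.abs(B1).sum(axis=0)
# After order=2, the squared values in each column sum to 1
l2_sums = (B2 ** 2).sum(axis=0)
print(l1_sums)
print(l2_sums)
```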




Answer 3:


Passing 2 as the order parameter selects the L2 norm, the same norm that gives Tikhonov regularization (commonly known as L2 or ridge) its name. Note that tf.keras.utils.normalize only rescales the input by that norm; it does not apply any regularization. L1 and L2 regularization are different techniques, each with pros and cons that are covered in detail on Wikipedia and Kaggle. The approach for L2 is to solve the standard regression problem but, when calculating the residual sum of squares, add an extra term λβᵀβ, which is the squared L2 norm of the coefficient vector β (that's why it is called L2, because of the square).
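To make the penalty term λβᵀβ concrete, here is a minimal sketch of the ridge objective in NumPy. The function name `ridge_loss` and the toy data are invented for illustration. This computes a regression loss and is a separate operation from the rescaling that `tf.keras.utils.normalize` performs.

```python
import numpy as np

def ridge_loss(X, y, beta, lam):
    """Hypothetical helper: residual sum of squares plus the L2 penalty."""
    residual = y - X @ beta
    # residual @ residual is the RSS; beta @ beta is the squared L2 norm of beta
    return residual @ residual + lam * (beta @ beta)

# Toy data chosen so that beta fits y exactly (the RSS is zero)
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([1.0, 2.0, 3.0])
beta = np.array([1.0, 2.0])

loss_plain = ridge_loss(X, y, beta, lam=0.0)  # no penalty
loss_ridge = ridge_loss(X, y, beta, lam=0.5)  # penalty = 0.5 * (1**2 + 2**2)
print(loss_plain, loss_ridge)
```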



Source: https://stackoverflow.com/questions/58174470/what-does-the-order-argument-mean-in-tf-keras-utils-normalize
