Normalizing feature values for SVM

情深已故 2021-01-18 03:03

I've been playing with some SVM implementations and I am wondering - what is the best way to normalize feature values to fit into one range? (from 0 to 1)

Let's su

2 Answers
  • 2021-01-18 03:42

    Besides the scaling-to-unit-length method described by Tim, standardization (z-score scaling) is the approach most often used in machine learning. Note that when your test data comes in, it makes more sense to scale it with the mean and standard deviation computed from your training samples. If you have a very large amount of training data, it is reasonably safe to assume the features are approximately normally distributed, so the chance that new test data falls far out of range won't be that high. Refer to this post for more details.
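    Here is a minimal sketch of that idea in Python, assuming scikit-learn's StandardScaler and SVC; the array names and sample values are placeholders, not from the original question.

        import numpy as np
        from sklearn.preprocessing import StandardScaler
        from sklearn.svm import SVC

        # Placeholder training/test data (three features per sample).
        X_train = np.array([[4.0, 0.02, 12.0],
                            [5.0, 0.03, 10.0],
                            [3.0, 0.01, 14.0],
                            [6.0, 0.04, 11.0]])
        y_train = np.array([0, 1, 0, 1])
        X_test = np.array([[400.0, 2.0, 1200.0]])

        scaler = StandardScaler()
        X_train_std = scaler.fit_transform(X_train)  # mean/std learned from training data only
        X_test_std = scaler.transform(X_test)        # reuse the training statistics on test data

        clf = SVC(kernel='rbf').fit(X_train_std, y_train)
        print(clf.predict(X_test_std))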

  • 2021-01-18 03:50

    You normalise a vector by converting it to a unit vector. This trains the SVM on the relative values of the features rather than their magnitudes. The normalisation works on vectors with any values.

    To convert to a unit vector, divide each value by the length of the vector. For example, a vector of [4 0.02 12] has a length of 12.6491. The normalised vector is then [4/12.6491 0.02/12.6491 12/12.6491] = [0.316 0.0016 0.949].

    If "in the wild" we encounter a vector of [400 2 1200] it will normalise to the same unit vector as above. The magnitudes of the features is "cancelled out" by the normalisation and we are left with relative values between 0 and 1.
