Question
I have a dataset which has 4 columns/attributes and 150 rows. I want to normalize this data using min-max normalization. So far, my code is:
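minData = min(min(data1));   % global minimum over the whole matrix
maxData = max(max(data1));   % global maximum over the whole matrix
minmaxeddata = (data1 - minData) ./ (maxData - minData);   % divide by the range so the maximum maps to 1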
Here, minData and maxData are the global minimum and maximum values. Therefore, this code applies min-max normalization over all values in the 2D matrix, so that the global minimum is 0 and the global maximum is 1.
However, I would like to perform the same operation on each column individually. Specifically, each column of the 2D matrix should be min-max normalized independently from the other columns.
I tried using just min(data1) and max(data1), but got an error saying that the matrix dimensions must agree.
However, by using the global minimum and maximum, I got values in the range [0, 1] and have run experiments using this normalized dataset. Is there any problem with my results? Is there a problem with my understanding as well? Any guidance would be appreciated.
Answer 1:
If I understand you correctly, you wish to normalize each column of data1. Also, because each column is an independent data set and most likely has a different dynamic range, a global min-max operation is probably not recommended. I would recommend going with your initial idea of normalizing each column individually.
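To illustrate why a global min-max can be a poor fit when columns have different ranges, here is a minimal sketch. The two-column matrix is made up for illustration and is not the asker's dataset.

```matlab
% Hypothetical data: column 1 spans 0..1, column 2 spans 0..1000.
data1 = [0 0; 0.5 500; 1 1000];

% Global min-max: one minimum and one maximum for the whole matrix.
globalMin = min(data1(:));
globalMax = max(data1(:));
globalNorm = (data1 - globalMin) ./ (globalMax - globalMin);

% Column 1 is squeezed into [0, 0.001], while column 2 fills [0, 1].
disp(globalNorm);
```

With the global scheme, the small-range column loses almost all of its spread relative to the large-range column, which is usually not what you want when the columns are separate attributes.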
As for your error, you can't subtract min(data1) from data1 because min(data1) produces a row vector while data1 is a matrix. You are subtracting a vector from a matrix, which is why you are getting that error.
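The size mismatch can be made concrete with a short sketch. It assumes a 150-by-4 matrix like the one described in the question; the random values are only a placeholder, and the names colMin and colMinTiled are illustrative.

```matlab
data1 = rand(150, 4);            % placeholder data, not the asker's dataset

colMin = min(data1);             % minimum of each column -> 1-by-4 row vector
disp(size(data1));               % 150 4
disp(size(colMin));              % 1 4

% Tiling the row vector down all 150 rows makes the sizes match explicitly.
colMinTiled = repmat(colMin, size(data1, 1), 1);   % 150-by-4
shifted = data1 - colMinTiled;   % every column now has a minimum of 0
```

This tiling is what broadcasting does for you, without building the intermediate 150-by-4 copy.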
If you want to achieve what you're asking, use bsxfun to broadcast the vector, repeating it for as many rows as data1 has. Therefore:
mindata = min(data1);
maxdata = max(data1);
minmaxdata = bsxfun(@rdivide, bsxfun(@minus, data1, mindata), maxdata - mindata);
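On MATLAB R2016b and later, arithmetic operators expand compatible sizes implicitly, so the same per-column normalization can be sketched without bsxfun. The names colMin and colMax and the small sample matrix are illustrative only, and the sketch assumes a release with implicit expansion.

```matlab
% Small sample matrix used only for illustration.
data1 = [5 9 9 9 3 3; 3 10 2 1 10 1; 2 4 4 6 5 5];

% Take the minimum and maximum along dimension 1 (down each column).
colMin = min(data1, [], 1);
colMax = max(data1, [], 1);

% Implicit expansion (R2016b+) stretches the row vectors across all rows.
minmaxdata = (data1 - colMin) ./ (colMax - colMin);
```

Passing the dimension explicitly keeps the result column-wise even when data1 has a single row. A column whose values are all equal has zero range and would come out as NaN, so it may need separate handling.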
Example
>> data1 = [5 9 9 9 3 3; 3 10 2 1 10 1; 2 4 4 6 5 5]
data1 =
5 9 9 9 3 3
3 10 2 1 10 1
2 4 4 6 5 5
When I run the above normalization code, I get:
minmaxdata =
1.0000 0.8333 1.0000 1.0000 0 0.5000
0.3333 1.0000 0 0 1.0000 0
0 0 0.2857 0.6250 0.2857 1.0000
Source: https://stackoverflow.com/questions/29404157/min-max-normalization-of-individual-columns-in-a-2d-matrix