Question
I have a vector with values between 1 and N > 1. Some values can occur multiple times consecutively. I want to collapse each run of consecutive repeated entries into a single entry and add a second column that counts how many entries were in the run, e.g.:
A = [1 2 1 1 3 2 4 4 1 1 1 2]'
would lead to:
B = [1 1;
2 1;
1 2;
3 1;
2 1;
4 2;
1 3;
2 1]
(As you can see, the second column contains the number of consecutive entries.)
I came across accumarray() in MATLAB recently, but I can't find a solution with it for this task, since it always considers the whole vector and not only consecutive entries.
Any ideas?
Answer 1:
This probably isn't the most readable or elegant way of doing it, but if you have large vectors and speed is an issue, this vectorisation may help...
A = [1 2 1 1 3 2 4 4 1 1 1 2];
First I'm going to pad A with a leading and trailing zero to capture the first and final transitions (this works because every value in A is at least 1, so the zero never equals its neighbour):
>> A = [0, A, 0];
The transition locations can be found where the difference between neighbouring values is not equal to zero:
>> locations = find(diff(A)~=0);
But because we padded the start of A with a zero, the first transition is nonsensical, so we only take the locations from 2:end. The values in A of these are the value of each segment:
>> first_column = A(locations(2:end))
first_column =
1 2 1 3 2 4 1 2
That's the first column. Now to find the count of each number, which is the difference between consecutive locations. This is where padding A at both ends becomes important:
>> second_column = diff(locations)
second_column =
1 1 2 1 1 2 3 1
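To see why the trailing zero matters, here is a small sketch of the intermediate values. The variable names v, padded, no_tail and counts_short are only illustrative and do not appear in the steps above:

```matlab
% Same data as above, under a different name so nothing in the session is overwritten
v = [1 2 1 1 3 2 4 4 1 1 1 2];

% With padding at both ends, every run has a transition before and after it
padded    = [0, v, 0];
locations = find(diff(padded) ~= 0);   % [1 2 3 5 6 7 9 12 13]
counts    = diff(locations);           % [1 1 2 1 1 2 3 1], one count per run

% Without the trailing zero, the end of the last run is never detected
no_tail      = find(diff([0, v]) ~= 0);   % [1 2 3 5 6 7 9 12]
counts_short = diff(no_tail);             % [1 1 2 1 1 2 3], last run is missing
```

With both zeros in place there are nine transitions for eight runs, so diff yields exactly eight counts.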
Finally combining:
>> B = [first_column', second_column']
B =
1 1
2 1
1 2
3 1
2 1
4 2
1 3
2 1
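One way to sanity-check a result like this is to expand B back into the original vector. The sketch below assumes repelem is available (to my knowledge it was added in R2015a); A_expected and A_rebuilt are illustrative names:

```matlab
% Result from above, typed in directly so the check is self-contained
B = [1 1; 2 1; 1 2; 3 1; 2 1; 4 2; 1 3; 2 1];

% Original (unpadded) data as a column vector
A_expected = [1 2 1 1 3 2 4 4 1 1 1 2]';

% Repeat each value in column 1 as many times as column 2 says
A_rebuilt = repelem(B(:, 1), B(:, 2));

% The counts should add up to the length of the original vector
total = sum(B(:, 2));

% True if the round trip reproduces the input
round_trip_ok = isequal(A_rebuilt(:), A_expected);
```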
This can all be combined into one less-readable line:
>> A = [1 2 1 1 3 2 4 4 1 1 1 2]';
>> B = [A(find(diff([A; 0]) ~= 0)), diff(find(diff([0; A; 0])))]
B =
1 1
2 1
1 2
3 1
2 1
4 2
1 3
2 1
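Since the question mentions accumarray(), here is a possible variant that uses it. This is a sketch rather than part of the approach above: it labels each run with an index and lets accumarray count the members of each label. It does not rely on zero padding, so it should also work if A contains zeros, but it assumes A is non-empty. The names is_start and run_id are illustrative:

```matlab
A = [1 2 1 1 3 2 4 4 1 1 1 2]';
A = A(:);                           % force a column vector

% Mark the first element of every run (the very first element always starts one)
is_start = [true; diff(A) ~= 0];

% Give each run a consecutive label: 1 1 ... 2 2 ... 3 ...
run_id = cumsum(is_start);

% Count how many elements carry each label
counts = accumarray(run_id, 1);

% Combine the run values with their lengths
B = [A(is_start), counts];
```

Because run_id only increases at transitions, accumarray groups consecutive entries rather than all equal values in the whole vector.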
Answer 2:
I don't see another way than looping through the data set, but it is rather straightforward. Maybe this is not the most elegant solution, but as far as I can see, it works fine.
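function B = accum_data_set(A)
    % Collapse runs of equal consecutive values into [value, count] rows.
    B = [];
    if isempty(A)
        return;   % nothing to accumulate
    end
    prev = A(1);   % value of the current run
    count = 1;     % length of the current run
    for i = 2:length(A)
        if prev == A(i)
            count = count + 1;
        else
            B = [B; prev count];   % run ended: store it and start a new one
            count = 1;
        end
        prev = A(i);
    end
    B = [B; prev count];   % store the final run
end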
Output:
>> A = [1 2 1 1 3 2 4 4 1 1 1 2]';
>> B = accum_data_set(A)
B =
1 1
2 1
1 2
3 1
2 1
4 2
1 3
2 1
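Growing B inside the loop reallocates it on every new run, which can become slow for long vectors. A possible variation, written here as a plain script rather than a function, preallocates for the worst case and trims at the end. The counter k is an illustrative name:

```matlab
A = [1 2 1 1 3 2 4 4 1 1 1 2]';

n = numel(A);
B = zeros(n, 2);   % worst case: every element is its own run
k = 0;             % number of runs stored so far

for i = 1:n
    if k > 0 && B(k, 1) == A(i)
        % Same value as the current run: extend it
        B(k, 2) = B(k, 2) + 1;
    else
        % New value: start a new run with a count of 1
        k = k + 1;
        B(k, :) = [A(i), 1];
    end
end

B = B(1:k, :);     % drop the unused preallocated rows
```

This version also handles an empty A, in which case B ends up as a 0-by-2 matrix.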
Source: https://stackoverflow.com/questions/8941582/how-to-accumulate-data-sets