Count repeating integers in an array

Question

If I have this vector:

x = [1 1 1 1 1 2 2 2 3 4 4 6 6 6 6]

I would like to get, for each element, its position among the occurrences of the same value (a running count per value):

y = [1 2 3 4 5 1 2 3 1 1 2 1 2 3 4]

At the moment I'm using:

y = sum(triu(x==x.')) % MATLAB 2016b and above

It's compact but obviously not memory efficient.
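
To see why the one-liner works, it can help to look at the intermediate matrices on a short vector. The sketch below uses a made-up four-element input (not the vector from the question) and also estimates the size of the N-by-N logical intermediate, assuming one byte per logical element.

```matlab
% Hypothetical short input, chosen only for illustration.
x = [7 7 9 7];

% Pairwise comparison: M(i,j) is true when x(i) == x(j).
% Implicit expansion needs R2016b or later.
M = (x == x.');

% Keep only entries with i <= j, so column j sees matches at or before position j.
U = triu(M);

% Column sums give the running occurrence count of each element.
y = sum(U, 1);            % y = [1 2 1 3]

% Rough size of the N-by-N logical intermediate, in bytes.
n = numel(x);
bytesNeeded = n^2;        % assumes one byte per logical element
```

The quadratic growth of bytesNeeded with numel(x) is what makes the approach impractical for long vectors.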

For the pure beauty of MATLAB programming I would like to avoid using a loop. Do you have a better, simple implementation?

Context:

My final goal is to sort the vector x with the constraint that a number that has appeared N times takes priority over any number that has appeared more than N times:

[~,ind] = sort(y);
x_relative_sort = x(ind);
% x_relative_sort = 1   2   3   4   6   1   2   4   6   1   2   6   1   6   1

Answer 1:

Assuming x is sorted, here's one vectorized alternative using unique, diff, and cumsum:

[~, index] = unique(x);
y = ones(size(x));
y(index(2:end)) = y(index(2:end))-diff(index).';
y = cumsum(y);

And now you can apply your final sorting:

>> [~, ind] = sort(y);
>> x_relative_sort = x(ind)

x_relative_sort =

     1     2     3     4     6     1     2     4     6     1     2     6     1     6     1
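
If x is not sorted, one possible extension (a sketch, not part of the answer above) is to run the same unique/diff/cumsum steps on a sorted copy and scatter the result back through the sort permutation. The input is made up for illustration, and the names xs, ys and order are hypothetical helpers. The sketch relies on sort being stable, so equal values keep their original relative order.

```matlab
% Hypothetical unsorted input.
x = [4 1 4 2 1 4];

% Stable sort: xs = [1 1 2 4 4 4], order = [2 5 4 1 3 6].
[xs, order] = sort(x);

% Index of the first occurrence of each value in the sorted copy.
% 'first' spells out the default; (:) forces a column vector.
[~, index] = unique(xs, 'first');
index = index(:);                       % index = [1; 3; 4]

% Same trick as above: a negative step at each run start resets the cumsum.
ys = ones(size(xs));
ys(index(2:end)) = ys(index(2:end)) - diff(index).';
ys = cumsum(ys);                        % ys = [1 2 1 1 2 3]

% Scatter the counts back to the original positions.
y = zeros(size(x));
y(order) = ys;                          % y = [1 1 2 1 2 3]
```

The extra sort costs O(N log N) time but keeps memory linear in numel(x).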

Answer 2:

If you have positive integers you can use a sparse matrix:

[y ,~] = find(sort(sparse(1:numel(x), x, true), 1, 'descend'));

Likewise, x_relative_sort can be computed directly:

[x_relative_sort ,~] = find(sort(sparse(x ,1:numel(x),true), 2, 'descend'));
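
The first one-liner can be unpacked into steps to show what each call contributes. This is a sketch on a made-up sorted input; S and Ssorted are hypothetical names for the intermediates. Because find walks the columns in order, the result is grouped by value, so it lines up with x only when x is sorted.

```matlab
% Hypothetical short, sorted input of positive integers.
x = [1 1 3 3 3];
n = numel(x);

% Row k, column x(k) is true: one row per element, one column per value.
S = sparse(1:n, x, true);

% Sorting each column in descending order pushes the true entries to the top,
% so an entry's row number becomes its occurrence count within that value.
Ssorted = sort(S, 1, 'descend');

% find visits the columns left to right and returns the row indices.
[y, ~] = find(Ssorted);
y = y.';    % find returns a column vector; transpose to match the row vector x
```

The transpose at the end is only needed if a row vector is wanted, as in the question.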

Answer 3:

Just for variety, here's a solution based on accumarray. It works for x sorted and containing positive integers, as in the question:

y = cell2mat(accumarray(x(:), x(:), [], @(t){1:numel(t)}).');

Answer 4:

You can be more memory efficient by comparing only against unique(x), so that the intermediate matrix is M-by-N rather than N-by-N, where N=numel(x) and M=numel(unique(x)).

I've used an anonymous function to avoid declaring an intermediate matrix variable, which would otherwise be needed because the matrix is used twice; this can probably be improved.

f = @(X) sum(cumsum(X,2).*X, 1); y = f(unique(x).'==x); % sum along dim 1 so a single unique value still works
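
Written out with named intermediates, the same computation might look like the sketch below. The input is made up and deliberately unsorted, and u, X and C are hypothetical names for the intermediate values.

```matlab
% Hypothetical short input, unsorted on purpose.
x = [7 7 9 7];

u = unique(x);          % u = [7 9]
X = (u.' == x);         % M-by-N logical, here 2-by-4 (needs R2016b or later)

% Running count of each unique value along the row.
C = cumsum(X, 2);       % C = [1 2 2 3; 0 0 1 1]

% Mask the counts so each column keeps only the count of its own value,
% then collapse the rows. The dimension is given explicitly so that a
% single-row X is still summed column-wise.
y = sum(C .* X, 1);     % y = [1 2 1 3]
```

The named version stores X explicitly, which is what the anonymous function avoids; the peak memory use is comparable either way.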

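Answer 5:

Here's my solution that doesn't require sorting. Note that it numbers elements within each run of consecutive equal values, so the count restarts when a value reappears later (see the trailing 1 1 1 below):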

x = [1 1 1 1 1 2 2 2 3 4 4 6 6 6 6 1 1 1];
y = cell2mat( splitapply(@(v){cumsum(v)},x,cumsum(logical([1 diff(x)]))) ) ./ x;

Explanation:

% Turn each new group into a unique number:
t1 = cumsum(logical([1 diff(x)]));
%  x = [1 1 1 1 1 2 2 2 3 4 4 6 6 6 6 1 1 1];
% t1 = [1 1 1 1 1 2 2 2 3 4 4 5 5 5 5 6 6 6];

% Apply cumsum separately to each group:
t2 = cell2mat( splitapply(@(v){cumsum(v)},x,t1) );
% t1 = [1 1 1 1 1 2 2 2 3 4 4 5  5  5  5 6 6 6];
% t2 = [1 2 3 4 5 2 4 6 3 4 8 6 12 18 24 1 2 3];

% Finally, divide by x to get the increasing values:
y = t2 ./ x;
%  x = [1 1 1 1 1 2 2 2 3 4 4 6  6  6  6 1 1 1];
% t2 = [1 2 3 4 5 2 4 6 3 4 8 6 12 18 24 1 2 3];
%  y = [1 2 3 4 5 1 2 3 1 1 2 1  2  3  4 1 2 3];
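
The division step assumes x contains no zeros, since a run of zeros would give 0/0. A possible variant (a sketch that swaps in a different technique, not the answer's own method) derives the same within-run positions from the index at which each run starts. The names isStart and startPos are hypothetical.

```matlab
x = [1 1 1 1 1 2 2 2 3 4 4 6 6 6 6 1 1 1];   % vector from the answer above

% Mark the first element of every run of equal consecutive values.
isStart = [true, diff(x) ~= 0];

% Run number of each element (same as t1 above) and start index of each run.
t1 = cumsum(isStart);
startPos = find(isStart);                    % startPos = [1 6 9 10 12 16]

% Position within the run: own index minus the run's start index, plus one.
y = (1:numel(x)) - startPos(t1) + 1;
% y = [1 2 3 4 5 1 2 3 1 1 2 1 2 3 4 1 2 3]
```

This avoids both the cell array and the division, and it works for any numeric values in x, including zeros and negatives.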

Source: https://stackoverflow.com/questions/54079558/count-repeating-integers-in-an-array

Tags: matlab, sorting, vector, Sequence, counting