`accumarray` makes anomalous calls to its function argument

Posted by 元气小坏坏 on 2019-12-01 03:36:31

Short answer

The fourth input argument of accumarray, anon in this case, must return a scalar for any input.
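
To make the scalar requirement concrete, here is a minimal sketch. The definition of anon is not shown in this answer; the code assumes anon = @(x) max(x(~isinf(x))), which is consistent with the workaround and with the anon([Inf Inf Inf]) remark further down.

```matlab
% ASSUMPTION: this definition of anon is inferred from the rest of the answer,
% it is not quoted from the original question.
anon = @(x) max(x(~isinf(x)));

r1 = anon([4; 5; Inf]);   % finite values remain, so max returns a scalar
r2 = anon(Inf);           % nothing is left after filtering, so max returns []

disp(isscalar(r1))        % true
disp(isscalar(r2))        % false: this is the case accumarray cannot handle
```

The second call is the problematic one: filtering removes every element, and max of an empty array is empty rather than scalar.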

Long answer (and discussion about index sorting)

Consider the output when the indexes are sorted:

>> [idxsSorted,sortInds] = sort(idxs);
>> accumarray(idxsSorted, vals0(sortInds), [], anon)
ans =
     6
     5
     7
>> accumarray(idxsSorted, vals1(sortInds), [], anon)
ans =
     6
     5
     7

Now, all the documentation has to say about this is the following:

If the subscripts in subs are not sorted, fun should not depend on the order of the values in its input data.

How does this relate to the trouble with anon? It is a clue: sorting the subscripts forces anon to be called with the complete set of values for a given idx rather than with a subset (subarray) of them, as Luis Mendo suggested.
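
The sorted experiment can be reproduced as a self-contained script. This is only a sketch: the input vectors are copied from the [idxs vals0 vals1] listing in this answer, and the definition of anon is an assumption inferred from the workaround, not quoted from the question.

```matlab
% Inputs as listed in the [idxs vals0 vals1] display in this answer.
idxs  = [1 2 3 1 2 3 1 2 3].';
vals0 = [1 4 6 3 5 7 6 Inf 2].';
vals1 = [1 Inf 6 3 5 7 6 4 2].';

% ASSUMPTION: inferred definition of anon (ignore Inf, take the max).
anon = @(x) max(x(~isinf(x)));

% Sort the subscripts and reorder the values the same way.
[idxsSorted, sortInds] = sort(idxs);

% With sorted subscripts, the answer above reports 6, 5, 7 for both inputs.
out0 = accumarray(idxsSorted, vals0(sortInds), [], anon);
out1 = accumarray(idxsSorted, vals1(sortInds), [], anon);
```

Because sort is stable, each group keeps its original relative order after sorting, so the only thing that changes is that equal subscripts become contiguous.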


Consider how accumarray would work for a non-sorted list of indexes and values:

>> [idxs vals0 vals1]
ans =
     1     1     1
     2     4   Inf
     3     6     6
     1     3     3
     2     5     5
     3     7     7
     1     6     6
     2   Inf     4
     3     2     2

For both vals0 and vals1, the Inf belongs to the set where idxs equals 2. Since idxs is not sorted, accumarray does not process all values for idxs=2 in one shot, at least not at first. The actual algorithm (implementation) is opaque, but it seems to start by assuming that idxs is sorted, processing each block of consecutive equal values in the first argument. This can be verified by putting a breakpoint in fun, the function referenced by the fourth input argument.

When it encounters a 1 in idxs for the second time, it seems to start over, but with subsequent calls to fun containing all the values for a given index. Presumably accumarray calls some implementation of unique to fully segment idxs (incidentally, order is not preserved). As kjo suggests, this is the point where accumarray actually processes the inputs as described in the documentation, following steps 1-5 given there ("Find out how many unique indices there are..."). As a result, it crashes for vals1, where the first pass calls anon(Inf), but not for vals0, where the first pass calls anon(4) instead.
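
As an alternative to a breakpoint, one way to observe the calls is to pass a function that prints its input. The sketch below uses a hypothetical helper, traceMax, which is not part of the original question; it always returns a scalar so that the run completes. Which subsets get printed depends on the accumarray implementation and the MATLAB release, so no call sequence is claimed here.

```matlab
% Save as a script file; local functions in scripts require R2016b or later.
idxs  = [1 2 3 1 2 3 1 2 3].';
vals1 = [1 Inf 6 3 5 7 6 4 2].';

% traceMax (hypothetical helper) logs every input that accumarray hands to it.
result = accumarray(idxs, vals1, [], @traceMax);
disp(result.');

function m = traceMax(x)
    % Print the values received in this call, as a row vector.
    fprintf('fun called with: %s\n', mat2str(x.'));
    % Ignore Inf entries; the appended -Inf guarantees a scalar result.
    m = max([x(~isinf(x)); -Inf]);
end
```

If the log shows a call with a single Inf, that is the call on which a non-robust function would return an empty result.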

However, even if it followed those steps exactly on the first go, it would not necessarily be robust if a complete subarray of values contained only Infs (consider that anon([Inf Inf Inf]) returns an empty matrix too). It is a requirement, although an understated one, that fun must return a scalar. What is not clear from the documentation is that it must return a scalar for any input, not just for the inputs one would expect from the high-level description of the algorithm.


Workaround:

anon = @(x) max([x(~isinf(x));-Inf]);
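
A quick check of the workaround, as a sketch using the input values listed earlier in this answer: appending -Inf to the filtered values means max always has at least one element to work with, so the result is a scalar even when every input is Inf.

```matlab
% Workaround from the answer: the appended -Inf guarantees a scalar result.
anon = @(x) max([x(~isinf(x)); -Inf]);

a = anon(Inf);              % -Inf instead of an empty matrix
b = anon([Inf; Inf; Inf]);  % -Inf instead of an empty matrix
c = anon([4; 5; Inf]);      % 5, same as before for ordinary inputs

% Unsorted inputs, copied from the [idxs vals0 vals1] listing in the answer.
idxs  = [1 2 3 1 2 3 1 2 3].';
vals1 = [1 Inf 6 3 5 7 6 4 2].';
result = accumarray(idxs, vals1, [], anon);   % [6; 5; 7]
```

Note that a group consisting only of Infs now yields -Inf; if a different sentinel is preferable, replace the appended value accordingly.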

The documentation does not say that anon is called only with the whole set [1] of vals corresponding to each value of idx as its input. As seen in your example, it does get called with subsets thereof.

So the way to make anon robust seems to be: make sure it gives a scalar output when its input is any subset of vals (or maybe just any subset of each set of values sharing the same idx). In your case, anon(inf) does not return a scalar.

[1] It's actually an array, of course, but I think it's easier to describe this in terms of sets (and subsets).
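
The subset condition above can be checked by brute force for small groups. The following is a sketch only: it enumerates every non-empty subset of each same-idx group with a bit mask and tests whether a candidate function returns a scalar. The variable names (fun, allScalar) are illustrative, and the data is copied from the listing earlier on this page.

```matlab
idxs  = [1 2 3 1 2 3 1 2 3].';
vals1 = [1 Inf 6 3 5 7 6 4 2].';

% Candidate function to test; here, the robust version from the workaround.
fun = @(x) max([x(~isinf(x)); -Inf]);

allScalar = true;
for g = unique(idxs).'
    group = vals1(idxs == g);          % all values sharing subscript g
    n = numel(group);
    for mask = 1:(2^n - 1)             % every non-empty subset of the group
        subset = group(logical(bitget(mask, 1:n)));
        allScalar = allScalar && isscalar(fun(subset));
    end
end
disp(allScalar)   % true for the robust version
```

Swapping in @(x) max(x(~isinf(x))) for fun makes the check fail, because the subset containing only Inf produces an empty result. The enumeration grows as 2^n per group, so it is only practical for small test inputs.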
