Detect duplicated groups in stream

没有蜡笔的小新 2021-02-20 10:32

I want to ensure that equal numbers in the list are always grouped together. Let me explain this with examples:

{1, 1, 1, 2, 2}    // OK, two distinct groups
{1, 1, 2, 2, 1         


        
3 answers
  •  爱一瞬间的悲伤
    2021-02-20 11:11

    This is more of an addition to what has already been said: we can try to answer this question using the collect method. The problem with this approach (as others have pointed out) is that reduction operations do not short-circuit, so they always consume the whole stream.

    In general, to cut a long reduction short, we can make the reduction function itself return early. We still iterate over every item in the stream, but the work done per item after that point is minimal.

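    import java.util.HashSet;
    import java.util.Set;
    import java.util.stream.IntStream;

    public static boolean hasUniqueGroups(int... arr) {
        return !IntStream
            .of(arr)
            .collect(
                    Container::new, // 1
                    (container, current) -> {
                        if (container.skip) return; // 2
                        if (current != container.previous) {
                            container.previous = current;
                            if (!container.integers.add(current))
                                container.skip = true; // 3
                        }
                    },
                    (c1, c2) -> {
                        // Combiner: only invoked for parallel streams. IntStream.of(arr) is
                        // sequential, and merging partial results correctly would require
                        // tracking chunk boundaries, so fail loudly instead.
                        throw new UnsupportedOperationException("sequential streams only");
                    }
            )
            .skip;
    }

    private static class Container {
        private int previous = Integer.MAX_VALUE; // 4
        private boolean skip = false;
        private Set<Integer> integers = new HashSet<>();
    }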
    
    1. We pass a Supplier that creates a new Container for each computation. The Container holds, among other things, a flag that says whether to continue or skip the computation.
    2. If we have already encountered a non-unique group, we skip the rest of the computation.
    3. At the start of a new group, we check whether its value has been seen before. If it has, we skip the rest of the stream.
    4. This is a crude hack for sequences such as {0, 1, 0}: with the default initial value of 0, the first group would never be recorded. It still fails for e.g. {MAX_VALUE, 0, MAX_VALUE}; I left that case unhandled for simplicity.
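
    The limitation in note 4 can be avoided by tracking whether any element has been seen yet instead of relying on a sentinel value. The following is only a sketch of that idea: the class name GroupCheck and the fields started and seen are hypothetical names introduced here, not part of the code above, and the combiner simply rejects parallel use.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.stream.IntStream;

final class GroupCheck {

    // Mutable accumulator. The "started" flag replaces the MAX_VALUE sentinel
    // (an assumption of this sketch, not part of the original answer).
    private static final class Container {
        boolean started = false;  // false until the first element has been seen
        int previous;             // meaningful only once started is true
        boolean skip = false;     // true once a repeated group has been found
        final Set<Integer> seen = new HashSet<>();
    }

    static boolean hasUniqueGroups(int... arr) {
        Container result = IntStream.of(arr).collect(
                Container::new,
                (container, current) -> {
                    if (container.skip) return; // a duplicate was already found
                    if (!container.started || current != container.previous) {
                        // A new group begins here; record its value.
                        container.started = true;
                        container.previous = current;
                        if (!container.seen.add(current)) container.skip = true;
                    }
                },
                (c1, c2) -> {
                    // Only reached for parallel streams, which this sketch does not support.
                    throw new UnsupportedOperationException("sequential streams only");
                });
        return !result.skip;
    }
}
```

    With this variant, hasUniqueGroups(0, 1, 0) and hasUniqueGroups(Integer.MAX_VALUE, 0, Integer.MAX_VALUE) both return false, and an empty array returns true.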

    We can check the performance by replacing

    IntStream.of(arr)
    

    with

    IntStream.concat(IntStream.of(1, 2), IntStream.range(1, Integer.MAX_VALUE))
    

    which returns false, because the value 1 starts a second group. This will of course not terminate for infinite streams, but checking for unique groups in an infinite stream does not really make sense.
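
    For comparison, if stopping at the first repeated group matters more than using collect, a plain iterator loop can return as soon as a duplicate is found. This is a different technique from the collect-based reduction above, shown only as a sketch; the class name EarlyExitGroupCheck is hypothetical.

```java
import java.util.HashSet;
import java.util.PrimitiveIterator;
import java.util.Set;
import java.util.stream.IntStream;

final class EarlyExitGroupCheck {

    // Walks the stream through its iterator and returns at the first repeated group.
    static boolean hasUniqueGroups(IntStream stream) {
        Set<Integer> seen = new HashSet<>();
        PrimitiveIterator.OfInt it = stream.iterator();
        boolean started = false; // false until the first element has been read
        int previous = 0;        // meaningful only once started is true
        while (it.hasNext()) {
            int current = it.nextInt();
            if (!started || current != previous) {
                // A new group begins; a value seen before means a split group.
                if (!seen.add(current)) return false;
                started = true;
                previous = current;
            }
        }
        return true;
    }
}
```

    Because the loop exits early, this version also returns false for an infinite stream that repeats a group, for example IntStream.iterate(0, i -> (i + 1) % 3). It still never returns for an infinite stream whose groups are all unique.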
