Detect duplicated groups in stream

没有蜡笔的小新 2021-02-20 10:32

I want to ensure that all equal numbers in the list are grouped together. Let me explain this with examples:

{1, 1, 1, 2, 2}    // OK, two distinct groups
{1, 1, 2, 2, 1         


        
3 Answers
  • 2021-02-20 11:11

    This is more of an addition to what has been said already: we could try to answer this question using the collect method. The problem with this approach (as others have indicated) is that reduction operations do not short-circuit, so they always consume the whole stream.

    Generally, to short-circuit a long reduction operation, we can short-circuit the reduction function. This way, although we still iterate through all items in the stream, the amount of work required is minimal.

    public static boolean hasUniqueGroups(int... arr) {
        return !IntStream
            .of(arr) 
            .collect(
                    Container::new, // 1
                    (container, current) -> {
                        if (container.skip) return; // 2
                        if (current != container.previous) {
                            container.previous = current;
                            if (!container.integers.add(current))
                                container.skip = true; // 3
                        }
                    },
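                    (c1, c2) -> {
                        // The combiner is only invoked for parallel streams. Merging two
                        // partial results correctly would also need each container's first
                        // and last group, so this version supports sequential streams only.
                        throw new UnsupportedOperationException("sequential streams only");
                    }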
            )
            .skip;
    }
    
    private static class Container {
        private int previous = Integer.MAX_VALUE; // 4
        private boolean skip = false;
        private Set<Integer> integers = new HashSet<>();
    }
    
    1. We pass a Supplier that creates a new Container for each computation. The Container holds (among other things) the flag that tells us whether to continue or skip the computation.
    2. If we have already encountered a non-unique group, we skip the rest of the computation.
    3. If we are at the beginning of a new group, we check whether it is unique. If not, we decide to skip the rest of the stream.
    4. This is a poor hack to handle sequences such as {0, 1, 0}: with the default initial value of 0, the leading 0 would never be recorded as a group. Of course, this solution will not work for e.g. {MAX_VALUE, 0, MAX_VALUE}. I decided to leave this problem unsolved for simplicity.
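
    The sentinel limitation in point 4 could be avoided by tracking explicitly whether a previous element exists. The following is only a sketch of that idea, not the answer's original code: the class name SentinelFreeGroups, the hasPrevious flag, and the accept method are illustrative names, and the combiner simply rejects parallel use.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.stream.IntStream;

final class SentinelFreeGroups {

    // Mutable accumulator; hasPrevious replaces the MAX_VALUE sentinel.
    private static final class Container {
        boolean hasPrevious = false;
        int previous;
        boolean skip = false;
        final Set<Integer> seen = new HashSet<>();

        void accept(int current) {
            if (skip) return; // a repeated group was already found
            if (!hasPrevious || current != previous) {
                // current starts a new group
                hasPrevious = true;
                previous = current;
                if (!seen.add(current)) skip = true;
            }
        }
    }

    static boolean hasUniqueGroups(int... arr) {
        return !IntStream.of(arr)
                .collect(
                        Container::new,
                        Container::accept,
                        (c1, c2) -> {
                            // Assumption: sequential use only, so no merge is implemented.
                            throw new UnsupportedOperationException("sequential streams only");
                        })
                .skip;
    }

    public static void main(String[] args) {
        System.out.println(hasUniqueGroups(1, 1, 1, 2, 2));
        System.out.println(hasUniqueGroups(Integer.MAX_VALUE, 0, Integer.MAX_VALUE));
    }
}
```

    With the flag in place, no int value has to be reserved as "no previous element", so inputs containing Integer.MAX_VALUE are handled like any other.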

    We can check the performance by replacing

    IntStream.of(arr)
    

    with

    IntStream.concat(IntStream.of(1, 2), IntStream.range(1, Integer.MAX_VALUE))
    

    which returns false. Of course, this will not work for infinite streams, but checking for unique groups in an infinite stream does not really make sense.

  • 2021-02-20 11:18

    Using my free StreamEx library:

    IntStreamEx.of(numbers).boxed().runLengths().toMap();
    

    This code will throw IllegalStateException if there are repeating groups.

    Here the runLengths() method is used. It collapses adjacent equal elements, replacing each run with a Map.Entry whose key is the input element and whose value is the number of repeats. Finally, toMap() is used, which is a shortcut for .collect(Collectors.toMap(Entry::getKey, Entry::getValue)). We rely on the fact that toMap() throws IllegalStateException when keys repeat (unless a custom mergeFunction is supplied).

    As a free bonus on successful execution you will have a map where keys are input elements and values are lengths of series.
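
    For readers without StreamEx on the classpath, the same idea can be approximated with the plain JDK. This is only a sketch of the behaviour described above, not StreamEx's implementation: the class RunLengths and the method runLengthsToMap are hypothetical names, and the exception message is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class RunLengths {

    // Maps each value to the length of its run; throws IllegalStateException
    // if a value starts a second, non-adjacent run.
    static Map<Integer, Long> runLengthsToMap(int... numbers) {
        Map<Integer, Long> result = new LinkedHashMap<>();
        int i = 0;
        while (i < numbers.length) {
            int value = numbers[i];
            long length = 0;
            // Consume the whole run of equal adjacent elements.
            while (i < numbers.length && numbers[i] == value) {
                length++;
                i++;
            }
            if (result.putIfAbsent(value, length) != null) {
                throw new IllegalStateException("Duplicate key " + value);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(runLengthsToMap(1, 1, 1, 2, 2));
    }
}
```

    As with the StreamEx version, a successful call yields a map from each element to the length of its series, and a repeated group surfaces as an exception rather than a boolean.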

  • 2021-02-20 11:23

    In my opinion this problem doesn't fit the Stream API at all, but I was curious how it could be implemented (in a performant way, though).

    The problem is that you have to keep track of seen elements and the whole test should have a short-circuit behaviour. So I came up with this solution (without Streams):

    public static boolean hasUniqueGroups(int[] arr) {
        Objects.requireNonNull(arr);
        Set<Integer> seen = new HashSet<>();
        for (int i = 0; i < arr.length; i++) {
            if (i == 0 || arr[i] != arr[i - 1]) {
                if (!seen.add(arr[i])) {
                    return false;
                }
            }
        }
        return true;
    }
    

    The next step is to introduce the Stream API and the solution looks like this:

    public static boolean hasUniqueGroups(int[] arr) {
        Objects.requireNonNull(arr);
        Set<Integer> seen = new HashSet<>();
        return IntStream.range(0, arr.length)
                .filter(i -> i == 0 || arr[i] != arr[i - 1])
                .mapToObj(i -> arr[i])
                .allMatch(seen::add);
    }
    

    Note: In order to parallelize this Stream you should use a thread-safe Set.
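
    A parallel variant along the lines of that note might look as follows. It is a sketch under the assumption that ConcurrentHashMap.newKeySet() is an acceptable thread-safe Set here; the class name ParallelGroups is illustrative only.

```java
import java.util.Objects;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.IntStream;

final class ParallelGroups {

    static boolean hasUniqueGroups(int[] arr) {
        Objects.requireNonNull(arr);
        // Thread-safe set: add() returns false for exactly one of two competing
        // insertions of the same value, whichever thread gets there second.
        Set<Integer> seen = ConcurrentHashMap.newKeySet();
        return IntStream.range(0, arr.length)
                .parallel()
                // Keep only the indices where a new group starts.
                .filter(i -> i == 0 || arr[i] != arr[i - 1])
                .mapToObj(i -> arr[i])
                .allMatch(seen::add);
    }

    public static void main(String[] args) {
        System.out.println(hasUniqueGroups(new int[] {1, 1, 1, 2, 2}));
        System.out.println(hasUniqueGroups(new int[] {1, 1, 2, 2, 1}));
    }
}
```

    The result does not depend on the order in which threads reach the set: if a value starts two groups, one of the two add calls returns false and allMatch fails.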
