Remove unsorted/outlier elements in nearly-sorted array

只谈情不闲聊 提交于 2019-12-06 02:02:01

I would reduce your problem to the longest increasing (decreasing) subsequence problem.

https://en.wikipedia.org/wiki/Longest_increasing_subsequence

Since your sequence is nearly sorted, you are guaranteed to receive a satisfactory result (i.e. neatly following the trendline).

There exists a number of solutions to it; one of them is portrayed in the free book "Fundamentals of Computer Programming with C#" by Svetlin Nakov and Veselin Kolev; the problem is presented on page 257, exercise 6; solution is on page 260.

Taken from the book:

Write a program, which finds the maximal sequence of increasing elements in an array arr[n]. It is not necessary the elements to be consecutively placed. E.g.: {9, 6, 2, 7, 4, 7, 6, 5, 8, 4} -> {2, 4, 6, 8}.

Solution:

We can solve the problem with two nested loops and one more array len[0…n-1]. In the array len[i] we can keep the length of the longest consecutively increasing sequence, which starts somewhere in the array (it does not matter where exactly) and ends with the element arr[i]. Therefore len[0]=1, len[x] is the maximal sum max(1 + len[prev]), where prev < x and arr[prev] < arr[x]. Following the definition, we can calculate len[0…n-1] with two nested loops: the outer loop will iterate through the array from left to right with the loop variable x. The inner loop will iterate through the array from the start to position x-1 and searches for the element prev with maximal value of len[prev], where arr[prev] < arr[x]. After the search, we initialize len[x] with 1 + the biggest found value of len[prev] or with 1, if such a value is not found.

The described algorithm finds the lengths of all maximal ascending sequences, which end at each of the elements. The biggest one of these values is the length of the longest increasing sequence. If we need to find the elements themselves, which compose that longest sequence, we can start from the element, where the sequence ends (at index x), we can print it and we can search for a previous element (prev). By definition prev < x and len[x] = 1 + len[prev] so we can find prev with a for-loop from 1 to x-1. After that we can repeat the same for x=prev. By finding and printing the previous element (prev) many times until it exists, we can find the elements, which compose the longest sequence in reversed order (from the last to the first).

Pham Trung

A simple algorithm which has been described by higuaro can help you to generate a correct sequence:

For each element at index i , if a[i] < a[i + 1], we can simply remove that element a[i].

for(int i = 0; i < size; i++)
    while(a[i] < a[i + 1]){
       remove a[i];
       i--;
    }

However, this approach cannot guarantee that the number of removed element is minimum. For example, for this sequence [10, 9, 8, 100, 1, 0], remove 100 will be optimal, instead of remove 8, then 9 then 10.

To find the minimum number of element to be removed, we notice that we need to find the longest decreasing sub sequence, which is similar to the classic longest increasing sub sequence whose solution has been described here

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!