Question
There are many resources on how to remove duplicates and similar issues but I can't seem to be able to find any on removing unique elements. I'm using SWI-Prolog but I don't want to use built-ins to achieve this.
That is, calling remove_unique([1, 2, 2, 3, 4, 5, 7, 6, 7], X). should happily result in X = [2, 2, 7, 7].
The obvious solution is something along the lines of
% count(E, Xs, N): N is the number of occurrences of E in Xs.
count(_, [], 0) :- !.
count(E, [E | Es], N) :- !,
    count(E, Es, N0),
    N is N0 + 1.
count(E, [_ | Es], N) :-
    count(E, Es, N).

is_unique(E, Xs) :-
    count(E, Xs, 1).

remove_unique(L, R) :- remove_unique(L, L, R).

% remove_unique(Xs, O, R): R holds the elements of Xs that are not unique in O.
remove_unique([], _, []) :- !.
remove_unique([X | Xs], O, R) :-
    is_unique(X, O), !,
    remove_unique(Xs, O, R).
remove_unique([X | Xs], O, [X | R]) :-
    remove_unique(Xs, O, R).
It should quickly become apparent why this isn't an ideal solution: count is O(n), and so is is_unique, as it just uses count. I could improve this by failing as soon as we find more than one occurrence, but the worst case is still O(n).
So then we come to remove_unique. For every element we check whether the current element is_unique in O (the original list). If the test fails, the element gets added to the resulting list in the next branch. Running in O(n²), we get a lot of inferences. While I don't think we can speed it up in the worst case, can we do better than this naïve solution? The only improvement that I can clearly see is to change count to something that fails as soon as more than one occurrence is found.
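The early-failing check mentioned above could look roughly like the following. This is only a sketch: occurs/2 is a hypothetical helper name (a hand-rolled memberchk), and the code assumes ground list elements and relies on ==/2 and \+ for the comparison. It is meant as a drop-in replacement for is_unique/2; each call is still O(n) in the worst case, so remove_unique as a whole stays O(n²).

```prolog
% is_unique(E, Xs): E occurs exactly once in Xs.
% Scanning stops as soon as a second occurrence is seen.
is_unique(E, [X | Xs]) :-
    (   X == E
    ->  \+ occurs(E, Xs)        % first hit: no further occurrence allowed
    ;   is_unique(E, Xs)
    ).

% occurs(E, Xs): E occurs at least once in Xs (hypothetical helper).
occurs(E, [X | Xs]) :-
    (   X == E
    ->  true
    ;   occurs(E, Xs)
    ).
```

There is deliberately no clause for the empty list: an element that never occurs is not unique, so the predicate simply fails.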
Answer 1:
Using tpartition/4 in tandem with if_/3 and (=)/3 (all provided by library(reif)), we define remove_unique/2 like this:
remove_unique([], []).
remove_unique([E|Xs0], Ys0) :-
   tpartition(=(E), Xs0, Es, Xs),
   if_(Es = [], Ys0 = Ys, append([E|Es], Ys, Ys0)),
   remove_unique(Xs, Ys).
Here's the sample query, as given by the OP:
?- remove_unique([1,2,2,3,4,5,7,6,7], Xs).
Xs = [2,2,7,7]. % succeeds deterministically
Answer 2:
As long as you don't know that the list is sorted in any way, and you want to keep the sequence of the non-unique elements, it seems to me you can't avoid making two passes: first count occurrences, then pick only repeating elements.
What if you use a (self-balancing?) binary tree for counting occurrences and look-up during the second pass? Definitely not O(n²), at least...
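A minimal sketch of that two-pass idea, assuming it is acceptable to use SWI-Prolog's library(assoc) (an AVL tree implementation) as the balanced tree and that the list elements are ground. The predicate names remove_unique_assoc/2, count_all/3 and keep_repeated/3 are made up for illustration. Each tree operation is O(log n), so the whole thing should run in O(n log n).

```prolog
:- use_module(library(assoc)).

% remove_unique_assoc(+Xs, -Ys): Ys holds the elements of Xs that occur
% more than once, in their original order.
remove_unique_assoc(Xs, Ys) :-
    empty_assoc(Counts0),
    count_all(Xs, Counts0, Counts),     % pass 1: tally occurrences
    keep_repeated(Xs, Counts, Ys).      % pass 2: filter by tally

% count_all(+Xs, +Counts0, -Counts): add one to the tally of every element.
count_all([], Counts, Counts).
count_all([X | Xs], Counts0, Counts) :-
    (   get_assoc(X, Counts0, N0)
    ->  N is N0 + 1
    ;   N = 1
    ),
    put_assoc(X, Counts0, N, Counts1),
    count_all(Xs, Counts1, Counts).

% keep_repeated(+Xs, +Counts, -Ys): keep elements whose tally exceeds 1.
keep_repeated([], _, []).
keep_repeated([X | Xs], Counts, Ys) :-
    get_assoc(X, Counts, N),
    (   N > 1
    ->  Ys = [X | Ys1]
    ;   Ys = Ys1
    ),
    keep_repeated(Xs, Counts, Ys1).
```

With the list from the question, ?- remove_unique_assoc([1,2,2,3,4,5,7,6,7], Ys). gives Ys = [2,2,7,7]. If library predicates are off the table, the same structure would work with a hand-written search tree in place of the assoc.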
Source: https://stackoverflow.com/questions/15990666/remove-unique-elements-only