I have two lists ( not java lists, you can say two columns)
For example
**List 1** **Lists 2**
milan hafil
dingo
Are these really lists (ordered, with duplicates), or are they sets (unordered, no duplicates)?
Because if it's the latter, then you can use, say, a java.util.HashSet<E> and do this in expected linear time using the convenient retainAll.
List<String> list1 = Arrays.asList(
"milan", "milan", "iga", "dingo", "milan"
);
List<String> list2 = Arrays.asList(
"hafil", "milan", "dingo", "meat"
);
// intersection as set
Set<String> intersect = new HashSet<String>(list1);
intersect.retainAll(list2);
System.out.println(intersect.size()); // prints "2"
System.out.println(intersect); // prints "[milan, dingo]"
// intersection/union as list
List<String> intersectList = new ArrayList<String>();
intersectList.addAll(list1);
intersectList.addAll(list2);
intersectList.retainAll(intersect);
System.out.println(intersectList);
// prints "[milan, milan, dingo, milan, milan, dingo]"
// original lists are structurally unmodified
System.out.println(list1); // prints "[milan, milan, iga, dingo, milan]"
System.out.println(list2); // prints "[hafil, milan, dingo, meat]"
You can try intersection() and subtract() methods from CollectionUtils.
intersection()
method gives you a collection containing common elements and the subtract()
method gives you all the uncommon ones.
They should also take care of similar elements
Simple solution :-
List<String> list = new ArrayList<String>(Arrays.asList("a", "b", "d", "c"));
List<String> list2 = new ArrayList<String>(Arrays.asList("b", "f", "c"));
list.retainAll(list2);
list2.removeAll(list);
System.out.println("similiar " + list);
System.out.println("different " + list2);
Output :-
similiar [b, c]
different [f]
If you are looking for a handy way to test the equality of two collections, you can use org.apache.commons.collections.CollectionUtils.isEqualCollection
, which compares two collections regardless of the ordering.
Using java 8 removeIf
public int getSimilarItems(){
List<String> one = Arrays.asList("milan", "dingo", "elpha", "hafil", "meat", "iga", "neeta.peeta");
List<String> two = new ArrayList<>(Arrays.asList("hafil", "iga", "binga", "mike", "dingo")); //Cannot remove directly from array backed collection
int initial = two.size();
two.removeIf(one::contains);
return initial - two.size();
}
Of all the approaches, I find using org.apache.commons.collections.CollectionUtils#isEqualCollection
is the best approach. Here are the reasons -
If it's not possible to have apache.commons.collections
as a dependency, I would recommend to implement the algorithm it follows to check equality of the list because of it's efficiency.