Efficient intersection of two List<String> in Java?

南方客 2020-11-28 12:23

Question is simple:

I have two List<String>

List<String> columnsOld = DBUtils.GetColumns(db, TableName);
List<String> columnsNew = DBUtils.GetCol


        
8 Answers
  • 2020-11-28 12:55

    Since retainAll won't touch the argument collection, this would be faster:

    List<String> columnsOld = DBUtils.GetColumns(db, TableName); 
    List<String> columnsNew = DBUtils.GetColumns(db, TableName); 
    
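    // Walk columnsNew backwards so that removing by index never shifts unvisited elements.
    for (int i = columnsNew.size() - 1; i >= 0; --i) {
        String str = columnsNew.get(i);
        // remove(str) returns false when columnsOld has no matching element left.
        if (!columnsOld.remove(str)) {
            columnsNew.remove(i); // removing by index avoids a second linear search
        }
    }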
    

    The intersection will be the values left in columnsNew. Removing already compared values from columnsOld reduces the number of comparisons needed. Note that this modifies both lists.

  • 2020-11-28 12:57

    Using Google's Guava library:

    Sets.intersection(Sets.newHashSet(setA), Sets.newHashSet(setB))
    

    Note: This is much more efficient than naively intersecting two lists: it's O(n+m), versus O(n×m) for the list version. With two lists of a million items each, that is the difference between millions of operations and trillions of operations.

  • 2020-11-28 13:00

    There is a nice way to do this with streams in one line of code. It also lets you intersect two lists that do not have the same element type, which is not possible with the containsAll method afaik:

    columnsOld.stream().filter(c -> columnsNew.contains(c)).collect(Collectors.toList());
    

    An example for lists with different types: if there is a relation between foo and bar, and you can get a bar object from a foo, then you can modify your stream like this:

    List<foo> fooList = new ArrayList<>(Arrays.asList(new foo(), new foo()));
    List<bar> barList = new ArrayList<>(Arrays.asList(new bar(), new bar()));
    
    fooList.stream().filter(f -> barList.contains(f.getBar())).collect(Collectors.toList());
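
    Calling contains on a list inside the filter scans that list once per element. One possible variant, sketched below, copies the second list into a HashSet first so that each lookup is a hash probe instead. The class and method names (StreamIntersection, intersection) are made up for illustration, and the sketch assumes the element type implements equals and hashCode consistently, as String does.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical helper, not part of the question's code base.
final class StreamIntersection {
    // Returns a new list with the elements of `first` that also occur in `second`.
    // Neither argument is modified; order and duplicates of `first` are preserved.
    static <T> List<T> intersection(List<T> first, List<T> second) {
        Set<T> lookup = new HashSet<>(second); // built once, O(m)
        return first.stream()
                .filter(lookup::contains)      // average O(1) per element
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> columnsOld = Arrays.asList("id", "name", "email");
        List<String> columnsNew = Arrays.asList("email", "phone", "name");
        System.out.println(intersection(columnsOld, columnsNew)); // prints [name, email]
    }
}
```

    The result follows the order of the first list, so swapping the arguments can change the ordering but not the set of values.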
    
  • 2020-11-28 13:03

    You can use the retainAll method:

    columnsOld.retainAll (columnsNew);
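
    Keep in mind that retainAll modifies the list it is called on. A minimal sketch, assuming the original lists should stay untouched, is to work on a copy. The names RetainAllCopy and intersection are illustrative only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the question's code base.
final class RetainAllCopy {
    // Returns the elements of `first` that also occur in `second`,
    // leaving both argument lists unmodified.
    static <T> List<T> intersection(List<T> first, List<T> second) {
        List<T> result = new ArrayList<>(first); // copy, because retainAll mutates its receiver
        result.retainAll(second);
        return result;
    }

    public static void main(String[] args) {
        List<String> columnsOld = Arrays.asList("id", "name", "email");
        List<String> columnsNew = Arrays.asList("name", "email", "phone");
        System.out.println(intersection(columnsOld, columnsNew)); // prints [name, email]
    }
}
```

    The copy also helps when the source list is fixed-size or unmodifiable, where calling retainAll directly may throw UnsupportedOperationException.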
    
  • 2020-11-28 13:06

    If you put the second list into a set, say a HashSet, and then iterate over the first list, checking each element for presence in the set and removing it if it is not present, the first list will end up holding the intersection you need. This is much faster than retainAll or contains on a list, because set lookups are O(1) on average. The emphasis here is on using a set instead of a list. firstList.retainAll(new HashSet<>(secondList)) will also work.
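
    The approach described above could look roughly like this. It is only a sketch: the class and method names (HashSetIntersection, retainCommon) are invented for illustration, it needs Java 8 or later for removeIf, and it assumes the first list is modifiable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of the question's code base.
final class HashSetIntersection {
    // Removes from `first` every element that does not occur in `second`.
    // Building the set costs O(m); each lookup is O(1) on average.
    static <T> void retainCommon(List<T> first, List<T> second) {
        Set<T> lookup = new HashSet<>(second);
        // removeIf lets the list drop all non-matching elements in one pass.
        first.removeIf(item -> !lookup.contains(item));
    }

    public static void main(String[] args) {
        List<String> firstList = new ArrayList<>(Arrays.asList("id", "name", "email", "name"));
        List<String> secondList = Arrays.asList("phone", "name", "email");
        retainCommon(firstList, secondList);
        System.out.println(firstList); // prints [name, email, name]
    }
}
```

    Duplicates in the first list survive as long as the value appears at least once in the second list. If that is not wanted, collecting into a set instead is an option.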

  • 2020-11-28 13:12

    Use org.apache.commons.collections4.ListUtils#intersection.
