C# more efficient way of comparing two collections

Submitted by 女生的网名这么多〃 on 2021-02-07 06:26:10

Question


I have two collections:

List<Car> currentCars = GetCurrentCars();
List<Car> newCars = GetNewCars();

I don't want to use a foreach loop or something similar, because I think there should be a much better way of doing this.

I am looking for a more efficient way to compare these collections and get the following results:

  1. List of cars which are in newCars and not in currentCars
  2. List of cars which are not in newCars and in currentCars

The Car type has an int property, Id.

A since-deleted answer suggested that by "efficient" I mean less code, less mechanics, and more readable code.

Thinking about it that way, what are my options? What would give less code, less mechanics, and more readable code?


Answer 1:


You can use Except:

var currentCarsNotInNewCars = currentCars.Except(newCars);
var newCarsNotInCurrentCars = newCars.Except(currentCars);

But this has no performance benefit over a well-written foreach solution; it just looks cleaner.
Also, be aware that you need to implement IEquatable<T> on your Car class (and override GetHashCode to match), so that the comparison is done on the Id and not on the reference.
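
A minimal sketch of what that could look like, assuming Car carries only the int Id property from the question. The CarDiff helper name is hypothetical and exists only for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape of Car: only the int Id from the question is modelled.
public class Car : IEquatable<Car>
{
    public int Id { get; set; }

    // Two cars are considered equal when their Ids match.
    public bool Equals(Car other) => other != null && Id == other.Id;

    // Keep object.Equals and GetHashCode consistent with the typed Equals;
    // Except hashes the elements, so it relies on both.
    public override bool Equals(object obj) => Equals(obj as Car);
    public override int GetHashCode() => Id;
}

// Hypothetical helper, named here only for illustration.
public static class CarDiff
{
    // Cars present in 'source' whose Id does not occur in 'other'.
    public static List<Car> Missing(IEnumerable<Car> source, IEnumerable<Car> other)
        => source.Except(other).ToList();
}
```

Calling CarDiff.Missing(newCars, currentCars) would give the first list from the question, and swapping the arguments would give the second. Note that Except also removes duplicates from its result.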

Performance-wise, a better approach would be to use a Dictionary<TKey, TValue> with the Id as the key instead of a List<T>:

var currentCarsDictionary = currentCars.ToDictionary(x => x.Id);
var newCarsDictionary = newCars.ToDictionary(x => x.Id);

var currentCarsNotInNewCars = 
    currentCarsDictionary.Where(x => !newCarsDictionary.ContainsKey(x.Key))
                         .Select(x => x.Value);

var newCarsNotInCurrentCars = 
    newCarsDictionary.Where(x => !currentCarsDictionary.ContainsKey(x.Key))
                     .Select(x => x.Value);



Answer 2:


You can do it like this:

// 1) List of cars in newCars and not in currentCars
var newButNotCurrentCars = newCars.Except(currentCars);

// 2) List of cars in currentCars and not in newCars
var currentButNotNewCars = currentCars.Except(newCars);

The code uses the Enumerable.Except extension method (available in .NET 3.5 and later).

I believe this fulfills your criteria of "less code, less mechanics, and more readable".




Answer 3:


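If you start with them in HashSets you can use the Except method.

// Copy the lists into sets (or have the getters return HashSet<Car> directly).
var currentCars = new HashSet<Car>(GetCurrentCars());
var newCars = new HashSet<Car>(GetNewCars());

var currentCarsNotInNewCars = currentCars.Except(newCars);
var newCarsNotInCurrentCars = newCars.Except(currentCars);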

It would be much faster with a set than with a list. (Under the hood, a list lookup is just a foreach; set lookups can be optimized.)
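
As a side sketch: HashSet<T> also has its own ExceptWith method, which removes items in place, whereas Except is the LINQ extension and returns a new sequence. The example below assumes Car compares by Id; the Equals/GetHashCode overrides and the CarSets helper are illustrative and not part of the original answer:

```csharp
using System.Collections.Generic;

// Assumed shape of Car, with equality based on Id so that sets treat
// two Car objects with the same Id as the same element.
public class Car
{
    public int Id { get; set; }
    public override bool Equals(object obj) => obj is Car other && Id == other.Id;
    public override int GetHashCode() => Id;
}

// Hypothetical helper, named here only for illustration.
public static class CarSets
{
    // Returns a new set holding the cars of 'source' that are not in 'other'.
    // ExceptWith mutates the set it is called on, so a copy is made first.
    public static HashSet<Car> Difference(HashSet<Car> source, IEnumerable<Car> other)
    {
        var result = new HashSet<Car>(source);
        result.ExceptWith(other);
        return result;
    }
}
```

Copying first keeps both original sets intact, which matters here because the question needs the difference in both directions.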




Answer 4:


You can use LINQ...

        List<Car> currentCars = new List<Car>();
        List<Car> newCars = new List<Car>();

        List<Car> currentButNotNew = currentCars.Where(c => !newCars.Contains(c)).ToList();
        List<Car> newButNotCurrent = newCars.Where(c => !currentCars.Contains(c)).ToList();

...but do not be fooled. It may be less code for you, but there will definitely be some for loops in there somewhere.

EDIT: Didn't realise there was an Except method :(




Answer 5:


I'd override Equals (and GetHashCode) on Car to compare by Id, and then you could use the Enumerable.Except extension method. If you can't override Equals, you can create your own IEqualityComparer<Car> that compares two cars by Id.

class CarComparer : IEqualityComparer<Car>
{
    public bool Equals(Car x, Car y)
    {
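        // Same reference (or both null) counts as equal; otherwise compare by Id.
        if (ReferenceEquals(x, y)) return true;
        return x != null && y != null && x.Id == y.Id;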
    }

    public int GetHashCode(Car obj)
    {
        return obj == null ? 0 : obj.Id;
    }
}
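
For completeness, a sketch of how such a comparer could be passed to Except, which accepts an IEqualityComparer<T> as its second argument. The Car class shape and the CarComparerDemo helper are assumptions made for illustration:

```csharp
using System.Collections.Generic;
using System.Linq;

// Assumed shape of Car: only the int Id from the question is modelled.
public class Car
{
    public int Id { get; set; }
}

class CarComparer : IEqualityComparer<Car>
{
    public bool Equals(Car x, Car y)
    {
        if (ReferenceEquals(x, y)) return true;
        return x != null && y != null && x.Id == y.Id;
    }

    public int GetHashCode(Car obj)
    {
        return obj == null ? 0 : obj.Id;
    }
}

// Hypothetical helper, named here only for illustration.
public static class CarComparerDemo
{
    // Cars in newCars whose Id does not appear in currentCars.
    public static List<Car> NewButNotCurrent(List<Car> currentCars, List<Car> newCars)
        => newCars.Except(currentCars, new CarComparer()).ToList();

    // Cars in currentCars whose Id does not appear in newCars.
    public static List<Car> CurrentButNotNew(List<Car> currentCars, List<Car> newCars)
        => currentCars.Except(newCars, new CarComparer()).ToList();
}
```

This leaves Car itself untouched, which is useful when you do not own the type.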



Answer 6:


If you're looking for efficiency, implement IComparable on Car (sorting on your unique Id) and use a SortedList. You can then walk through both collections together and evaluate your checks in O(n). This, of course, comes with an added cost on inserts to keep the collections sorted.
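
A rough sketch of that simultaneous walk, assuming both lists are already sorted ascending by Id and that Ids are unique within each list. It uses plain List<Car> rather than SortedList to keep the example short; the SortedCarDiff name is hypothetical:

```csharp
using System.Collections.Generic;

// Assumed shape of Car: only the int Id from the question is modelled.
public class Car
{
    public int Id { get; set; }
}

// Hypothetical helper, named here only for illustration.
public static class SortedCarDiff
{
    // Both inputs must be sorted ascending by Id, with unique Ids.
    // Cars found only in one input are appended to the matching output list.
    public static void Diff(List<Car> current, List<Car> incoming,
                            List<Car> onlyInCurrent, List<Car> onlyInIncoming)
    {
        int i = 0, j = 0;
        while (i < current.Count && j < incoming.Count)
        {
            if (current[i].Id == incoming[j].Id)
            {
                // Present in both lists: skip it on both sides.
                i++;
                j++;
            }
            else if (current[i].Id < incoming[j].Id)
            {
                // The smaller Id cannot appear later in the other list.
                onlyInCurrent.Add(current[i++]);
            }
            else
            {
                onlyInIncoming.Add(incoming[j++]);
            }
        }

        // Whatever remains on either side has no counterpart.
        while (i < current.Count) onlyInCurrent.Add(current[i++]);
        while (j < incoming.Count) onlyInIncoming.Add(incoming[j++]);
    }
}
```

Each element is visited once, so the walk is linear in the combined length of the two lists, and both result lists come out of a single pass.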




Answer 7:


You can copy the smaller list into a hash-table-based collection like HashSet or Dictionary, and then iterate over the second list and check whether each item exists in the hash table.

This will reduce the time from O(N^2), in the naive foreach-inside-foreach case, to O(N).

This is the best you can do without knowing more about the lists. You may be able to do a little better if the lists are sorted, for example, but since you have to "touch" each car at least once to check whether it is in the new car list, you can never do better than O(N).
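
A small sketch of the idea, assuming Car has the int Id from the question. It hashes the Ids rather than the Car objects, so Car needs no equality members; the HashedCarDiff name is hypothetical:

```csharp
using System.Collections.Generic;
using System.Linq;

// Assumed shape of Car: only the int Id from the question is modelled.
public class Car
{
    public int Id { get; set; }
}

// Hypothetical helper, named here only for illustration.
public static class HashedCarDiff
{
    // Cars in 'source' whose Id is absent from 'other'.
    // Building the set and scanning 'source' are each a single pass.
    public static List<Car> NotIn(IEnumerable<Car> source, IEnumerable<Car> other)
    {
        var otherIds = new HashSet<int>(other.Select(c => c.Id));
        return source.Where(c => !otherIds.Contains(c.Id)).ToList();
    }
}
```

Since the question asks for the difference in both directions, this would be called twice with the arguments swapped, which hashes each list once.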




Answer 8:


If comparing the Id property is enough to tell whether one Car is equal to another, and you want to avoid some sort of loop at comparison time, you could replace the List with your own class that keeps track of its items and implements IEquatable for the entire collection, like this:

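class CarComparer : IList<Car>, IEquatable<CarComparer>
{
    // Hash derived from the Ids of the items; must be updated on every mutation.
    private int _runningHash;

    public bool Equals(CarComparer other)
    {
        return other != null && _runningHash == other._runningHash;
    }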

    public override int GetHashCode()
    {
        return _runningHash;
    }

    public void Insert(int index, Car item)
    {
        // Update _runningHash here
        throw new NotImplementedException();
    }

    public void RemoveAt(int index)
    {
        // Update _runningHash here
        throw new NotImplementedException();
    }

    // More IList<Car> Overrides ....
}

Then you just need to implement Add, Remove, and any other methods that might affect the items in the list. You can keep a private variable that holds a hash of some sort over the Ids of the items in the list, and your Equals method can then just compare this private variable. It is far from the cleanest approach (you have to keep the hash variable up to date), but it means you do not have to loop through the lists to do a comparison. If it were me, I would just use LINQ, as some have mentioned here...



Source: https://stackoverflow.com/questions/6680487/c-sharp-more-efficient-way-of-comparing-two-collections
