问题
From my understanding, a simple for loop will have a complexity of O(n).
foreach(var record in records) {
// ...
}
If I introduce a Hash lookup within the foreach, will that keep the complexity to O(n)?
var ids = new HashSet<int>();
foreach(var record in records) {
bool isRecordInIdSet = ids.Contains(record.id);
}
Likewise, if the HashSet was instead a list, would that make the complexity to O(n^2)?
var ids = new List<int>();
foreach(var record in records) {
bool isRecordInIdSet = ids.Contains(record.id);
}
I was gleaning from this question What is the lookup time complexity of HashSet<T>(IEqualityComparer<T>)? , and would love to make sure my understanding is correct. Please add any additional info that you think would be useful to understanding Big-O terminology.
回答1:
Your interpretation is correct, hash look-up is O(1), so repeating it n times give you an O(n) complexity.
Hash lookup is constant time (i.e. O(1)) for objects with properly implemented hash code. Since you are using an int
, for which the hash code is the number itself, you get a very good implementation, ensuring constant-time access.
if the
HashSet
was instead a list, would that make the complexity to O(n2)?
Technically, it would be O(n*m), where m
represents the number of items in the id
list.
回答2:
your analysis of list vs hashmap is correct.
var ids = new HashSet<int>();
foreach(var record in records) {
bool isRecordInIdSet = ids.Contains(record.id); // this is O(1)
}
hashtables has a constant lookup time ( O(1) ). So that for loop has time complexity O(n), where n is the number of records
var ids = new List<int>();
foreach(var record in records) {
bool isRecordInIdSet = ids.Contains(record.id); // this is O(m), where m is the size of list ids
}
when using a list, your .Contains method has O(n) time because in the worst case the item you are looking for is at the very end of the list, so you must traverse the whole list and make n-1 comparisons. In practice, you will not have to do this everytime, but remember that Big-O notation is concerned with the worst case scenario. If the size of ids list and records list were dependent on the same variable, then this for loop has O(n^2). Otherwise, it would be O(n * m) where n is the size of records and m is the size of ids.
来源:https://stackoverflow.com/questions/43665178/big-o-notation-correct-definition-for-a-loop-with-hashset-lookup