Given: the connection is safe=True, so update()'s return will contain update information.
Say I have documents that look like:
[{'a': [1]}, {'a': [2]}, {
Ok, since I read your question wrong all along, it turns out you are actually looking at two different queries and judging the time complexity between them.
The first query being:
coll.update({}, {'$addToSet': {'a':1}}, multi=True)
And the second being:
coll.update({'a': {'$ne': 1}}, {'$push': {'a':1}}, multi=True)
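To make the comparison concrete, here is a minimal sketch of both updates run against a throwaway collection (the test.addtoset database/collection names are my own assumption, and it uses the legacy pymongo update()/insert() API from the question, which was removed in pymongo 4):

    from pymongo import MongoClient

    # Assumed names; any scratch database/collection will do.
    coll = MongoClient().test.addtoset
    coll.drop()
    coll.insert([{'a': [1]}, {'a': [2]}])  # sample documents like those above

    # Query 1: match every document and let $addToSet skip the duplicates itself.
    coll.update({}, {'$addToSet': {'a': 1}}, multi=True)

    # Query 2: only match documents whose a array does not already contain 1.
    coll.update({'a': {'$ne': 1}}, {'$push': {'a': 1}}, multi=True)

    # Both approaches leave 1 appearing exactly once in every a array.
    print(list(coll.find()))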
The first problem that springs to mind here is indexes. $addToSet, being an update modifier, does not (as far as I can tell) use an index, so you are doing a full collection scan to accomplish what you need.
In reality you are looking for all documents that do not already have 1 in a, and looking to $push the value 1 onto that a array.
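If the filtered write is the route taken, an index on a is what avoids that full scan; a small sketch, assuming the same coll handle as in the earlier snippet:

    # A multikey index on the array field lets the {'a': {'$ne': 1}} selector be
    # answered from the b-tree rather than by scanning every document.
    coll.create_index('a')
    coll.update({'a': {'$ne': 1}}, {'$push': {'a': 1}}, multi=True)

Worth noting that $ne is a fairly low-selectivity predicate, so how much the index actually saves depends on how many documents already contain 1.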
So that is two points to the second query even before we get into time complexity, because the first query, with its empty {} selector and multi=True, matches every document in the collection and applies $addToSet to each of them, even the ones whose a array already contains 1, while the second query only matches the documents that actually need the $push.
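Because the connection is safe (acknowledged), the dictionary returned by the legacy update() makes that difference visible; a sketch, assuming a pymongo/MongoDB version where the getLastError document with an 'n' field is returned:

    # 'n' is the number of documents the update matched: the first query matches
    # everything, the second only the documents still missing 1.
    res1 = coll.update({}, {'$addToSet': {'a': 1}}, multi=True)
    res2 = coll.update({'a': {'$ne': 1}}, {'$push': {'a': 1}}, multi=True)
    print(res1.get('n'), res2.get('n'))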
So I have pretty much made my mind up here that the second query is what you're looking for, before any of the Big O notation stuff.
There is a problem with using Big O notation to explain the time complexity of each query here: using an index on a keeps the matching at roughly O(log n), however not using indexes does not.
The first query would look something like O(n) per document, since $addToSet has to walk the existing a array to check whether the value is already there.
Per collection, without the index, it would be O(2n²), since the cost of iterating a grows with every new document.
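As a rough illustration of that growth (a pure-Python toy model, not how mongod is actually implemented), counting the element comparisons a collection scan plus per-document array scan would need shows the roughly quadratic behaviour:

    def naive_add_to_set(docs, value):
        # Visit every document (collection scan) and walk its whole a array
        # (duplicate check), counting the comparisons performed.
        comparisons = 0
        for doc in docs:
            for element in doc['a']:
                comparisons += 1
                if element == value:
                    break
            else:
                doc['a'].append(value)
        return comparisons

    # Arrays that grow as the collection grows: n documents => ~n*n/2 comparisons.
    docs = [{'a': list(range(size))} for size in range(1, 101)]
    print(naive_add_to_set(docs, -1))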
The second query, without indexes, would look something like O(2n²) (O(n) per document) as well, I believe, since $ne would have the same problems as $addToSet without indexes. However, with an index I believe this would actually be O(log n log n) (O(log n) per document), since it would first find all documents that have a, then all documents without 1 in their set, based upon the b-tree.
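One way to sanity-check the b-tree claim, rather than taking my word for it, is to explain the read side of the second query (the exact fields in the output differ between server versions):

    # With the index on a in place the plan should report an index (b-tree) access
    # path for the $ne selector; without it, a full collection scan.
    print(coll.find({'a': {'$ne': 1}}).explain())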
So based upon time complexity and the notes at the beginning I would say query 2 is better.
If I am honest, I am not used to explaining things in Big O notation, so this is experimental.
Hope it helps,