I have a MongoDb collection contains of 284.116 tweets. The problem is the \"author\" field in some objects are in object type, but in other objects -this \"author\" field-
Actually there is a "gotcha" listed in the documentation for $type specifically about arrays:
When applied to arrays, $type matches any inner element that is of the specified type. Without projection this means that the entire array will match if any element has the right type. With projection, the results will include just those elements of the requested type.
So that means that rather than detect whether the "element itself" is in array, what is actually being tested is the "inner element" of the array to see what type it is.
Now the documentation itself suggests this JavaScript test with $where:
.find({ "$where": "return Array.isArray(this.author)" })
But I think that's pretty horrible as there is a better way.
The trick is in "dot notation", where you ask for the 0
index element of the array to $exists
.find({ "author.0": { "$exists": true } })
Which is just the basic case that if the "0th" element exists then the field is present and the data is therefore an array.
Once you understand that logical premise then it is pretty simple test. The only thing that cannot be matched by that is a "truly empty" array, in which case you can fall back to the JavaScript alternative if needed. But this can actually use an index, so it would be preferred to use the latter form.
Here's a better way to do what you originally asked; that is to actually check if a certain field holds an array type value:
.find({ "author": { "$gte": [] } })
MongoDB's $type functionality for arrays, though well documented, is IMO inconsistent with all the other $type checks, and obviously doesn't work for this use case, but since around 2.6, you can use the above query to check whether the value is an array (empty or not).
I say this is "better" than the currently selected answer, because executing code via $where is not recommended, unless standard query constructs truly cannot get the job done.
To elaborate, $where is not recommended due to performance via lack of ability to use indexes in executed code. More detail: https://docs.mongodb.com/manual/reference/operator/query/where/#considerations
Also, if you'd like to check for non-empty arrays specifically, use this:
.find({ "author": { "$gt": [] } })
Technically, this one is also better than the current answer's corresponding $exists solution, as the field may have a non-array object with a field named "0", and that would match as a "non-empty array", which is wrong in that case.