mongodb-aggregation | 易学教程

MongoDB — Find duplicate documents by multiple keys

Question: I have a collection with documents that look like the following: { "_id" : ObjectId("55b377cb66b393427367c3e2"), "comment" : "This is a comment", "url_key" : "55b377cb66b393427367c3df", //This is an ObjectId from another record in a different collection } I need to find records in this collection that contain duplicate values for both the comment AND the url_key. I can easily generate (using aggregate) duplicate records for the same, single, key (e.g. comment), but I can't figure out how
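One possible approach to the question above, sketched under assumptions: group on a compound `_id` built from both fields, then keep only groups with more than one member. The pipeline is written as plain Python data (with PyMongo it would typically be passed to `collection.aggregate(pipeline)`); the sample documents and the pure-Python check are made up for illustration and are not from the original post.

```python
from collections import Counter

# Aggregation pipeline as plain Python data. The collection name is not given
# in the question, so none is assumed here.
pipeline = [
    # Group on the compound key (comment, url_key) and collect the matching _ids.
    {"$group": {
        "_id": {"comment": "$comment", "url_key": "$url_key"},
        "ids": {"$push": "$_id"},
        "count": {"$sum": 1},
    }},
    # Keep only key combinations that occur more than once.
    {"$match": {"count": {"$gt": 1}}},
]

# Pure-Python equivalent on hypothetical sample documents, to show the intent.
docs = [
    {"_id": 1, "comment": "This is a comment", "url_key": "a"},
    {"_id": 2, "comment": "This is a comment", "url_key": "a"},
    {"_id": 3, "comment": "This is a comment", "url_key": "b"},
]
counts = Counter((d["comment"], d["url_key"]) for d in docs)
duplicates = {key: n for key, n in counts.items() if n > 1}
print(duplicates)
```

Only the pair that shares both the comment and the url_key is reported; the third document has the same comment but a different url_key, so it is not counted as a duplicate.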

MongoDb $lookup query with multiple fields from objects array

Question: This question has previously been marked as a duplicate of this question. I can confirm with certainty that it is not. This is not a duplicate of the linked question because the elements in question are not an array but are embedded as fields in individual objects of an array. I am fully aware of how the query in the linked question should work; however, that scenario is different from mine. I have a question regarding the $lookup query of MongoDB. My data structure looks as follows: My "Event"

Query for latest version of a document by date in mongoDB

Question: I am trying to find a MongoDB script which will look at a collection where there are multiple records of the same document and only provide me with the latest version of each document as a result set. I cannot explain it in English any better than above, but maybe this little SQL below might explain it further. I want each document by transaction_reference but only the latest dated version ( object_creation_date ). select t.transaction_reference, t.transaction_date, t.object_creation_date, t
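A minimal sketch of one way to express "latest version per transaction_reference", assuming the two field names visible in the question: sort newest first, then take the first document of each group. The pipeline is plain Python data (with PyMongo it would typically be passed to `collection.aggregate(pipeline)`); the sample documents and dates are hypothetical.

```python
from datetime import date

pipeline = [
    # Newest version first within each transaction_reference.
    {"$sort": {"transaction_reference": 1, "object_creation_date": -1}},
    # $first then picks the newest whole document ($$ROOT) per reference.
    {"$group": {"_id": "$transaction_reference", "latest": {"$first": "$$ROOT"}}},
]

# Pure-Python equivalent on hypothetical sample documents.
docs = [
    {"transaction_reference": "T1", "object_creation_date": date(2015, 1, 1)},
    {"transaction_reference": "T1", "object_creation_date": date(2015, 3, 1)},
    {"transaction_reference": "T2", "object_creation_date": date(2015, 2, 1)},
]
latest = {}
for doc in docs:
    ref = doc["transaction_reference"]
    # Keep the document with the greatest creation date seen so far.
    if ref not in latest or doc["object_creation_date"] > latest[ref]["object_creation_date"]:
        latest[ref] = doc
print(latest)
```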

mongo aggregation query in golang with mgo driver

Question: I have the following query in mongodb - db.devices.aggregate({ $match: {userId: "v73TuQqZykbxFXsWo", state: true}}, { $project: { userId: 1, categorySlug: 1, weight: { $cond: [ {"$or": [ {$eq: ["$categorySlug", "air_fryer"] }, {$eq: ["$categorySlug", "iron"] } ] }, 0, 1] } } }, {$sort: {weight: 1}}, { $limit : 10 } ); I'm trying to write this in golang using the mgo driver but I'm not able to wrap my head around this at all! How do I translate this to a golang mgo query? Answer 1: The examples on the

MongoDB aggregation performance capability

Question: I am trying to work through some performance considerations about using MongoDB for a considerable number of documents to be used in a variety of aggregations. I have read that a collection has 32TB capacity depending on the sizes of chunk and shard key values. If I have 65,000 customers who each supply to us (on average) 350 sales transactions per day, that ends up being about 22,750,000 documents getting created daily. When I say a sales transaction, I mean an object which is like an invoice

MongoDb aggregation query with $group and $push into subdocument

Question: I have a question regarding the $group argument of MongoDB aggregations. My data structure looks as follows: My "Event" collection contains this single document: { "_id": ObjectId("mongodbobjectid..."), "name": "Some Event", "attendeeContainer": { "min": 0, "max": 10, "attendees": [ { "type": 1, "status": 2, "contact": ObjectId("mongodbobjectidHEX1") }, { "type": 7, "status": 4, "contact": ObjectId("mongodbobjectidHEX2") } ] } } My "Contact" collection contains these documents: { "_id":

Count Elements SubDocument that match a given criterion

Question: I have the following document structure in mongodb { "_id" : "123", "first_name" : "Lorem", "last_name" : "Ipsum", "conversations" : { "personal" : [ { "last_message" : "Hello bar", "last_read" : 1474456404 }, { "last_message" : "Hello foo", "last_read" : 1474456404 }, ... ], "group" : [ { "last_message" : "Hello Everyone", "last_read" : null } ... ] } } I want to count the number of conversations from the sub-arrays, personal and group, where the last_read is null, for a given user. Please
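A hedged sketch of one way to count such entries, assuming a MongoDB version that provides `$filter` and `$concatArrays` (3.2 or later): concatenate both sub-arrays, keep the elements whose `last_read` is null, and take the size. The output field name `unread` is made up for this example, and elements where `last_read` is absent rather than explicitly null may need separate handling. The sample document reuses only the entries visible in the question.

```python
pipeline = [
    # Restrict to the given user.
    {"$match": {"_id": "123"}},
    {"$project": {
        # "unread" is a hypothetical output field name.
        "unread": {"$size": {"$filter": {
            "input": {"$concatArrays": [
                "$conversations.personal",
                "$conversations.group",
            ]},
            "as": "c",
            # None is sent to the server as null.
            "cond": {"$eq": ["$$c.last_read", None]},
        }}},
    }},
]

# Pure-Python equivalent on the entries shown in the question.
doc = {
    "_id": "123",
    "conversations": {
        "personal": [
            {"last_message": "Hello bar", "last_read": 1474456404},
            {"last_message": "Hello foo", "last_read": 1474456404},
        ],
        "group": [
            {"last_message": "Hello Everyone", "last_read": None},
        ],
    },
}
conversations = doc["conversations"]["personal"] + doc["conversations"]["group"]
unread = sum(1 for c in conversations if c["last_read"] is None)
print(unread)
```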

Save a subset of MongoDB(3.0) collection to another collection in Python

Question: I found this answer - Answer link db.full_set.aggregate([ { $match: { date: "20120105" } }, { $out: "subset" } ]); I want to do the same thing but with the first 15000 documents in the collection. I couldn't find how to apply a limit to such a query (I tried using $limit : 15000 , but it doesn't recognize $limit). Also, when I tried - db.subset.insert(db.full_set.find({}).limit(15000).toArray()) there is no function toArray() for output type cursor . How can I accomplish this? Answer 1: Well, in python, this is
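As a rough illustration, independent of the truncated answer above: `$limit` is a pipeline stage of its own, so it can be placed as a separate stage before `$out`, which must be the last stage. The pipeline is plain Python data (with PyMongo it would typically be passed to `db.full_set.aggregate(pipeline)`). Without an explicit `$sort`, "first 15000" means whatever order the server returns documents in. The in-memory list below is a hypothetical stand-in for the collection.

```python
LIMIT = 15000

pipeline = [
    # Each stage is a separate dict; $limit is not nested inside another stage.
    {"$limit": LIMIT},
    # $out writes the result to the "subset" collection and must come last.
    {"$out": "subset"},
]

# Pure-Python stand-in: take the first LIMIT items of a hypothetical full set.
full_set = [{"n": i} for i in range(20000)]
subset = full_set[:LIMIT]
print(len(subset))
```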

Finding most commonly used word in a string field throughout a collection

Question: Let's say I have a Mongo collection similar to the following: [ { "foo": "bar baz boo" }, { "foo": "bar baz" }, { "foo": "boo baz" } ] Is it possible to determine which words appear most often within the foo field (ideally with a count)? For instance, I'd love a result set of something like: [ { "baz" : 3 }, { "boo" : 2 }, { "bar" : 2 } ] Answer 1: A JIRA issue was recently closed about a $split operator to be used in the $project stage of the aggregation framework. With that in place you
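A sketch of how a `$split`-based pipeline could look, assuming a MongoDB version where `$split` is available (3.4 or later); it is not necessarily what the truncated answer goes on to describe. Each string is split on spaces, the resulting array is unwound, and the words are grouped and counted. The pipeline is plain Python data (with PyMongo it would typically be passed to `collection.aggregate(pipeline)`); the intermediate field name `words` is made up for this example.

```python
from collections import Counter

pipeline = [
    # Split each "foo" string on single spaces into an array of words.
    {"$project": {"words": {"$split": ["$foo", " "]}}},
    # One output document per word.
    {"$unwind": "$words"},
    # Count occurrences of each word.
    {"$group": {"_id": "$words", "count": {"$sum": 1}}},
    # Most frequent words first.
    {"$sort": {"count": -1}},
]

# Pure-Python equivalent on the sample collection from the question.
docs = [{"foo": "bar baz boo"}, {"foo": "bar baz"}, {"foo": "boo baz"}]
word_counts = Counter(word for d in docs for word in d["foo"].split(" "))
print(word_counts.most_common())
```

The result has the shape `{ "_id": <word>, "count": <n> }` per word rather than the `{ "baz": 3 }` form asked for, so a further reshaping step would be needed to match that exactly.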