Question
I take input from a search box and insert it into MongoDB as a document using a regular insert query. For example, the word "cancer" is stored in a collection in the following format, with a unique "_id":
{
"_id": {
"$oid": "553862fa49aa20a608ee2b7b"
},
"0": "c",
"1": "a",
"2": "n",
"3": "c",
"4": "e",
"5": "r"
}
Each document holds a single word stored in the same format as above, and I have many such documents. Now I want to remove the duplicate documents from the collection, but I cannot figure out how to do that. Please help.
Answer 1:
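An easy solution in the mongo shell:
use your_db
// Build the index keys '0', '1', ... up to the maximum expected letter count (10 here; a compound index allows at most 32 fields).
var keys = {};
for (var i = 0; i < 10; i++) { keys[String(i)] = 1; }
db.your_collection.createIndex(keys, {unique: true, dropDups: true, sparse: true, name: 'dropdups'})
db.your_collection.dropIndex('dropdups')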
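Notes:
- If you have many documents, expect this procedure to take a very long time.
- Be careful: this removes documents in place, so it is better to clone your collection first and try it there.
- The dropDups option was removed in MongoDB 3.0, so this approach only works on older versions.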
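Since the index approach depends on the server version, the same de-duplication logic can also be sketched in plain JavaScript. This is only a minimal illustration, not the answer's method: the helper names `wordOf` and `duplicateIds` and the sample documents are hypothetical, and the ids are simplified to strings. It rebuilds each word from the numbered keys, keeps the first document seen for every word, and collects the "_id" values of the rest.

```javascript
// Rebuild the word from a document whose keys "0", "1", ... hold one character each.
function wordOf(doc) {
  return Object.keys(doc)
    .filter((k) => /^\d+$/.test(k))          // ignore _id and any non-numeric keys
    .sort((a, b) => Number(a) - Number(b))   // order characters by position
    .map((k) => doc[k])
    .join("");
}

// Return the _id of every document whose word already appeared earlier in the list.
function duplicateIds(docs) {
  const seen = new Set();
  const dupes = [];
  for (const doc of docs) {
    const word = wordOf(doc);
    if (seen.has(word)) {
      dupes.push(doc._id);
    } else {
      seen.add(word);
    }
  }
  return dupes;
}

// Hypothetical sample data in the question's format (ids simplified to strings).
const docs = [
  { _id: "a1", 0: "c", 1: "a", 2: "n", 3: "c", 4: "e", 5: "r" },
  { _id: "a2", 0: "c", 1: "a", 2: "t" },
  { _id: "a3", 0: "c", 1: "a", 2: "n", 3: "c", 4: "e", 5: "r" },
];

console.log(duplicateIds(docs)); // [ 'a3' ]
```

In the shell, the input could presumably come from db.your_collection.find().toArray(), and the returned ids could then be passed to db.your_collection.deleteMany({_id: {$in: ids}}). As with the index approach, trying this on a copy of the collection first is the safer choice.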
Source: https://stackoverflow.com/questions/29818667/mongodb-query-to-remove-duplicate-documents-from-a-collection