I\'m trying to use mongoimport to upsert data with string values in _id. Since the ids look like integers (even though they\'re in quotes), mongoimport treats them as intege
Unfortunately there is not now a way to force number-like strings to be interpreted as strings:
https://jira.mongodb.org/browse/SERVER-3731
You could write a script in Python or some other language with which you're comfortable, along the lines of:
import csv, pymongo
connection = pymongo.Connection()
collection = connection.mydatabase.mycollection
reader = csv.DictReader(open('myfile.csv'))
for line in reader:
print '_id', line['_id']
upsert_fields = {
'_id': line['_id'],
'my_other_upsert_field': line['my_other_upsert_field']}
collection.update(upsert_fields, line, upsert=True, safe=True)
Just encountered this same issue and discovered an alternative. You can force Mongo to use string types for non-string values by converting your CSV to JSON and quoting the field. For example, if your CSV looks like this:
key value
123 foo
abc bar
Then you'll get an integer field for key 123 and a string field for key abc. If you convert that to JSON, making sure that all the keys are quoted, and then use --type json
when you import, you'll end up with the desired behavior:
{
"123":"foo",
"abc":"bar"
}
I was able to prefix the numeric string and that worked for me. Example:
00012345 was imported as 12345 (Type Int) string00012345 was imported as string00012345 (Type String)
My source was a SQL database so I just did
select 'string'+column as name
Of course, you also need to do a bit of post-processing to parse the string, but far less effort than converting a rather large tsv file to json.
I also added +1 to the jira link above for the enhancement.
As an alternative to @Jesse, you can do something similar in the mongo console, e.g.
db.my_collection.find().forEach(function (obj) {
db.my_collection.remove({_id: obj._id); // remove the old one
obj._id = '' + obj._id; // change to string
db.my_collection.save(obj); // resave
});
For non _id
fields you can simply do:
db.my_collection.find().forEach(function (obj) {
obj.someField = '' + obj.someField; // change to string
db.my_collection.save(obj); // resave
});
I encountered the same issue.
I feel the simplest way is to convert the CSV file to a JSON file using an online tool and then import.
This is the tool I used:
http://www.convertcsv.com/csv-to-json.htm
It lets you wrap the integer values of your CSV file in double quotes for your JSON file.
If you have trouble importing this JSON file and encountering an error, just add --jsonArray to your import command. It will work for sure.
mongoimport --host localhost --db mydb -c mycollection --type json --jsonArray --file <file_path>