pymongo

List comprehension with cursor from pymongo

旧时模样 submitted on 2019-12-13 01:35:21
Question: Here is my pymongo code:

    from pymongo import MongoClient

    client = MongoClient('localhost', 27017)
    db = client['somedb']
    collection = db.somecollection
    return_obj = collection.find({"field1": "red"})

    # First print statement
    print([item['field1'] for item in return_obj])
    # Second print statement
    print([item['field1'] for item in return_obj])

The first print statement produces a non-empty list, while the second one produces an empty list, as if I have to reset an index on return_obj. Any ideas?

Answer 1: This is the correct behaviour: this is how cursors work. A pymongo cursor is exhausted once it has been fully iterated, so the second comprehension has nothing left to consume.
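A minimal sketch of the two standard fixes, reusing the somedb/somecollection names from the question: either materialize the cursor into a list once and iterate the list, or call rewind() to reset the cursor to its unevaluated state.

    from pymongo import MongoClient

    client = MongoClient('localhost', 27017)
    collection = client['somedb'].somecollection
    cursor = collection.find({"field1": "red"})

    # Option 1: materialize once; a list can be iterated any number of times.
    docs = list(cursor)
    print([item['field1'] for item in docs])
    print([item['field1'] for item in docs])

    # Option 2: rewind() resets the cursor so it can be re-evaluated.
    cursor.rewind()
    print([item['field1'] for item in cursor])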

MongoDB - How to $slice a sub-subarray

假如想象 submitted on 2019-12-13 00:37:01
Question: Given a document like

    {
        data: {
            '2015': ['a', 'b', 'c', ...],  // <array of n datapoints>
            '2016': ['d', 'e', 'f', ...],  // <array of n datapoints>
        },
        someOtherField: {...}
    }

I am trying to query a slice of one of the arrays within data in the following way:

    db.collection.find({}, {'data.2015': {'$slice': [3, 5]}})

The query returns the entire data field. Does that mean I can't $slice a sub-subarray? What would be the correct way to get a $slice of the data.2015 array?

Solution: db.collection.find({}, …
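The accepted solution is truncated above. As a hedged sketch only: older servers (before MongoDB 4.4) are known to handle $slice projections on nested dotted paths inconsistently, so the aggregation $slice operator, which takes the array, a start position, and a count, is the more dependable route (database and collection names are assumptions):

    from pymongo import MongoClient

    coll = MongoClient('localhost', 27017)['somedb'].somecollection

    # $slice as an aggregation expression: 5 elements starting at index 3
    # of the nested data.2015 array, returned under a new field name.
    pipeline = [{'$project': {'slice2015': {'$slice': ['$data.2015', 3, 5]}}}]
    for doc in coll.aggregate(pipeline):
        print(doc['slice2015'])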

Locking a document in MongoDB

一世执手 submitted on 2019-12-12 20:51:23
Question: I'm using pymongo in a web app, and want to do something of the form:

    doc = collection.find_one(document)
    doc['array1'].append('foo')
    for y in doc['array2']:
        ...  # do things with y
    doc['array2'] = [x for x in doc['array2'] if ...]
    doc['x'] = len(doc['array2'])
    collection.save(doc)

Is there any simple way I can handle multiple requests dealing with the same document, and prevent one from clobbering the results of another or being invalidated because it is editing a stale version?

Answer 1: Take a look at the section in the MongoDB documentation on atomic operations; the usual approach is optimistic concurrency, an "update if current" loop keyed on a version field.
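A minimal sketch of that "update if current" pattern, assuming every document carries a _version counter (the field name, collection, and usage query are hypothetical): each write is conditioned on the version that was read, so a concurrent writer makes the replace match nothing and the loop rereads and retries.

    from pymongo import MongoClient

    coll = MongoClient('localhost', 27017)['somedb'].somecollection

    def update_with_retry(query, mutate, max_retries=5):
        for _ in range(max_retries):
            doc = coll.find_one(query)
            if doc is None:
                return None
            version = doc['_version']
            mutate(doc)                    # apply the in-memory changes
            doc['_version'] = version + 1
            result = coll.replace_one(
                {'_id': doc['_id'], '_version': version}, doc)
            if result.modified_count == 1:
                return doc                 # nobody wrote in between
        raise RuntimeError('too much contention on document')

    # Usage (illustrative query), mirroring the question's pseudocode:
    update_with_retry({'x': 0}, lambda d: d['array1'].append('foo'))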

ValueError: Extra Data error when importing json file using python

两盒软妹~` submitted on 2019-12-12 20:22:11
Question: I'm trying to build a Python script that imports JSON files into MongoDB. This part of my script keeps jumping to the except ValueError branch for larger JSON files. I think it has something to do with parsing the JSON file line by line, because very small JSON files seem to work.

    def read(jsonFiles):
        from pymongo import MongoClient
        client = MongoClient('mongodb://localhost:27017/')
        db = client[args.db]
        counter = 0
        for jsonFile in jsonFiles:
            with open(jsonFile, 'r') as f:
                for line in f:
                    # load …
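The excerpt cuts off inside the loop, but the symptom (a ValueError only on larger files) is typical of feeding fragments of a pretty-printed, multi-line JSON document to json.loads one physical line at a time. A hedged sketch that parses each file as a whole instead (the database and collection names are assumptions):

    import json
    from pymongo import MongoClient

    def read(json_files, db_name='somedb'):
        client = MongoClient('mongodb://localhost:27017/')
        db = client[db_name]
        for json_file in json_files:
            with open(json_file, 'r') as f:
                data = json.load(f)  # parse the whole file, not line by line
            # A top-level array becomes many documents; an object becomes one.
            if isinstance(data, list):
                db.imports.insert_many(data)
            else:
                db.imports.insert_one(data)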

Using Python to analyze Douban short reviews of "Dying to Survive" (《我不是药神》)

[亡魂溺海] submitted on 2019-12-12 20:16:52
Light scraping is good fun, moderate scraping hurts you, and relentless scraping turns you to ash. Scraping carries risk; use it with caution. Perhaps because I scraped Douban Movies too much over the past couple of days, this morning I was told at login that my account had been banned (I was scraping with my own account; was that asking for trouble, or asking for trouble ...). Luckily it was unbanned after SMS verification. ^_^

In a previous article, the short-review data was already loaded into Mongo; today we pull it out for some simple analysis. The most popular approach at the moment is word-frequency statistics plus a word cloud, and that is exactly what this post covers.

Reading the short-review data from Mongo and doing Chinese word segmentation

For some reason, my crawl actually produced only 1,000 short reviews (no more, no less, exactly 1,000). I kept feeling something was off, but after repeating the crawl several times, that really is all there is. Maybe something is wrong with my crawler; a link to the source code is attached at the end of the article for anyone interested. Criticism welcome (gently).

    import pymongo
    import jieba
    from jieba import analyse
    # https://pypi.org/project/pymongo/
    # http://github.com/mongodb/mongo-python-driver
    from matplotlib import pyplot
    from wordcloud import WordCloud

    text = None
    with pymongo.MongoClient(host='192.168.0.105', port …
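The original snippet is truncated at the MongoClient call. A minimal sketch of the pipeline the post describes, with the host carried over from the fragment and the port, database, collection, field name, and font path all assumptions: pull the review text from Mongo, segment it with jieba, and render a word cloud.

    import jieba
    import pymongo
    from matplotlib import pyplot
    from wordcloud import WordCloud

    with pymongo.MongoClient(host='192.168.0.105', port=27017) as client:
        comments = client['douban']['short_comments']  # assumed names
        text = ' '.join(doc['comment'] for doc in comments.find())

    # Chinese text has no spaces; segment first and rejoin with spaces
    # so WordCloud can count token frequencies.
    segmented = ' '.join(jieba.cut(text))

    # font_path must point to a font with CJK glyphs or the cloud
    # renders as empty boxes; this path is an assumption.
    cloud = WordCloud(font_path='simhei.ttf', width=800, height=600)
    cloud.generate(segmented)
    pyplot.imshow(cloud, interpolation='bilinear')
    pyplot.axis('off')
    pyplot.show()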

A way to ensure exclusive reads in MongoDb's findAndModify?

寵の児 submitted on 2019-12-12 18:16:05
Question: I have a MongoDB collection (used as a job queue) from which multiple processes read records, using findAndModify. findAndModify searches for records where the active field is "false", setting it to "true" so that other processes do not read the same record. The problem is that, looking at the logs, I see different processes still reading the same records. This seems to occur when two processes read from the queue at the same time. Is there any way to make sure that only one process reads …
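The question is truncated, but a hedged sketch of the usual pymongo pattern: find_one_and_update is atomic on a single document, so the filter and the update run as one server-side step and no two workers can claim the same record (the collection name and worker tag are assumptions; the active field follows the question).

    import os
    from pymongo import MongoClient, ReturnDocument

    queue = MongoClient('localhost', 27017)['somedb'].jobs

    def claim_next_job():
        # Atomically find an unclaimed record and flip its active flag;
        # a concurrent worker can never see the record in the unclaimed
        # state once this step has run.
        return queue.find_one_and_update(
            {'active': False},
            {'$set': {'active': True, 'worker': os.getpid()}},
            return_document=ReturnDocument.AFTER,
        )

    job = claim_next_job()
    if job is None:
        print('queue is empty')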

Listing users for certain DB with PyMongo

Deadly submitted on 2019-12-12 16:33:22
Question:

What I'm trying to achieve: I'm trying to fetch the users for a certain database.

What I did so far: I was able to find functions to list the databases or create users, but none for listing the users. I thought about invoking an arbitrary command such as show users, but I couldn't find any way to do it.

Current code:

    #!/usr/bin/python
    from pymongo import MongoClient

    client = MongoClient("localhost", 27017)
    db = client.this_mongo

Trial and error: I can see the DB names and print them, but nothing further: db …
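A hedged sketch of the usual route: the shell's show users helper is sugar for the usersInfo database command, which pymongo can issue through db.command (the database name is reused from the question).

    from pymongo import MongoClient

    client = MongoClient('localhost', 27017)
    db = client.this_mongo

    # usersInfo: 1 asks for every user defined on this database,
    # the equivalent of 'show users' in the mongo shell.
    result = db.command({'usersInfo': 1})
    for user in result['users']:
        print(user['user'], user['roles'])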

Check for existence of multiple fields in MongoDB document

不羁的心 submitted on 2019-12-12 10:56:02
Question: I am trying to query a database collection, which holds documents of processes, for those documents that have specific fields. For simplicity, imagine the following general document schema:

    {
        "timestamp": ISODate("..."),
        "result1": "pass",
        "result2": "fail"
    }

Now, when a process is started, a new document is inserted with only the timestamp. When that process reaches certain stages, the fields result1 and result2 are added over time. Some processes, however, do not reach stages 1 or 2 and …
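The text is cut off, but a hedged sketch of the standard approach, assuming a collection named processes: top-level query clauses combine with an implicit AND, so one $exists test per field matches only the documents that carry all of them.

    from pymongo import MongoClient

    processes = MongoClient('localhost', 27017)['somedb'].processes

    # Matches only documents where both result fields were added,
    # i.e. the process reached both stages.
    finished = processes.find({
        'result1': {'$exists': True},
        'result2': {'$exists': True},
    })
    for doc in finished:
        print(doc['timestamp'], doc['result1'], doc['result2'])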

PyMongo: What happens to cursor when no_cursor_timeout=True

元气小坏坏 submitted on 2019-12-12 10:55:44
Question: Looking at the cursor docs for MongoDB, I don't see a way to delete a cursor. What happens in PyMongo if I am using a cursor with the no_cursor_timeout property set to True? Is the cursor deleted when my script terminates, even if I have not gotten to the end of the cursor's results?

Answer 1: Python uses reference counting for object lifetime management; when the Cursor object goes out of scope, the garbage collector calls __die(), which closes the cursor. If you want explicit control, you can call the cursor's close() method yourself.
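A minimal sketch of that explicit control (database, collection, and the per-document work are assumptions): open the scan with no_cursor_timeout=True and put close() in a finally block, so the server-side cursor is released even if the loop raises, since the server will not reap it on its own.

    from pymongo import MongoClient

    collection = MongoClient('localhost', 27017)['somedb'].somecollection

    def process(doc):
        print(doc['_id'])  # hypothetical per-document work

    cursor = collection.find({}, no_cursor_timeout=True)
    try:
        for doc in cursor:
            process(doc)
    finally:
        cursor.close()  # explicitly free the server-side cursor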

Mongoengine, retriving only some of a MapField

匆匆过客 submitted on 2019-12-12 10:13:26
Question: For example, in MongoDB:

    > db.test.findOne({}, {'mapField.FREE': 1})
    {
        "_id" : ObjectId("4fb7b248c450190a2000006a"),
        "mapField" : {
            "BOXFLUX" : {
                "a" : "f",
            }
        }
    }

The 'mapField' field is a MongoEngine MapField, and it holds a lot of keys and data, but I want to retrieve only 'BOXFLUX'. This query is not working in MongoEngine, for example:

    BoxfluxDocument.objects( ~~ querying ~~ ).only('mapField.BOXFLUX')

As you can see, only('mapField.BOXFLUX') or only only( …
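The entry is truncated, but as a hedged workaround sketch: a reliable fallback is to drop to the underlying pymongo collection and project the one map key with ordinary dot notation (the model below is a minimal stand-in; the real BoxfluxDocument is not shown in the question). Some MongoEngine versions may also accept the double-underscore path 'mapField__BOXFLUX' in .only(), but that is not verified here.

    from mongoengine import connect, DictField, Document, MapField

    connect('somedb')  # assumed database name

    class BoxfluxDocument(Document):
        # Minimal stand-in model for illustration only.
        mapField = MapField(DictField())

    # Project a single map key via the raw pymongo collection.
    coll = BoxfluxDocument._get_collection()
    for doc in coll.find({}, {'mapField.BOXFLUX': 1}):
        print(doc.get('mapField', {}).get('BOXFLUX'))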