pymongo

pymongo error when writing

假如想象 posted on 2019-12-21 04:12:25
Question: I am unable to do any writes to a remote MongoDB database. I am able to connect and do lookups (e.g. find). I connect like this:

    conn = pymongo.MongoClient(db_uri,slaveOK=True)
    db = conn.test_database
    coll = db.test_collection

But when I try to insert,

    coll.insert({'a':1})

I run into an error:

    ---------------------------------------------------------------------------
    AutoReconnect Traceback (most recent call last)
    <ipython-input-56-d4ffb9e3fa79> in <module>()
    ----> 1 coll.insert({'a':1})

pymongo: remove duplicates (map reduce?)

醉酒当歌 posted on 2019-12-21 04:05:06
Question: I have a database with several collections (about 15 million documents overall), and the documents look like this (simplified):

    {'Text': 'blabla', 'ID': 101}
    {'Text': 'Whuppppyyy', 'ID': 102}
    {'Text': 'Abrakadabraaa', 'ID': 103}
    {'Text': 'olalalaal', 'ID': 104}
    {'Text': 'test1234545', 'ID': 104}
    {'Text': 'whapwhapwhap', 'ID': 104}

They all have a unique _id field as well, but I want to delete duplicates according to another field (the external ID field). First, I tried a very manual approach with lists
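The idea in this question (keep the first document for each external ID and delete the rest) can be sketched in plain Python. This is only an illustration, not the asker's code or an accepted answer: the helper name `find_duplicate_ids` and the sample `_id` values are hypothetical, and it assumes the set of distinct ID values fits in memory.

```python
def find_duplicate_ids(documents, key="ID"):
    """Return the _id of every document whose `key` value was already seen.

    The first document for each value is kept; later ones are reported.
    """
    seen = set()
    duplicates = []
    for doc in documents:
        value = doc[key]
        if value in seen:
            duplicates.append(doc["_id"])
        else:
            seen.add(value)
    return duplicates


# Hypothetical sample data shaped like the documents in the question.
sample = [
    {"_id": 1, "Text": "blabla", "ID": 101},
    {"_id": 2, "Text": "olalalaal", "ID": 104},
    {"_id": 3, "Text": "test1234545", "ID": 104},
    {"_id": 4, "Text": "whapwhapwhap", "ID": 104},
]
duplicate_ids = find_duplicate_ids(sample)
```

With PyMongo 3 or later, the documents could come from a cursor such as `coll.find({}, {"ID": 1})`, and the resulting list could be passed in batches to `coll.delete_many({"_id": {"$in": batch}})`. Whether that is fast enough for 15 million documents is not something the excerpt settles.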

Mongodb bulk write error

*爱你&永不变心* posted on 2019-12-21 03:28:29
Question: I'm executing a bulk write

    bulk = new_packets.initialize_ordered_bulk_op()
    bulk.insert(packet)
    output = bulk.execute()

and getting an error that I interpret to mean that packet is not a dict. However, I do know that it is a dict. What could be the problem? Here is the error:

    BulkWriteError Traceback (most recent call last)
    <ipython-input-311-93f16dce5714> in <module>()
          2
          3 bulk.insert(packet)
    ----> 4 output = bulk.execute()
    C:\Users\e306654\AppData\Local\Continuum\Anaconda\lib\site-packages

Why does PyMongo throw AutoReconnect?

冷暖自知 posted on 2019-12-21 03:25:24
Question: While researching some strange issues with my Python web application (in particular, issues regarding MongoDB connectivity), I noticed something on the official PyMongo documentation page. My web application uses Flask, but this shouldn't influence the issue I'm facing. The PyMongo driver does connection pooling, but it also throws an exception (AutoReconnect) when a connection is stale and a reconnect is due. It states that (regarding the AutoReconnect exception): In order to auto
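Since AutoReconnect is raised to the caller rather than handled silently, an application usually has to decide for itself whether to retry. A minimal sketch of such a retry wrapper follows. The names `call_with_retry`, `FakeAutoReconnect`, and `flaky_find` are hypothetical; the stand-in exception is used so the example runs without a MongoDB server, and in real code `retry_on` would be `pymongo.errors.AutoReconnect`.

```python
import time


def call_with_retry(operation, retry_on, attempts=3, delay=0.0):
    """Call operation(); retry when it raises `retry_on`, up to `attempts` tries."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: let the caller see the error
            time.sleep(delay * attempt)  # simple linear back-off


class FakeAutoReconnect(Exception):
    """Stand-in for pymongo.errors.AutoReconnect (assumption for this demo)."""


calls = {"count": 0}


def flaky_find():
    # Fails twice, then succeeds, to imitate a stale pooled connection.
    calls["count"] += 1
    if calls["count"] < 3:
        raise FakeAutoReconnect("connection was stale")
    return {"a": 1}


result = call_with_retry(flaky_find, FakeAutoReconnect, attempts=5)
```

Retrying is only safe when the operation can be repeated without harm: a read can be, but an insert that failed after reaching the server might be applied twice.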

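Working with MongoDB using pymongo

試著忘記壹切 posted on 2019-12-21 02:48:48
Contents
    Working with MongoDB using pymongo
    Installing, starting, and connecting
    MongoDB
    pymongo
    Connecting to MongoDB, selecting a database, selecting a collection
    Inserting data
    Querying
    Basic queries
    Conditional queries
    Counting
    Sorting
    Offset
    Updating
    Deleting
    Other operations

Working with MongoDB using pymongo

Installing, starting, and connecting

MongoDB

Official website: https://www.mongodb.com
Official documentation: https://docs.mongodb.com
GitHub: https://github.com/mongodb
Chinese tutorial: http://www.runoob.com/mongodb/mongodb-tutorial.html

Installing on Ubuntu 16.04:

Import the MongoDB GPG key:

    sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 0C49F3730359A14518585931BC711F9BA15703C6

Create the apt-get source list:

    echo "deb [ arch=amd64,arm64 ] http://repo.mongodb.org/apt/ubuntu xenial/mongodb-org/3.4 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-3.4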

Ridiculously slow mongoDB query on small collection in simple but big database

不问归期 posted on 2019-12-21 02:28:08
Question: So I have a super simple database in MongoDB with a few collections:

    > show collections
    Aggregates      <-- count: 92
    Users           <-- count: 68222
    Pages           <-- count: 1728288847, about 1.1TB
    system.indexes

The Aggregates collection is an aggregate of the Pages collection, and each document looks like this:

    > db.Aggregates.findOne()
    { "_id" : ObjectId("50f237126ba71610eab3aaa5"), "daily_total_pages" : 16929799, "day" : 21, "month" : 9, "year" : 2011 }

Very simple. However, let's try and get the total page

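Working with MongoDB in Python (pymongo)

青春壹個敷衍的年華 posted on 2019-12-20 23:40:55
Introduction to MongoDB (excerpted from: http://www.runoob.com/mongodb/mongodb-intro.html )

MongoDB is written in C++ and is an open-source database system based on distributed file storage. MongoDB stores data as documents whose structure consists of key-value pairs (key=>value), similar to JSON objects. MongoDB is a NoSQL database; NoSQL stands for "Not Only SQL" and refers broadly to non-relational databases. (For the differences between relational database management systems (RDBMS) and non-relational databases (NoSQL), see: https://www.cnblogs.com/HuZihu/p/10233242.html )

Some basic MongoDB terminology

    SQL term/concept    MongoDB term/concept    Explanation
    database            database                database
    table               collection              database table / collection
    row                 document                data record row / document
    column              field                   data field
    index               index                   index
    table joins         -                       table joins; not supported by MongoDB
    primary key         primary key             primary key; MongoDB automatically sets the _id field as the primary key

Using Python with MongoDB

Next we use Python to work with MongoDB. First we need to install the PyMongo library (pip install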

How do I compare dates from Twitter data stored in MongoDB via PyMongo?

若如初见. posted on 2019-12-20 14:06:13
Question: Are the dates stored in the 'created_at' fields marshaled to Python datetime objects via PyMongo, or do I have to manually replace the text strings with Python Date objects? i.e. How do I convert a property in MongoDB from text to date type? It seems highly unnatural that I would have to replace the date strings with Python date objects, which is why I'm asking the question. I would like to write queries that display the tweets from the past three days. Please let me know if there is a slick
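PyMongo returns BSON date values as Python datetime objects, but a field that was stored as a string stays a string, so string dates have to be parsed before they can be compared. The sketch below assumes the 'created_at' values use the Twitter API's text format (for example "Wed Aug 27 13:08:45 +0000 2008"); the helper names and the fixed `now` value are hypothetical and chosen only for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumed layout of Twitter's created_at strings.
TWITTER_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def parse_created_at(text):
    """Parse a Twitter-style created_at string into a timezone-aware datetime."""
    return datetime.strptime(text, TWITTER_FORMAT)


def is_within_days(created_at, days, now):
    """Return True if the tweet was created within `days` days before `now`."""
    return now - parse_created_at(created_at) <= timedelta(days=days)


# Fixed reference time so the example is deterministic.
now = datetime(2008, 8, 29, 12, 0, 0, tzinfo=timezone.utc)
recent = is_within_days("Wed Aug 27 13:08:45 +0000 2008", 3, now)
old = is_within_days("Wed Aug 20 13:08:45 +0000 2008", 3, now)
```

Once the stored field holds real datetime values, a range filter such as `{'created_at': {'$gte': cutoff}}` lets the server do the comparison instead of the client.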

MongoLab/PyMongo connection error

陌路散爱 posted on 2019-12-20 10:44:03
Question: If I run in the shell:

    mongo ds0219xx.mlab.com:219xx/dbname -u user -p pass

it works and allows me to connect to the database and pull information. But if I'm within my Python application (Flask) and run this:

    import pymongo
    client = pymongo.MongoClient("mongodb://user:pass@ds0219xx.mlab.com:219xx/dbname")
    db = client["dbname"]
    db.users.insert_one({ "user1": "hello" })

it gives me:

    pymongo.errors.OperationFailure: Authentication failed.

I'm pretty sure it's failing before it gets to the
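The excerpt does not say what caused the failure. One possible cause worth ruling out, offered here only as an assumption, is a username or password containing characters such as `@`, `:` or `/`, which must be percent-escaped inside a MongoDB URI. The helper `build_mongo_uri`, the host `example.com`, and the credentials below are all hypothetical.

```python
from urllib.parse import quote_plus


def build_mongo_uri(user, password, host, port, dbname):
    """Build a mongodb:// URI with percent-escaped credentials."""
    return "mongodb://{}:{}@{}:{}/{}".format(
        quote_plus(user), quote_plus(password), host, port, dbname
    )


# Placeholder values, not the ones from the question.
uri = build_mongo_uri("user", "p@ss:word/1", "example.com", 27017, "dbname")
```

The resulting string can be passed to `pymongo.MongoClient(uri)`. If the credentials contain no special characters, the cause lies elsewhere, for example in the driver version or the authentication mechanism.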

Getting 'TypeError: ObjectId('') is not JSON serializable' when using Flask 0.10.1

烈酒焚心 posted on 2019-12-20 10:24:07
Question: I forked the Flask example, Minitwit, to work with MongoDB and it was working fine on Flask 0.9, but after upgrading to 0.10.1 I get the error in the title when I log in and try to set the session id. It seems there were changes in Flask 0.10.1 related to JSON. Code snippet:

    user = db.minitwit.user.find_one({'username': request.form['username']})
    session['_id'] = user['_id']

Full code in my GitHub repo. Basically, I set the Flask session id to the user's _id from MongoDB. I tried the first two
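Flask 0.10 switched its cookie-based sessions to a JSON serialization, so any value placed in the session has to be JSON-serializable, and an ObjectId is not. A minimal sketch of one common workaround, storing the string form of the id, follows. `FakeObjectId` is a hypothetical stand-in for `bson.ObjectId` so the example runs without PyMongo, and the plain dict stands in for Flask's `session`.

```python
import json


class FakeObjectId:
    """Stand-in for bson.ObjectId (assumption for this demo)."""

    def __init__(self, hex_value):
        self._hex = hex_value

    def __str__(self):
        return self._hex


def to_session_value(object_id):
    """Convert an ObjectId-like value to a JSON-safe string."""
    return str(object_id)


user = {"_id": FakeObjectId("0123456789abcdef01234567"), "username": "alice"}

session = {}  # plain dict standing in for flask.session
session["_id"] = to_session_value(user["_id"])
encoded = json.dumps(session)  # succeeds because the value is now a str
```

When the id is read back from the session, it can be turned into an ObjectId again with `bson.ObjectId(session['_id'])` before being used in a query.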