MongoDB - Error “Too many results for query, truncating output” with $geoNear

Submitted by 给你一囗甜甜゛ on 2019-12-13 17:21:27

Question


I'm running a $geoNear query on my sharded cluster (6 nodes forming 3 replica sets, each with 2 shardsvr members and 1 arbiter). I expect the query to return 1.1 million documents, but I am receiving only ~130,xxx documents. I am using the Java driver to issue the query and process the data (for now, I'm just counting the documents that get returned). I am using MongoDB 3.2.9 and the latest Java driver.

The mongod log shows the following error which is caused by the output document getting larger than 16MB:

2016-10-10T12:00:22.933+0200 W COMMAND  [conn22] Too many geoNear results for query { location: { $nearSphere: { type: "Point", coordinates: [ 10.xxxx, 52.xxxxx] }, $maxDistance: 3900.0 } }, truncating output.
2016-10-10T12:00:22.951+0200 I COMMAND  [conn22] command mydb.data command: geoNear { geoNear: "data", near: { type: "Point", coordinates: [ 10.xxxx, 52.xxxxx ] }, 
    num: 50000000, maxDistance: 3900.0, query: {}, spherical: true, distanceMultiplier: 1.0, includeLocs: true } keyUpdates:0 writeConflicts:0 numYields:890 reslen:16777310 
    locks:{ Global: { acquireCount: { r: 1784 } }, Database: { acquireCount: { r: 892 } }, Collection: { acquireCount: { r: 892 } } } protocol:op_query 589ms

2016-10-10T12:00:23.183+0200 I COMMAND  [conn22] getmore mydb.data query: { aggregate: "data", pipeline: [ { $geoNear: { near: { type: "Point", coordinates: [ 10.xxxx, 52.xxxxx ] }, 
    distanceField: "dist.calculated", limit: 50000000, maxDistance: 3900.0, query: {}, spherical: true, distanceMultiplier: 1.0, includeLocs: "dist.location" } }, { $project: { _id: false, 
    dist: { calculated: true } } } ], fromRouter: true, cursor: { batchSize: 0 } } cursorid:170255616227 ntoreturn:0 cursorExhausted:1 keyUpdates:0 writeConflicts:0 numYields:0 nreturned:43558 
    reslen:1568108 locks:{ Global: { acquireCount: { r: 1786 } }, Database: { acquireCount: { r: 893 } }, Collection: { acquireCount: { r: 893 } } } 820ms
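
The numbers in the log lines above can be checked against the 16 MB limit with a little arithmetic. This is only a rough sketch: it assumes the limit is 16 * 1024 * 1024 bytes, that the 43558 documents reported by the getmore are what fit into one truncated geoNear reply, and that the cluster has 3 shards (one per replica set). Under those assumptions the total comes out close to the ~130,xxx documents actually received.

```javascript
// Maximum BSON document size: 16 MiB.
var MAX_BSON_BYTES = 16 * 1024 * 1024;

// Values copied from the log lines above.
var geoNearReslen = 16777310; // reslen of the geoNear command
var nreturned = 43558;        // nreturned of the getmore on the same connection

// How far the geoNear reply went past the limit, in bytes.
var overshoot = geoNearReslen - MAX_BSON_BYTES;

// Assumption: the 43558 documents are what fit into one truncated reply,
// so this is only a rough average size per result.
var bytesPerResult = Math.floor(geoNearReslen / nreturned);

// Assumption: 3 shards (one per replica set), each truncated the same way.
var shards = 3;
var estimatedTotal = shards * nreturned;

console.log(overshoot, bytesPerResult, estimatedTotal);
```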

The Query:

db.data.aggregate([
   {
      $geoNear:{
         near:{
            type:"Point",
            coordinates:[
               10.xxxx,
               52.xxxxx
            ]
         },
         distanceField:"dist.calculated",
         maxDistance:3900,
         num:50000000,
         includeLocs:"dist.location",
         spherical:true
      }
   }
])

Note that I issued the query both with and without the parameter num; both attempts fail with the error shown above.

I expected the query to return chunks of the database once the document size limit (16 MB) gets exceeded. What am I missing? How can I retrieve all the data?

Edit: The query also fails with the same error in the mongod logs when I add a group stage:

db.data.aggregate([
   {
      $geoNear:{
         near:{
            type:"Point",
            coordinates:[
               10.xxxx,
               52.xxxxxx
            ]
         },
         distanceField:"dist.calculated",
         maxDistance:3900,
         includeLocs:"dist.location",
         num:2000000,
         spherical:true
      }
   },
   {
      $group:{
         _id:"$root_document"
      }
   }
])

Answer 1:


MongoDB staff member Lungang Fang has since answered my enquiry on the MongoDB user group. His answer follows:

Currently, the “geoNear” aggregation stage is limited to returning results that fit within the 16MB BSON size limit. This is related to an issue with earlier versions of MongoDB (described in https://jira.mongodb.org/browse/SERVER-13486). Your query hit this issue because “geoNear” returns a single document (containing an array of result documents), and the “allowDiskUse” aggregation pipeline option unfortunately does not help in this case.

There are two options that could be considered:

- If you don’t need all the results, you could limit the “geoNear” aggregation result size using the num, limit, or maxDistance options.
- If you require all of the results, you can use the find() method, which is not limited to the BSON maximum size since it returns a cursor.

Below is a test I did on MongoDB 3.2.10, for your information.

Create a “2dsphere” index on the designated collection:

 db.coll.createIndex({location: '2dsphere'})

Create and insert several big documents:

 // Build a padding string of about 15 MB (15 * 128K * 8 bytes).
 var padding = ''; for (var j = 0; j < 15; j++) { for (var i = 1024*128; i > 0; --i) { padding = padding + '12345678'; } }

 db.coll.insert({location:{type:"Point", coordinates:[-73.861, 40.73]}, padding:padding})
 db.coll.insert({location:{type:"Point", coordinates:[-73.862, 40.73]}, padding:padding})
 db.coll.insert({location:{type:"Point", coordinates:[-73.863, 40.73]}, padding:padding})
 db.coll.insert({location:{type:"Point", coordinates:[-73.864, 40.73]}, padding:padding})
 db.coll.insert({location:{type:"Point", coordinates:[-73.865, 40.73]}, padding:padding})
 db.coll.insert({location:{type:"Point", coordinates:[-73.866, 40.73]}, padding:padding})

Query using “geoNear”; the server log shows “Too many geoNear results …, truncating output”:

 db.coll.aggregate(
     [
         {
             $geoNear:{
                 near:{type:"Point", coordinates:[-73.86, 40.73]},
                 distanceField:"dist.calculated",
                 maxDistance:150000000,
                 spherical:true
             }
         },
         {$project: {location:1}}
     ]
 )

Query using “find”; all expected documents are returned:

 // This and the following "var" are necessary to keep the screen from being flooded by the padding string.
 var cursor = db.coll.find (
     {
         location: {
             $near: {
                 $geometry:{type:"Point", coordinates:[-73.86, 40.73]},
                 $maxDistance:150000,
             }
         }
     }
 )

 // It is necessary to iterate through the cursor. Otherwise, the query is not actually executed.
 var x = cursor.next()
 x._id
 var x = cursor.next()
 x._id
 ... 

Regards, Lungang
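
Applied to the query from the question, the find() approach could look roughly like the sketch below. It is not taken from the answer above: the helper names buildNearFilter, buildWithinFilter and countAll are hypothetical, the coordinates are placeholders for the elided values, and the Earth radius of 6378100 m is an assumption used to convert metres to radians. Note also that, as far as I know, $near and $nearSphere in find() are only supported on sharded collections from MongoDB 4.0 onward, so on a 3.2 sharded cluster the $geoWithin variant may be the one that works. It returns the matching documents unsorted, and neither variant returns a calculated distance.

```javascript
// Assumed equatorial Earth radius in metres, used to convert a distance
// in metres to the radians that $centerSphere expects.
var EARTH_RADIUS_METERS = 6378100;

// Hypothetical helper: filter for find() that matches points within
// maxDistanceMeters of [lng, lat], sorted nearest first.
function buildNearFilter(field, lng, lat, maxDistanceMeters) {
  var filter = {};
  filter[field] = {
    $nearSphere: {
      $geometry: { type: "Point", coordinates: [lng, lat] },
      $maxDistance: maxDistanceMeters
    }
  };
  return filter;
}

// Hypothetical helper: unsorted alternative using $geoWithin/$centerSphere.
function buildWithinFilter(field, lng, lat, maxDistanceMeters) {
  var filter = {};
  filter[field] = {
    $geoWithin: {
      $centerSphere: [[lng, lat], maxDistanceMeters / EARTH_RADIUS_METERS]
    }
  };
  return filter;
}

// Hypothetical helper: walks any cursor exposing hasNext()/next(), as the
// mongo shell cursor does, and counts documents without keeping them in memory.
function countAll(cursor) {
  var n = 0;
  while (cursor.hasNext()) {
    cursor.next();
    n++;
  }
  return n;
}

// Placeholder coordinates; the question elides the real values.
var nearFilter = buildNearFilter("location", 10.0, 52.0, 3900);
var withinFilter = buildWithinFilter("location", 10.0, 52.0, 3900);

// In the mongo shell, where `db` is defined, this could be used as:
//   countAll(db.data.find(withinFilter, { _id: 0, location: 1 }))
```

Because find() returns a cursor, the results arrive in batches and no single reply has to hold all 1.1 million documents. With the Java driver the same filter document can be passed to find() and the returned cursor iterated in the same way.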



Source: https://stackoverflow.com/questions/39956171/mongodb-error-too-many-results-for-query-truncating-output-with-geonear
