Multiple limit condition in MongoDB

清酒与你 2021-01-27 10:47

I have a collection in which one of the fields is "type". I want to get some values of each type, depending on a condition that is the same for all the types. Like I want 2 documen

2 Answers
  • 2021-01-27 11:19

    You cannot do this directly in a single query using only the type field. However, there is (as always) a way to accomplish it.

    To find documents of different types, you would need an additional value that spreads the types out, on average, in the proportion you want returned. For example:

    db.users.insert({type: 'A', index: 1})
    db.users.insert({type: 'B', index: 2})
    db.users.insert({type: 'A', index: 3})
    db.users.insert({type: 'B', index: 4})
    db.users.insert({type: 'A', index: 5})
    db.users.insert({type: 'B', index: 6})
    

    Then querying for items with db.users.find({index: {$gt: 2, $lt: 7}}) returns the right distribution of items.
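
    As a rough illustration of why the range works, the following plain-Ruby sketch replays the same filter on an in-memory copy of the six documents. It does not talk to MongoDB; the docs array and the counts variable are stand-ins introduced here, not part of the answer.

```ruby
# In-memory stand-in for the six documents inserted above (no MongoDB needed).
docs = [
  { type: "A", index: 1 },
  { type: "B", index: 2 },
  { type: "A", index: 3 },
  { type: "B", index: 4 },
  { type: "A", index: 5 },
  { type: "B", index: 6 }
]

# Same condition as the filter { index: { $gt: 2, $lt: 7 } }.
selected = docs.select { |d| d[:index] > 2 && d[:index] < 7 }

# Count how many documents of each type fall inside the range.
counts = selected.group_by { |d| d[:type] }.transform_values(&:size)

p counts
```

    The range yields two documents of each type only because the index values alternate strictly between the types; a different ordering of index values could change the split.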

    Though I'm not sure this is what you were looking for.

  • 2021-01-27 11:36

    What you are describing is a fairly common question in the MongoDB community, which we could call the "top n results problem": given some input that is likely sorted in some way, how do you get the top n results without relying on arbitrary index values in the data?

    MongoDB has the $first operator, available in the aggregation framework, which deals with the "top 1" part of the problem: it takes the "first" item found on a grouping boundary, such as your "type". Getting more than one result is a little more involved. There are some JIRA issues about modifying other operators to deal with n results, or to "restrict" or "slice", notably SERVER-6074. But the problem can be handled in a few ways.

    Popular implementations of the Rails Active Record pattern for MongoDB storage are Mongoid and MongoMapper. Both allow access to the "native" MongoDB collection functions via a .collection accessor. This is what you need in order to use native methods such as .aggregate(), which supports more functionality than general Active Record aggregation.

    Here is an aggregation approach with Mongoid; the general code does not change once you have access to the native collection object:

    require "mongoid"
    require "pp"
    
    Mongoid.configure.connect_to("test")
    
    class Item
      include Mongoid::Document
      store_in collection: "item"
    
      field :type, type: String
      field :pos, type: String
    end
    
    Item.collection.drop
    
    Item.collection.insert( :type => "A", :pos => "First" )
    Item.collection.insert( :type => "A", :pos => "Second" )
    Item.collection.insert( :type => "A", :pos => "Third" )
    Item.collection.insert( :type => "A", :pos => "Fourth" )
    Item.collection.insert( :type => "B", :pos => "First" )
    Item.collection.insert( :type => "B", :pos => "Second" )
    Item.collection.insert( :type => "B", :pos => "Third" )
    Item.collection.insert( :type => "B", :pos => "Fourth" )
    
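    # Native aggregation pipeline returning the first two documents per type.
    res = Item.collection.aggregate([
      # Build a "stack" of documents per type and capture the first one.
      { "$group" => {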
          "_id" => "$type",
          "docs" => {
            "$push" => {
              "pos" => "$pos", "type" => "$type"
            }
          },
          "one" => {
            "$first" => {
              "pos" => "$pos", "type" => "$type"
            }
          }
      }},
      # Unwind the stack, then flag the entry already captured as "one".
      { "$unwind" => "$docs" },
      { "$project" => {
        "docs" => {
          "pos" => "$docs.pos",
          "type" => "$docs.type",
          "seen" => {
            "$eq" => [ "$one", "$docs" ]
          },
        },
        "one" => 1
      }},
      # Keep only the entries not yet seen.
      { "$match" => {
        "docs.seen" => false
      }},
      # Regroup per type: the first remaining entry becomes "two".
      { "$group" => {
        "_id" => "$_id",
        "one" => { "$first" => "$one" },
        "two" => {
          "$first" => {
            "pos" => "$docs.pos",
            "type" => "$docs.type"
          }
        },
        "splitter" => {
          "$first" => {
            "$literal" => ["one","two"]
          }
        }
      }},
      # Emit one row per slot ("one", "two"), restoring the original document shape.
      { "$unwind" => "$splitter" },
      { "$project" => {
        "_id" => 0,
        "type" => {
          "$cond" => [
            { "$eq" => [ "$splitter", "one" ] },
            "$one.type",
            "$two.type"
          ]
        },
        "pos" => {
          "$cond" => [
            { "$eq" => [ "$splitter", "one" ] },
            "$one.pos",
            "$two.pos"
          ]
        }
      }}
    ])
    
    pp res
    

    The values in the documents are not used by the code; the "First", "Second", etc. labels in the data are only there to show that you are indeed getting the "top 2" documents from the listing.

    So the approach here is essentially to create a "stack" of the documents grouped by your key, such as "type". The first step is to take the "first" document from that stack using the $first operator.

    The subsequent steps flag the "seen" elements in the stack and filter them out; then you take the "next" document off the stack, again using the $first operator. The final steps are really just there to return the documents in their original form as found in the input, which is generally what is expected from such a query.

    So the result is, of course, just the top 2 documents for each type:

    { "type"=>"A", "pos"=>"First" }
    { "type"=>"A", "pos"=>"Second" }
    { "type"=>"B", "pos"=>"First" }
    { "type"=>"B", "pos"=>"Second" }
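
    The steps described above can be mimicked in plain Ruby on an in-memory list, which may make the "stack" idea easier to follow. This is only a sketch of the logic, not of how the server executes the pipeline; the items array and the top_two variable are stand-ins introduced here, and it assumes every type has at least two documents.

```ruby
# In-memory stand-in for the "item" collection, in insertion order.
items = [
  { type: "A", pos: "First" },
  { type: "A", pos: "Second" },
  { type: "A", pos: "Third" },
  { type: "B", pos: "First" },
  { type: "B", pos: "Second" },
  { type: "B", pos: "Third" }
]

top_two = items.group_by { |d| d[:type] }.flat_map do |_type, stack|
  one  = stack.first                    # first $group: $first off the stack
  rest = stack.reject { |d| d == one }  # $project "seen" flag + $match on seen == false
  two  = rest.first                     # second $group: $first again
  [one, two]                            # "splitter" $unwind + final $project
end

top_two.each { |d| p d }
```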
    

    There is a longer discussion and version of this, as well as other solutions, in this recent answer:

    Mongodb aggregation $group, restrict length of array

    It is essentially the same problem despite the title; that case was looking to match up to 10 top entries or more. There is some pipeline generation code there for dealing with larger matches, as well as some alternate approaches that may be worth considering depending on your data.
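
    One alternate approach on newer servers might be the $slice aggregation expression, which requires MongoDB 3.2 or later and is not what the pipeline above uses. The sketch below only builds the pipeline as plain Ruby data; top_n_pipeline is a hypothetical helper name, and the order within each group is whatever order the documents reach $group in, so a $sort stage would normally come first.

```ruby
require "pp"

# Hypothetical helper (not from the answer): builds a "top n per group"
# pipeline using the $slice expression, which assumes MongoDB 3.2 or later.
def top_n_pipeline(group_field, n)
  [
    # Collect every document of a group into an array, in input order.
    { "$group" => {
      "_id"  => "$" + group_field.to_s,
      "docs" => { "$push" => "$$ROOT" }
    } },
    # Keep only the first n array entries per group.
    { "$project" => { "docs" => { "$slice" => ["$docs", n] } } },
    # Turn each remaining array entry back into its own result row.
    { "$unwind" => "$docs" }
  ]
end

pipeline = top_n_pipeline(:type, 2)
pp pipeline
```

    With the Item model above, this would presumably be run as Item.collection.aggregate(top_n_pipeline(:type, 2)), with each result row carrying the original document under its "docs" key.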
