MongoDB: Counting how many of each distinct values there are?

前端未结
关注
 1  1704
粉色の甜心 2020-12-11 22:45
I have a collection of documents holding a list of feedbacks for different items. It looks something like this:

      
      
        
          1条回答        

        
                    
            
            
                         
                
              
              
                
                   囚心锁ツ
                                             
                
                
                (楼主)
            
              
              
                2020-12-11 23:24
              

            
            
                        
You were very close, but of course $eq just returns a true/false value, so to make that numeric you need $cond:

db.collection(collectionName).aggregate([
  { "$group" : {
       "_id": "$item",
       "good_count": { 
           "$sum": { 
               "$cond": [ { "$eq": [ "$rating",  "good" ] }, 1, 0] 
           }
       },
       "neutral_count":{
           "$sum": { 
               "$cond": [ { "$eq": [ "$rating", "neutral" ] }, 1, 0 ]
            }
       },
       "bad_count": { 
           "$sum": { 
               "$cond": [ { "$eq": [ "$rating",  "bad" ] }, 1, 0 ]
           }
       }
  }}
])


As a "ternary" operator $cond takes a logical condition as it's first argument (if) and then returns the second argument where the evaluation is true (then) or the third argument where false (else). This makes true/false returns into 1 and 0 to feed to $sum respectively.

Also note that "case" is sensitive for $eq. If you have varing case then you likely want $toLower in the expressions:

               "$cond": [ { "$eq": [ { "$toLower": "$rating" },  "bad" ] }, 1, 0 ]




On a slightly different note, the following aggregation is usually more flexible to different possible values and runs rings around the conditional sums in terms of performance:

db.collection(collectionName).aggregate([
    { "$group": {
        "_id": { 
            "item": "$item",
            "rating": { "$toLower": "$rating" }
        },
        "count": { "$sum": 1 }
    }},
    { "$group": {
        "_id": "$_id.item",
        "results": {
            "$push": {
                "rating": "$_id.rating",
                "count": "$count"
            }
        }
    }}
])


That would instead give output like this:

{
    "_id": "item_1"
    "results":[
        { "rating": "good", "count": 12 },
        { "rating": "neutral", "count": 10 }
        { "rating": "bad", "count": 67 }
    ]
}


It's all the same information, but you did not have to explicitly match the values and it does execute much faster this way.
    
             
                                                        
            
            
              
                
                0
              
                   
                
               讨论(0)
              
                                                  
              
              
                          
             
       
          
              
                                    
                         
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
                              			
        
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复