Simple MongoDB query very slow although index is set


I've got a MongoDB collection that holds about 100M documents.

The documents basically look like this:

_id             : ObjectId("asd1234567890")
_re


                      
3 Answers

天涯浪人
2021-01-21 05:03

You don't have any index that MongoDB will automatically use for that query, so it's doing a full collection scan.

As mentioned in the docs:


  If the first key [of the index] is not present in the query, the index will only be used if hinted explicitly. 


Why

If you have an index on a,b and you search by a alone, the index will automatically be used. This is because a is the start of the index, so matching on it is fast, and the db can just ignore the rest of the index value.

An index on a,b is inefficient when searching by b alone, simply because the search can no longer be done as a "starts with thisfixedstring" lookup on the index.
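
To make the "starts with" intuition concrete, here is a minimal sketch in plain JavaScript. It is a toy model, not how MongoDB implements its indexes: a compound index on a,b is represented as an array of [a, b] pairs sorted by a and then b, and the names index, findByA and findByB are made up for illustration. A lookup by a can binary-search to the first match and stop at the first non-match, while a lookup by b alone has to visit every entry.

```javascript
// Toy model of a compound index on (a, b): entries sorted by a, then by b.
// This only illustrates the ordering argument; a real B-tree is more involved.
const index = [
  [1, 7], [1, 9], [2, 3], [2, 7], [3, 1], [3, 7], [4, 2],
];

// Lookup by the leading key: binary search for the first entry with a >= target,
// then walk forward while a still matches.
function findByA(entries, a) {
  let lo = 0;
  let hi = entries.length;
  let visited = 0; // number of entries inspected
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    visited++;
    if (entries[mid][0] < a) lo = mid + 1;
    else hi = mid;
  }
  const matches = [];
  while (lo < entries.length && entries[lo][0] === a) {
    matches.push(entries[lo]);
    visited++;
    lo++;
  }
  return { matches, visited };
}

// Lookup by the second key alone: matching entries are scattered through the
// sorted order, so every entry has to be checked.
function findByB(entries, b) {
  let visited = 0;
  const matches = entries.filter((entry) => {
    visited++;
    return entry[1] === b;
  });
  return { matches, visited };
}
```

On seven entries the difference is small, but the number of entries findByA inspects grows roughly with the logarithm of the index size plus the number of matches, while findByB always inspects all of them. On 100M documents that gap is what separates a fast query from a very slow one.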

So, either:


Include _reference_1_id in the query (probably irrelevant)
OR add an index on _reference_2_id (if you query by the field often)
OR use a hint


Hint

Probably your lowest-cost option right now.

Add a query hint to force the use of your _reference_1_id_1__reference_2_id_1_id_1 index. That is likely to be a lot faster than a full collection scan, but still a lot slower than an index that starts with the field you are using in the query.

For example:

db.mycoll
    .find({"_reference_2_id" : ObjectId("jkl7890123456")})
    .hint("_reference_1_id_1__reference_2_id_1_id_1");
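
To check whether the hint (or a newly added index) actually changes the plan, one option is to inspect the explain output. The helper below is only a sketch: leafStage is a hypothetical name, not a MongoDB API, and it assumes the explain shape used by MongoDB 3.0 and later, where queryPlanner.winningPlan nests child stages under inputStage. Some newer server versions wrap the plan in a queryPlan field, which the sketch also tries to handle; treat that field handling as an assumption to adapt to your version.

```javascript
// Walk an explain() result down to its innermost (leaf) stage.
// Assumed shape: { queryPlanner: { winningPlan: { stage, inputStage: {...} } } }.
function leafStage(explainOutput) {
  let plan = explainOutput.queryPlanner.winningPlan;
  // Assumption: some server versions nest the readable plan under `queryPlan`.
  if (plan.queryPlan) plan = plan.queryPlan;
  // Follow single-child stages (for example FETCH -> IXSCAN) to the bottom.
  while (plan.inputStage) plan = plan.inputStage;
  return { stage: plan.stage, indexName: plan.indexName };
}

// Usage in the mongo shell (commented out so the helper stays self-contained):
// const q = { "_reference_2_id": ObjectId("jkl7890123456") };
// leafStage(db.mycoll.find(q).explain());
// leafStage(db.mycoll.find(q).hint("_reference_1_id_1__reference_2_id_1_id_1").explain());
```

A leaf stage of COLLSCAN means a full collection scan; IXSCAN means an index was used, and indexName says which one. If you go for the second option instead and run db.mycoll.createIndex({ "_reference_2_id": 1 }), the unhinted query would be expected to report IXSCAN on the new index.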

                                                                        
                                                        
            
            
              
                
遥遥无期
2021-01-21 05:06

I would try adding a non-unique index on _reference_2_id. At the moment, I suspect you're doing the equivalent of a full table scan: even though the existing indexes contain _reference_2_id, they won't be used, because it is not their first key (see here).
                                                                        
                                                        
            
            
              
                
清酒与你
2021-01-21 05:06

Hi,
I have much the same problem on an equivalent amount of data. The documentation says that, for indexed queries to be fast, the indexes must fit in RAM. I think that is not the case here, so the query must be doing a lot of disk access, first to retrieve the index and then to get the value. In your case, a direct collection read may well be faster.

EV.
                                                                        
                                                        
            
            
              
                