Building Nested dictionary in Python reading in line by line from file

后端未结

关注

 2  1829

The way I go about nested dictionary is this:

dicty = dict()
tmp = dict()
tmp[\"a\"] = 1
tmp[\"b\"] = 2
dicty[\"A\"] = tmp

dicty == {\"A\" : {\"a\" : 1, \"b


                      
              相关标签:


      
      
        
          2条回答        

        
                         				            
            
           
            
                              
                
              
              
                
                  天涯浪人        
                
              
                            
                2021-01-16 12:03
              
            
            
                                                                       
While this isn't the ideal way to do things, you're pretty close to making it work.

Your main problem is that you're reusing the same tmp dictionary. After you insert it into dicty under the first key, you then clear it and start filling it with the new values. Replace tmp.clear() with tmp = {} to fix that, so you have a different dictionary for each key, instead of the same one for all keys.

Your second problem is that you're never storing the last tmp value in the dictionary when you reach the end, so add another dicty[oldword] = tmp after the for loop.

Your third problem is that you're checking if oldword is not "":. That may be true even if it's an empty string, because you're comparing identity, not equality. Just change that to if oldword:. (This one, you'll usually get away with, because small strings are usually interned and will usually share identity… but you shouldn't count on that.)

If you fix both of those, you get this:

{'FrontPage': {'frontpage': '0.710145', 'troubleshooting': '0.971014'},
 'proA': {'macbook': '0.666667', 'smart': '0.666667', 'ssd': '0.666667'}}


I'm not sure how to turn this into the format you claim to want, because that format isn't even a valid dictionary. But hopefully this gets you close.



There are two simpler ways to do it:


Group the values with, e.g., itertools.groupby, then transform each group into a dict and insert it all in one step. This, like your existing code, requires that the input already be batched by values[0].
Use the dictionary as a dictionary. You can look up each key as it comes in and add to the value if found, create a new one if not. A defaultdict or the setdefault method will make this concise, but even if you don't know about those, it's pretty simple to write it out explicitly, and it'll still be less verbose than what you have now.


The second version is already explained very nicely in Martijn Pieters's answer.

The first can be written like this:

def doubleDict(s):
    with open(filename, "r") as f:
        rows = (line.rstrip().split(" ") for line in f)
        return {k: {values[1]: values[2] for values in g}
                for k, g in itertools.groupby(rows, key=operator.itemgetter(0))}


Of course that doesn't print out the dict so far after every 25 rows, but that's easy to add by turning the comprehension into an explicit loop (and ideally using enumerate instead of keeping an explicit row counter).
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  野趣味        
                
              
                            
                2021-01-16 12:07
              
            
            
                                                                       
Use a collections.defaultdict() object to auto-instantiate nested dictionaries:

from collections import defaultdict

def doubleDict(filename):
    dicty = defaultdict(dict)
    with open(filename, "r") as f:
        for i, line in enumerate(f):
            outer, inner, value = line.split()
            dicty[outer][inner] = value
            if i % 25 == 0:
                print(dicty)
                break #print(row)
    return(dicty)


I used enumerate() to generate the line count here; much simpler than keeping a separate counter going.

Even without a defaultdict, you can let the outer dictionary keep the reference to the nested dictionary, and retrieve it again by using values[0]; there is no need to keep the temp reference around:

>>> dicty = {}
>>> dicty['A'] = {}
>>> dicty['A']['a'] = 1
>>> dicty['A']['b'] = 2
>>> dicty
{'A': {'a': 1, 'b': 1}}


All the defaultdict then does is keep us from having to test if we already created that nested dictionary. Instead of:

if outer not in dicty:
    dicty[outer] = {}
dicty[outer][inner] = value


we simply omit the if test as defaultdict will create a new dictionary for us if the key was not yet present.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
                             
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复