How to create a dictionary from a line of text?

前端未结

关注

 4  2115

I have a generated file with thousands of lines like the following:

CODE,XXX,DATE,20101201,TIME,070400,CONDITION_CODES,LTXT,PRICE,999.0000,QUANTITY,100,TSN,151


                      
              相关标签:


      
      
        
          4条回答        

        
                         				            
            
           
            
                              
                
              
              
                
                  别那么骄傲        
                
              
                            
                2020-11-28 14:33
              
            
            
                                                                       
If we're going to abstract it into a function anyway, it's not too hard to write "from scratch":

def pairs(iterable):
    iterator = iter(iterable)
    while True:
        try: yield (iterator.next(), iterator.next())
        except: return


robert's recipe version definitely wins points for flexibility, though.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  轻奢々        
                
              
                            
                2020-11-28 14:36
              
            
            
                                                                       
import itertools

def grouper(n, iterable, fillvalue=None):
    "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
    args = [iter(iterable)] * n
    return itertools.izip_longest(fillvalue=fillvalue, *args)

record = dict(grouper(2, line.strip().split(","))


source
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  予麋鹿        
                
              
                            
                2020-11-28 14:37
              
            
            
                                                                       
Not so much better as just more efficient...

Full explanation
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  日久生厌        
                
              
                            
                2020-11-28 14:42
              
            
            
                                                                       
In Python 2 you could use izip in the itertools module and the magic of generator objects to write your own function to simplify the creation of pairs of values for the dict records. I got the idea for pairwise() from a similarly named (but functionally different) recipe in the Python 2 itertools docs.

To use the approach in Python 3, you can just use plain zip() since it does what izip() did in Python 2 resulting in the latter's removal from itertools — the example below addresses this and should work in both versions.

try:
    from itertools import izip
except ImportError:  # Python 3
    izip = zip

def pairwise(iterable):
    "s -> (s0,s1), (s2,s3), (s4, s5), ..."
    a = iter(iterable)
    return izip(a, a)


Which can be used like this in your file reading for loop:

from sys import argv

records = {}
for line in open(argv[1]):
    fields = (field.strip() for field in line.split(','))  # generator expr
    record = dict(pairwise(fields))
    records[record['TSN']] = record

print('Found %d records in the file.' % len(records))


But wait, there's more!

It's possible to create a generalized version I'll call grouper(), which again corresponds to a similarly named, but functionally different itertools recipe (which is listed right below pairwise()):

def grouper(n, iterable):
    "s -> (s0,s1,...sn-1), (sn,sn+1,...s2n-1), (s2n,s2n+1,...s3n-1), ..."
    return izip(*[iter(iterable)]*n)


Which could be used like this in your for loop:

    record = dict(grouper(2, fields))


Of course, for specific cases like this, it's easy to use functools.partial() and create a similar pairwise() function with it (which will work in both Python 2 & 3):

import functools
pairwise = functools.partial(grouper, 2)


Postscript

Unless there's a really huge number of fields, you could instead create a actual sequence out of the pairs of line items (rather than using a generator expression which has no len()):

fields = tuple(field.strip() for field in line.split(','))


The advantage being that it would allow the grouping to be done using simple slicing:

try:
    xrange
except NameError:  # Python 3
    xrange = range

def grouper(n, sequence):
    for i in xrange(0, len(sequence), n):
        yield sequence[i:i+n]

pairwise = functools.partial(grouper, 2)

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
                             
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复