Reading data from a specially formatted text file

日久生厌 2021-01-15 12:46

I am using this method, kindly suggested by Ashwini Chaudhary, to assign data to a dictionary from a text file that is in a specific format.

keys = map(str.s         


        
2 Answers
  • 2021-01-15 13:00
    ss = '''LineHere  w    x    y    z
    Key       a 1  b 2  c 3  d 4
    OrHere    00   01   10   11
    Word      as   box  cow  dig
    '''
    import re
    
    rgx = re.compile('Key +(.*)\r?\n'
                     '(?:.*\r?\n)?'
                     '(?:Word|Letter) +(.*)\r?\n')
    
    mat = rgx.search(ss)
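    # Columns are separated by a tab or by a run of two or more spaces;
    # a single space belongs to a field such as 'a 1'.
    keys = re.split(r'\t| {2,}', mat.group(1).strip())
    words = re.split(r'\t| {2,}', mat.group(2).strip())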
    

    You'll obtain ss by reading your file:

    with open(filename) as f:
        ss = f.read()
    

    Edit

    Well, if all the lines have data separated with tabs, you can do:

    ss = '''LineHere  w\tx\ty\tz
    Key       a 1\tb 2\tc 3\td 4
    OrHere    00\t01\t10\t11
    Word      as\tbox\tcow\tdig
    '''
    import re
    
    rgx = re.compile('Key +(.*)\r?\n'
                     '(?:.*\r?\n)?'
                     '(?:Word|Letter) +(.*)\r?\n')
    
    print(dict(zip(*map(lambda x: x.split('\t'),
                        rgx.search(ss).groups()))))
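
The one-liner above packs three steps into a single expression: capture the two lines, split each on tabs, and pair the results. A step-by-step sketch of the same idea in Python 3 syntax, using the same sample data and pattern, might look like this. The intermediate names (`key_line`, `word_line`, `result`) are made up for illustration.

```python
import re

# Same tab-separated sample as above; the \t escapes become real tabs.
ss = ('LineHere  w\tx\ty\tz\n'
      'Key       a 1\tb 2\tc 3\td 4\n'
      'OrHere    00\t01\t10\t11\n'
      'Word      as\tbox\tcow\tdig\n')

rgx = re.compile(r'Key +(.*)\r?\n'
                 r'(?:.*\r?\n)?'
                 r'(?:Word|Letter) +(.*)\r?\n')

# Step 1: the two capture groups hold the raw Key and Word/Letter columns.
key_line, word_line = rgx.search(ss).groups()

# Step 2: split each captured line on tabs.
keys = key_line.split('\t')
words = word_line.split('\t')

# Step 3: pair the i-th key with the i-th word and build the dict.
result = dict(zip(keys, words))
print(result)
```

If the pattern does not match, `rgx.search(ss)` returns `None`, so real input probably deserves a check before calling `.groups()`.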
    
  • 2021-01-15 13:12

    Something like this:

    import re
    with open('abc') as f:
        for line in f:
            if line.startswith('Key'):
                keys = re.search(r'Key\s+(.*)',line).group(1).split("\t")
            elif line.startswith(('Word','Letter')):
                vals = re.search(r'(Word|Letter)\s+(.*)',line).group(2).split("\t")
    
        print(dict(zip(keys, vals)))
    

    abc (the fields after the row label are tab-separated):

    LineHere  w    x    y    z
    Key       a 1  b 2  c 3  d 4
    OrHere    00   01   10   11
    Word      as   box  cow  dig
    

    Output:

    {'d 4': 'dig', 'b 2': 'box', 'a 1': 'as', 'c 3': 'cow'}
    

    abc (the fields after the row label are tab-separated):

    LineHere  w    x    y    z
    Key       a 1  b 2  c 3  d 4
    OrHere    00   01   10   11
    Letter    A    B    C    D
    

    Output:

    {'d 4': 'D', 'b 2': 'B', 'a 1': 'A', 'c 3': 'C'}
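
Both answers depend on the columns being separated by tabs. If the file might instead use runs of spaces, as the sample appears when displayed, a more tolerant variant could split on either. The sketch below is only one possible approach: `parse_key_table` is a hypothetical helper name, and it assumes that a single space only ever occurs inside a field (as in `a 1`), never between fields.

```python
import re

def parse_key_table(lines):
    """Map the fields of the 'Key' row to those of the 'Word'/'Letter' row."""
    keys, vals = [], []
    for line in lines:
        # Separate the row label from the rest of the line.
        parts = line.strip().split(None, 1)
        if len(parts) < 2:
            continue  # blank line or a label with no data
        label, rest = parts
        # Assumption: columns are separated by a tab or by two or more
        # spaces; a single space is part of a field such as 'a 1'.
        fields = re.split(r'\t| {2,}', rest)
        if label == 'Key':
            keys = fields
        elif label in ('Word', 'Letter'):
            vals = fields
    return dict(zip(keys, vals))

sample = """LineHere  w    x    y    z
Key       a 1  b 2  c 3  d 4
OrHere    00   01   10   11
Letter    A    B    C    D
"""
print(parse_key_table(sample.splitlines()))
```

To use it on a file, pass the open file object directly, for example `with open('abc') as f: parse_key_table(f)`.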
    