Matching nonempty lines with pyparsing

攒了一身酷 2021-01-19 07:21

I am trying to make a small application which uses pyparsing to extract data from files produced by another program.

These files have the following format.

2 Answers
  • 2021-01-19 07:43

    This takes you most of the way there:

    import pyparsing as pp
    
    data = """
    SOME_KEYWORD:
    line 1
    line 2
    line 3
    line 4
    
    ANOTHER_KEYWORD:
    line a
    line b
    line c
    """
    
    some_kw = pp.Keyword('SOME_KEYWORD:').suppress()
    another_kw = pp.Keyword('ANOTHER_KEYWORD:').suppress()
    kw = pp.Optional(some_kw ^ another_kw)
    
    # Hint from: http://pyparsing.wikispaces.com/message/view/home/21931601
    lines = kw + pp.SkipTo(
        pp.LineEnd() + pp.OneOrMore(pp.LineEnd()) |
        pp.LineEnd() + pp.StringEnd() |
        pp.StringEnd()
    )
    
    result = lines.searchString(data.strip())
    results_list = result.asList()
    # => [['\nline 1\nline 2\nline 3\nline 4'], ['\nline a\nline b\nline c']]
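
    Each match above is still a single string with embedded newlines. As a rough sketch of the remaining step, the strings can be split into individual lines with plain string methods. The name `blocks` is only illustrative, and `results_list` is copied here from the output shown above so that the snippet runs on its own:

```python
# Output of the searchString() call above, copied as a literal.
results_list = [['\nline 1\nline 2\nline 3\nline 4'], ['\nline a\nline b\nline c']]

# Each match is a one-element list holding a newline-joined string:
# strip the leading newline, then split into separate lines.
blocks = [match[0].strip().splitlines() for match in results_list]

print(blocks)
# [['line 1', 'line 2', 'line 3', 'line 4'], ['line a', 'line b', 'line c']]
```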
    

    When building a grammar it really helps to assign parts to variables and reference those when you can.
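
    A minimal sketch of that advice, assuming keyword headers made of letters and underscores followed by a colon. The names `colon`, `keyword_name`, `header` and `two_headers` are made up for illustration and are not part of the grammar above:

```python
import pyparsing as pp

# Small named pieces (illustrative names, not taken from the answer above).
colon = pp.Suppress(":")
keyword_name = pp.Word(pp.alphas + "_")
header = keyword_name + colon  # matches e.g. "SOME_KEYWORD:"

# A named piece can be tried out on its own ...
header_tokens = header.parseString("SOME_KEYWORD:").asList()
print(header_tokens)  # ['SOME_KEYWORD']

# ... and reused inside a larger expression without repeating its definition.
two_headers = pp.Group(header) + pp.Group(header)
pair_tokens = two_headers.parseString("SOME_KEYWORD: ANOTHER_KEYWORD:").asList()
print(pair_tokens)  # [['SOME_KEYWORD'], ['ANOTHER_KEYWORD']]
```

    Because `header` is defined once, a change to the header format only has to be made in one place.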

  • 2021-01-19 07:45

    My take on it:

        from pyparsing import *
    
        # matches and removes end of line
        EOL = LineEnd().suppress()
    
        # a line: begins at a line start, runs up to EOL, and fails on blank lines
        line = LineStart() + SkipTo(LineEnd(), failOn=LineStart()+LineEnd()) + EOL
    
        lines = OneOrMore(line)
    
        # Group keeps each block's lines together in their own sublist; it is optional
        parser = Keyword("SOME_KEYWORD:") + EOL + Group(lines) + Keyword("ANOTHER_KEYWORD:") + EOL + Group(lines)
        result = parser.parseFile('data.txt')
        print(result)
    

    The result is:

    ['SOME_KEYWORD:', ['line 1', 'line 2', 'line 3', 'line 4'], 'ANOTHER_KEYWORD:', ['line a', 'line b', 'line c']]
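
    If a lookup by keyword is more convenient than a flat list, the result can be reshaped afterwards. This is only a sketch: the list literal is copied from the output above, and the name `sections` is illustrative:

```python
# The flat result shown above, copied as a plain Python literal.
result = ['SOME_KEYWORD:', ['line 1', 'line 2', 'line 3', 'line 4'],
          'ANOTHER_KEYWORD:', ['line a', 'line b', 'line c']]

# Keywords sit at even positions and their line groups at odd positions,
# so pair them up and drop the trailing colon from each keyword.
sections = {
    keyword.rstrip(':'): lines
    for keyword, lines in zip(result[0::2], result[1::2])
}

print(sections['SOME_KEYWORD'])     # ['line 1', 'line 2', 'line 3', 'line 4']
print(sections['ANOTHER_KEYWORD'])  # ['line a', 'line b', 'line c']
```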
    