Tips for reading in a complex file - Python

若如初见. Submitted on 2019-11-28 02:29:59
ivan_pozdeev

It's a typical task for a syntactic analyzer. In this case, since

  • lexical constructs do not cross line boundaries, and each line holds exactly one construct (a "statement")
  • full syntax for a single line can be covered by a set of regexes
  • the structure of compounds (=entities connecting multiple "statements" into something bigger) is simple and straightforward

a (relatively) simple scannerless parser based on lines, a DFA (deterministic finite automaton) and the aforementioned set of regexes can be applied:

  • set up the initial parser state (=current position relative to various entities to be tracked) and the parse tree (=data structure representing the information from the file in a convenient way)
  • for each line
    • classify it, e.g. by matching against the regexes applicable to the current state
    • use the matched regex's groups to extract the meaningful parts of the line's statement
    • using these parts, update the state and the parse tree
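
The steps above can be sketched roughly as follows. This is only a minimal illustration, not the code from the linked answer: the file format (named `section NAME { ... }` blocks containing `key = value` lines), the state names `top` and `in_section`, and the `parse` function are all assumptions made up for the example. The parse tree here is simply a dict of dicts.

```python
import re

# Hypothetical line grammar: one regex per kind of statement.
SECTION_OPEN = re.compile(r"^\s*section\s+(?P<name>\w+)\s*\{\s*$")
SECTION_CLOSE = re.compile(r"^\s*\}\s*$")
ASSIGNMENT = re.compile(r"^\s*(?P<key>\w+)\s*=\s*(?P<value>.*?)\s*$")
BLANK = re.compile(r"^\s*$")

# The DFA: for each state, the statements (regexes) that are legal there.
RULES = {
    "top": [(BLANK, "blank"), (SECTION_OPEN, "open")],
    "in_section": [
        (BLANK, "blank"),
        (SECTION_CLOSE, "close"),
        (ASSIGNMENT, "assign"),
    ],
}


def parse(lines):
    """Parse an iterable of lines into {section_name: {key: value}}."""
    state = "top"    # initial parser state
    tree = {}        # the parse tree being built
    current = None   # dict of the section currently being filled

    for lineno, line in enumerate(lines, start=1):
        # Classify the line using only the regexes valid in this state.
        for regex, kind in RULES[state]:
            match = regex.match(line)
            if match:
                break
        else:
            raise ValueError(
                f"line {lineno}: unexpected statement in state {state!r}: {line!r}"
            )

        # Use the matched groups to update the state and the parse tree.
        if kind == "open":
            current = tree.setdefault(match["name"], {})
            state = "in_section"
        elif kind == "assign":
            current[match["key"]] = match["value"]
        elif kind == "close":
            current = None
            state = "top"
        # "blank" lines change nothing.

    if state != "top":
        raise ValueError("unexpected end of input inside a section")
    return tree
```

Calling `parse(text.splitlines())` (or passing an open file object) returns the nested dict. Keeping the per-state regex lists in a table means that adding a new kind of statement is mostly a matter of adding a regex and a branch that handles it.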

See "get the path in a file inside {} by python" for an example. There, I do not construct a parse tree (it wasn't needed) and only track the current state.
