Theory, examples of reversible parsers?

滥情空心 2021-01-13 22:50

Does anyone out there know of examples of, and the theory behind, parsers that take (maybe) an abstract syntax tree and produce code, instead of vice versa? Mathematicall

11 answers
  • 2021-01-13 23:11

    There is theory, along with working implementations and examples, for reversible parsing in Haskell. The library is by Paweł Nowak. Please refer to https://hackage.haskell.org/package/syntax as your starting point. You can find the examples at the following URLs.

    • https://hackage.haskell.org/package/syntax-example
    • https://hackage.haskell.org/package/syntax-example-json
  • 2021-01-13 23:12

    In addition to 'Visitor', 'unparser' is another good keyword to web-search for.

  • 2021-01-13 23:13

    Actually, generating code from a parse tree is strictly easier than parsing code, at least in a mathematical sense. Many grammars are ambiguous, that is, some strings have more than one parse tree, but a parse tree can always be converted to a string in a unique way, modulo whitespace.
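
To make the asymmetry concrete, here is a minimal sketch in Python, assuming a toy ambiguous grammar `E -> E '-' E | num`. The names `Num`, `Sub`, and `unparse` are made up for illustration and are not taken from any library. Two different parse trees of the same input both unparse to the same string, so tree-to-text is a function even though text-to-tree is not.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Num:
    value: int


@dataclass(frozen=True)
class Sub:
    left: "Expr"
    right: "Expr"


Expr = Union[Num, Sub]


def unparse(tree: Expr) -> str:
    """Emit the leaves of a parse tree from left to right."""
    if isinstance(tree, Num):
        return str(tree.value)
    return f"{unparse(tree.left)} - {unparse(tree.right)}"


# Both trees are valid parses of "1 - 2 - 3" under the ambiguous grammar.
left_assoc = Sub(Sub(Num(1), Num(2)), Num(3))
right_assoc = Sub(Num(1), Sub(Num(2), Num(3)))

print(unparse(left_assoc))
print(unparse(right_assoc))
```

Note that this holds for a concrete parse tree. An abstract syntax tree that has dropped parentheses needs a printer that puts them back where precedence requires them.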

    The Dragon book gives a good description of the theory of parsers.

  • 2021-01-13 23:14

    Such a thing is called a Visitor. It traverses the tree and does whatever has to be done, for example optimizing or generating code.
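
As a rough illustration of the idea, the following Python sketch shows a visitor that walks a small expression AST and emits source text. The node classes, the `CodeGenerator` name, and the precedence table are all assumptions invented for this example. Parentheses are inserted only where the tree shape would otherwise be lost.

```python
from dataclasses import dataclass


@dataclass
class Num:
    value: int

    def accept(self, visitor):
        return visitor.visit_num(self)


@dataclass
class BinOp:
    op: str
    left: object
    right: object

    def accept(self, visitor):
        return visitor.visit_binop(self)


class CodeGenerator:
    """Visitor that turns an expression tree back into text."""

    # Hypothetical precedence table: higher binds tighter.
    PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}

    def visit_num(self, node):
        return str(node.value)

    def visit_binop(self, node):
        left = self._child(node.left, node, is_right=False)
        right = self._child(node.right, node, is_right=True)
        return f"{left} {node.op} {right}"

    def _child(self, child, parent, is_right):
        text = child.accept(self)
        if isinstance(child, BinOp):
            child_prec = self.PRECEDENCE[child.op]
            parent_prec = self.PRECEDENCE[parent.op]
            # Parenthesize a looser-binding child, and (conservatively) any
            # equal-precedence child on the right of a left-associative operator.
            if child_prec < parent_prec or (child_prec == parent_prec and is_right):
                return f"({text})"
        return text


tree = BinOp("*", BinOp("+", Num(1), Num(2)), Num(3))
print(tree.accept(CodeGenerator()))
```

The same tree could be handed to a different visitor (an optimizer or an evaluator, say) without changing the node classes, which is the main appeal of the pattern.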

  • 2021-01-13 23:16

    Our DMS Software Reengineering Toolkit insists on parsers and parser-inverses (called "prettyprinters") as "poker-ante" to mechanical processing (analyzing/transforming) of arbitrary languages. These provide a full round trip: source text to ASTs with captured position information (file/line/column) and comments, and AST to legal source text, with options either to regenerate the original token positions ("fidelity printing") or to produce nicely formatted output ("prettyprinting"), including regeneration of the comments.

    Parsers are often specified by a combination of grammars and lexical definitions of tokens; these notations are typically compiled into efficient parsing engines, and DMS does that for the "parser" side, as you might expect. Other folks here suggest that a "visitor" is the way to do prettyprinting, and, like assembly code, it is the right way to implement prettyprinting at the lowest level of abstraction. However, DMS prettyprinters are specified in terms of a text-box construction language over grammar terms, something like LaTeX, that enables one to control the placement of the various language elements horizontally, vertically, embedded, spaced, concatenated, laminated, etc. DMS compiles these into efficient low-level visitors (as other answers suggest) that implement the box generation. But as with the parser generator, you don't have to see all the ugly detail.
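
To give a feel for what box-based layout means in general, here is a small Python sketch. It is not DMS's notation: the combinators `text`, `hbox`, `vbox`, and `indent` are hypothetical names chosen for this example, and a box is simply a non-empty list of text lines.

```python
def text(s):
    """A one-line box."""
    return [s]


def vbox(*boxes):
    """Stack boxes vertically."""
    return [line for box in boxes for line in box]


def indent(n, box):
    """Shift a box n columns to the right."""
    return [" " * n + line for line in box]


def hbox(*boxes, sep=" "):
    """Place non-empty boxes side by side, aligned at the top."""
    height = max(len(box) for box in boxes)
    columns = []
    for box in boxes:
        width = max(len(line) for line in box)
        padded = [line.ljust(width) for line in box]
        padded += [" " * width] * (height - len(box))
        columns.append(padded)
    return [sep.join(row).rstrip() for row in zip(*columns)]


# Layout for an if-statement: header on one line, body indented below it.
stmt = vbox(
    hbox(text("if"), text("(x > 0)"), text("{")),
    indent(4, vbox(text("y = x;"), text("x = 0;"))),
    text("}"),
)
rendered = "\n".join(stmt)
print(rendered)
```

A prettyprinter built this way attaches one such box expression to each grammar rule, and the layout of a whole program falls out of composing the boxes of its subtrees.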

    DMS has some 30+ sets of these language front ends for various programming languages and formal notations, ranging from C++, C, Java, C#, COBOL, etc. to HTML, XML, assembly languages for some machines, temporal property specifications, specs for composable abstract algebras, etc.

  • 2021-01-13 23:17

    I don't know where to find much about the theory, but boost::spirit 2.0 has both qi (parser) and karma (generator), sharing the same underlying structure and grammar, so it's a practical implementation of the concept.
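
The idea of one description driving both directions can be sketched in a few lines of Python. This is not Spirit's API: `Lit`, `Field`, `parse`, `generate`, and the `ASSIGNMENT` grammar are hypothetical names invented for this illustration, and the "grammar" is just a flat sequence of literals and regex-matched fields.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Lit:
    text: str  # fixed text that must appear verbatim


@dataclass(frozen=True)
class Field:
    name: str
    pattern: str  # regex that the field's text must match


def parse(grammar, source):
    """Text -> dict of field values, following the grammar left to right."""
    values, pos = {}, 0
    for item in grammar:
        if isinstance(item, Lit):
            if not source.startswith(item.text, pos):
                raise ValueError(f"expected {item.text!r} at offset {pos}")
            pos += len(item.text)
        else:
            match = re.compile(item.pattern).match(source, pos)
            if match is None:
                raise ValueError(f"expected {item.name} at offset {pos}")
            values[item.name] = match.group()
            pos = match.end()
    if pos != len(source):
        raise ValueError(f"unexpected trailing input at offset {pos}")
    return values


def generate(grammar, values):
    """Dict of field values -> text, using the same grammar."""
    parts = []
    for item in grammar:
        if isinstance(item, Lit):
            parts.append(item.text)
        else:
            value = values[item.name]
            if re.fullmatch(item.pattern, value) is None:
                raise ValueError(f"invalid value for {item.name}: {value!r}")
            parts.append(value)
    return "".join(parts)


# One description, used in both directions.
ASSIGNMENT = [Field("name", r"[A-Za-z_]\w*"), Lit(" = "), Field("value", r"\d+"), Lit(";")]

fields = parse(ASSIGNMENT, "x = 42;")
print(fields)
print(generate(ASSIGNMENT, fields))
```

Real libraries in this space handle alternation, repetition, and typed attributes, which this sketch leaves out entirely.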

    Documentation on the generator side is still pretty thin (spirit2 was new in Boost 1.38, and is still in beta), but there are a few bits of karma sample code around, and AFAIK the library's in a working state and there are at least some examples available.
