Prolog - DCG parser with input from file

萌比男神i 2021-02-19 20:11

As part of a project, I need to write a parser that can read a file and parse it into facts I can use in my program.

The file structure looks as follows:

p         


        
4 answers
  •  一生所求
    2021-02-19 20:54

    IMHO, DCG grammar rules are quite ugly for tokenizing, and I really think DCG should never have been proposed for that task. DCG's real strength is parsing tokens, because Prolog works with symbols. So I would say the best option is a foreign call to a tokenizer written in, say, C, which unifies its result with a plain list of tokens, and then to let DCG do what it was designed for. The implementation is cleaner this way, and you don't have to worry about parsing carriage returns, blanks...

    Say you have a hypothetical language with a statement that looks as follows:

    object:
           object in a yields b,
           object in b yields C.
    

    I don't even want to imagine tokenizing this with DCG; I am too lazy to learn how to do it with a tool that was not designed for such a task. It would be better to have a foreign call to a predicate that gives me the plain list of tokens.

     tokenize(A,ListOfTokens), phrase(yourDCGstartRule(Information), ListOfTokens, _).
    

    For our running example, the list will simply look like this:

    ListOfTokens = [object,:,object,in,a,yields,b,',',object,in,b,yields,c].
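
    To make the token-level idea concrete, here is a minimal sketch of a DCG that consumes such a token list. The names statement//1, entries//1, entry//1 and the yields/2 fact shape are hypothetical, chosen only for illustration, and SWI-Prolog is assumed. The colon is written as (:) so that it reads as a plain atom even in stricter Prolog systems.

```prolog
% Sketch only: a DCG over TOKENS (atoms), not over characters.
% All predicate and functor names here are illustrative assumptions.

% statement(-Facts): the tokens 'object' ':' followed by one or more
% comma-separated entries.
statement(Facts) -->
    [object, (:)],
    entries(Facts).

% entries(-Facts): one entry, optionally followed by ',' and more entries.
entries([Fact|Facts]) -->
    entry(Fact),
    more_entries(Facts).

more_entries(Facts) --> [','], entries(Facts).
more_entries([])    --> [].

% entry(-Fact): "object in Place yields Result" becomes yields(Place, Result).
entry(yields(Place, Result)) -->
    [object, in, Place, yields, Result].
```

    Calling phrase(statement(Facts), ListOfTokens) on the token list above should bind Facts to [yields(a,b), yields(b,c)]; each rule lines up with one piece of the statement, and no rule ever looks at whitespace or line breaks.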
    

    I think this is much more elegant, and your rules map onto the tokens directly. I could be wrong in my thinking, but in the end it's a matter of taste, and to my taste DCG is not a tokenizer; I would never use it as one unless strictly required. Admittedly, I can see some applications where it would make sense to use it as a tokenizer as well, but I still think the two tasks should be separated.

    Please note that I am NOT saying that Prolog lacks good facilities. You can always do the tokenizing in Prolog, but you should separate the tasks and let DCG deal only with symbols and the few characters or strings that are strictly needed (such as uppercase strings, like proper names, or other characters).
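
    As a rough illustration of keeping the tokenizer as its own stage inside Prolog, the sketch below is a plain recursive predicate that stands in for the foreign tokenizer. It is not the C tokenizer suggested above. The names tokenize/2, tokenize_file/2, tokens/2 and word/3 are hypothetical, SWI-Prolog is assumed (for char_type/2, string_chars/2 and read_file_to_string/3), and the token rules are a guess: runs of letters and digits become one atom, whitespace is dropped, and every other character becomes its own token.

```prolog
% Sketch only: a separate tokenizing stage written without DCG.
% Assumes SWI-Prolog; all predicate names are illustrative.

% tokenize_file(+File, -Tokens): read the whole file, then tokenize it.
tokenize_file(File, Tokens) :-
    read_file_to_string(File, Text, []),
    tokenize(Text, Tokens).

% tokenize(+Text, -Tokens): Text is an atom or string.
tokenize(Text, Tokens) :-
    string_chars(Text, Chars),
    tokens(Chars, Tokens).

tokens([], []).
tokens([C|Cs], Tokens) :-
    (   char_type(C, space)            % skip blanks, tabs, newlines
    ->  tokens(Cs, Tokens)
    ;   char_type(C, alnum)            % start of a word or number
    ->  word(Cs, WordChars, Rest),
        atom_chars(Word, [C|WordChars]),
        Tokens = [Word|Ts],
        tokens(Rest, Ts)
    ;   Tokens = [C|Ts],               % punctuation: one-character token
        tokens(Cs, Ts)
    ).

% word(+Chars, -WordChars, -Rest): take the leading run of letters/digits.
word([C|Cs], [C|Ws], Rest) :-
    char_type(C, alnum),
    !,
    word(Cs, Ws, Rest).
word(Rest, [], Rest).
```

    With this in place, the pipeline keeps the shape shown earlier: tokenize_file(File, Tokens) followed by phrase/3 on whatever token-level start rule you define, so the grammar itself never deals with characters.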

    Finally, it seems to me that people may have forgotten that tokenizing and parsing are two separate tasks. This matters even more in Prolog: tokens are symbols, which is what Prolog is good at, and parsing tokens/symbols (not characters) is what DCG does best, since the semantics embedded in the grammar rules interface directly with Prolog, which is the desirable scenario.
