Efficient algorithm for converting a character set into an NFA/DFA

慢半拍i 2021-02-14 06:29

I'm currently working on a scanner generator. The generator already works fine, but when character classes are used the algorithm becomes very slow.

The scanner generator pr

5 answers
  •  遥遥无期
    2021-02-14 07:05

    I had the same problem with my scanner generator, so I came up with the idea of replacing intervals with ids, which are determined using an interval tree. For instance, the a..z range in a DFA can be represented as 97, 98, 99, ..., 122. Instead, I represent each range as [97, 122] and build an interval tree out of the ranges, so in the end every range is represented by an id that refers to its entry in the interval tree. Given the following RE: a..z+, we end up with this DFA:

    0 -> a -> 1
    0 -> b -> 1
    0 -> c -> 1
    0 -> ... -> 1
    0 -> z -> 1
    
    1 -> a -> 1
    1 -> b -> 1
    1 -> c -> 1
    1 -> ... -> 1
    1 -> z -> 1
    1 -> E -> ACCEPT
    

    Now compress the intervals:

    0 -> a..z -> 1
    
    1 -> a..z -> 1
    1 -> E -> ACCEPT
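
    The compression step can be sketched roughly as follows. This is a minimal illustration, not the answerer's actual code: the function name compress_transitions and the (src, char, dst) tuple layout are assumptions made for the example, and the E -> ACCEPT edge is left out because it carries no character.

```python
from collections import defaultdict

def compress_transitions(transitions):
    """Merge per-character edges (src, char, dst) into (src, (lo, hi), dst) ranges."""
    # Collect the code points of every (src, dst) pair.
    grouped = defaultdict(list)
    for src, ch, dst in transitions:
        grouped[(src, dst)].append(ord(ch))

    compressed = []
    for (src, dst), codes in sorted(grouped.items()):
        codes = sorted(set(codes))
        lo = hi = codes[0]
        for code in codes[1:]:
            if code == hi + 1:
                hi = code  # consecutive code point: extend the current run
            else:
                compressed.append((src, (lo, hi), dst))
                lo = hi = code  # gap found: start a new run
        compressed.append((src, (lo, hi), dst))
    return compressed

# The a..z+ automaton with one edge per character (hypothetical input layout).
letters = [chr(c) for c in range(ord("a"), ord("z") + 1)]
edges = [(0, ch, 1) for ch in letters] + [(1, ch, 1) for ch in letters]
print(compress_transitions(edges))  # [(0, (97, 122), 1), (1, (97, 122), 1)]
```

    The 52 single-character edges collapse into two range edges, which is what keeps the later automaton construction steps from iterating over every character of a class.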
    

    Extract all intervals from your DFA and build an interval tree out of them:

    {
        "left": null,
        "middle": {
            "id": 0,
            "interval": ["a", "z"]
        },
        "right": null
    }
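
    A tree of this left/middle/right shape could be built and queried along these lines. This sketch assumes the intervals are already disjoint (overlapping character classes would have to be split first) and stores them as code-point pairs; build_tree and find_id are hypothetical names, and ids are simply assigned in sorted order.

```python
def build_tree(intervals):
    """Build a balanced tree from disjoint (lo, hi) intervals; ids follow sorted order."""
    ordered = sorted(set(intervals))

    def build(first, last):
        if first > last:
            return None
        mid = (first + last) // 2
        return {
            "left": build(first, mid - 1),
            "middle": {"id": mid, "interval": ordered[mid]},
            "right": build(mid + 1, last),
        }

    return build(0, len(ordered) - 1)

def find_id(node, code):
    """Return the id of the interval containing code, or None if there is none."""
    while node is not None:
        lo, hi = node["middle"]["interval"]
        if code < lo:
            node = node["left"]
        elif code > hi:
            node = node["right"]
        else:
            return node["middle"]["id"]
    return None

tree = build_tree([(ord("a"), ord("z"))])
print(find_id(tree, ord("m")))  # 0
print(find_id(tree, ord("A")))  # None
```

    With disjoint intervals a lookup visits one node per tree level, so classifying an input character costs O(log n) in the number of intervals rather than a scan over every character of every class.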
    

    Replace the actual intervals with their ids:

    0 -> 0 -> 1
    1 -> 0 -> 1
    1 -> E -> ACCEPT
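
    To show how the id-based transitions might then be used, here is a hedged sketch that rewrites range edges to ids and runs the automaton on some input. The names index_intervals and matches are made up for this example. Two assumptions are worth flagging: the E edge is read as "accept at end of input", and a sorted list with binary search (Python's bisect module) stands in for the interval tree, which only works because the intervals are assumed to be disjoint.

```python
from bisect import bisect_right

def index_intervals(transitions):
    """Assign an id to each distinct (lo, hi) label and rewrite edges to use it."""
    intervals = sorted({label for _, label, _ in transitions})
    ids = {interval: i for i, interval in enumerate(intervals)}
    rewritten = [(src, ids[label], dst) for src, label, dst in transitions]
    return intervals, rewritten

def matches(text, intervals, rewritten, start, accepting):
    """Run the DFA over text, classifying each character by interval id."""
    table = {(src, sym): dst for src, sym, dst in rewritten}
    starts = [lo for lo, _ in intervals]
    state = start
    for ch in text:
        code = ord(ch)
        pos = bisect_right(starts, code) - 1  # last interval starting at or before code
        if pos < 0 or code > intervals[pos][1]:
            return False  # character belongs to no interval
        state = table.get((state, pos))
        if state is None:
            return False  # no transition for this interval id
    return state in accepting

# Range edges of the a..z+ automaton; state 1 accepts at end of input (assumed reading of E).
edges = [(0, (97, 122), 1), (1, (97, 122), 1)]
intervals, rewritten = index_intervals(edges)
print(rewritten)                                         # [(0, 0, 1), (1, 0, 1)]
print(matches("hello", intervals, rewritten, 0, {1}))    # True
print(matches("Hello", intervals, rewritten, 0, {1}))    # False
```

    The transition table is now keyed by a small integer id instead of by character, so its size depends on the number of distinct intervals, not on the size of the alphabet.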
    
