Hashtable/dictionary/map lookup with regular expressions

难免孤独 2021-02-01 05:36

I'm trying to figure out if there's a reasonably efficient way to perform a lookup in a dictionary (or a hash, or a map, or whatever your favorite language calls it) where the

19 answers
  •  情歌与酒
    2021-02-01 06:17

    What you want to do is very similar to what xrdb supports, although xrdb only supports a fairly minimal notion of globbing.

    Internally you can implement a larger family of regular languages than theirs by storing your regular expressions as a character trie.

    • single characters just become trie nodes.
    • .'s become wildcard insertions covering all children of the current trie node.
    • *'s become back links in the trie to the node at the start of the previous item.
    • [a-z] ranges insert the same subsequent child nodes repeatedly under each of the characters in the range. With care, although inserts and updates may be somewhat expensive, the search can be linear in the size of the string. With some placeholder stuff, the common combinatorial explosion cases can be kept under control.
    • (foo)|(bar) alternations become multiple insertions.
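The mapping above can be sketched roughly as follows. This is a minimal illustration under several assumptions, not the answer author's implementation: the names `Node`, `PatternTrie`, `tokenize`, `insert` and `lookup` are hypothetical, `.` is stored as one dedicated wildcard edge rather than being copied under every existing child, only top-level alternation is handled, and `*` back links are left out entirely because they turn the trie into a graph and need extra care when patterns share nodes. Escaping is not supported.

```python
class Node:
    """One trie node; `values` holds payloads of patterns that end here."""

    def __init__(self):
        self.children = {}    # literal character -> Node
        self.wildcard = None  # Node reached by '.', i.e. by any character
        self.values = []


def tokenize(pattern):
    """Split a pattern into tokens: the string '.' or a set of allowed characters."""
    tokens, i = [], 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == ".":
            tokens.append(".")
            i += 1
        elif ch == "[":
            end = pattern.index("]", i)
            body, chars, j = pattern[i + 1:end], set(), 0
            while j < len(body):
                if j + 2 < len(body) and body[j + 1] == "-":
                    # Expand a range such as a-z into its member characters.
                    chars.update(chr(c) for c in range(ord(body[j]), ord(body[j + 2]) + 1))
                    j += 3
                else:
                    chars.add(body[j])
                    j += 1
            tokens.append(chars)
            i = end + 1
        else:
            tokens.append({ch})
            i += 1
    return tokens


class PatternTrie:
    def __init__(self):
        self.root = Node()

    def insert(self, pattern, value):
        # Top-level alternation becomes one insertion per alternative.
        for alt in pattern.split("|"):
            if alt.startswith("(") and alt.endswith(")"):
                alt = alt[1:-1]
            frontier = [self.root]
            for token in tokenize(alt):
                next_frontier = []
                for node in frontier:
                    if token == ".":
                        if node.wildcard is None:
                            node.wildcard = Node()
                        next_frontier.append(node.wildcard)
                    else:
                        # A range repeats the rest of the pattern under each character.
                        for ch in token:
                            next_frontier.append(node.children.setdefault(ch, Node()))
                frontier = next_frontier
            for node in frontier:
                node.values.append(value)

    def lookup(self, text):
        """Return the values of all stored patterns that match the whole of `text`."""
        active = [self.root]
        for ch in text:
            next_active = []
            for node in active:
                child = node.children.get(ch)
                if child is not None:
                    next_active.append(child)
                if node.wildcard is not None:
                    next_active.append(node.wildcard)
            if not next_active:
                return []
            active = next_active
        return [v for node in active for v in node.values]


trie = PatternTrie()
trie.insert("cat", "literal")
trie.insert("c.t", "dot")
trie.insert("[a-c]at", "range")
trie.insert("(foo)|(bar)", "alt")

print(sorted(trie.lookup("cat")))  # ['dot', 'literal', 'range']
print(trie.lookup("cut"))          # ['dot']
print(trie.lookup("bar"))          # ['alt']
```

In this sketch the lookup makes one pass over the input string, but it may track several active nodes at once, so the cost per character depends on how many stored patterns overlap. The range insertion shows where the combinatorial growth comes from: each character in the range gets its own copy of the remaining pattern.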

    This doesn't handle regexes that occur at arbitrary points in the string, but that can be modeled by wrapping your regex with .* on either side.
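As a quick illustration of the wrapping trick, here is a small sketch that uses Python's standard `re` module as a stand-in for a whole-string matcher (it is not the trie itself). The helper name `search_via_anchored` is hypothetical, and the non-capturing group is an added precaution so that alternation inside the pattern keeps its meaning once the `.*` wrappers are attached.

```python
import re


def search_via_anchored(pattern, text):
    """Emulate an unanchored search using only whole-string matching."""
    wrapped = ".*(?:" + pattern + ").*"
    # DOTALL lets '.' cross newlines, so the wrappers can absorb any prefix or suffix.
    return re.fullmatch(wrapped, text, flags=re.DOTALL) is not None


# The bare pattern fails as a whole-string match, the wrapped one succeeds.
print(re.fullmatch("b[a-c]r", "foobarbaz") is not None)  # False
print(search_via_anchored("b[a-c]r", "foobarbaz"))       # True
```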

    Perl has a couple of Text::Trie-like modules you can raid for ideas. (Heck, I think I even wrote one of them way back when.)
