Levenshtein DFA in .NET

广开言路 2021-02-04 18:43

Good afternoon,

Does anyone know of an "out-of-the-box" implementation of a Levenshtein DFA (deterministic finite automaton) in .NET (or easily translatable to i

6 answers
  •  野性不改
    2021-02-04 19:14

    We implemented this for Apache Lucene (Java); perhaps you could convert it to C# and save yourself some time.

    The main class is linked below. It's just a builder that produces Levenshtein DFAs from a string, using the Schulz and Mihov algorithm.

    http://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/src/java/org/apache/lucene/util/automaton/LevenshteinAutomata.java
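To make the idea of "a DFA built from a string" concrete, here is a minimal Python sketch. It is not the Schulz and Mihov construction and not the Lucene code: it uses the much simpler (and slower to build) approach of treating each capped edit-distance DP row as a DFA state and exploring the reachable rows. The names `build_lev_dfa`, `step`, `accepts` and `OTHER` are made up for this illustration.

```python
from collections import deque

OTHER = None  # stands for any character that does not occur in the word


def step(word, n, row, ch):
    """Consume ch and return the next edit-distance DP row, capped at n + 1."""
    new = [min(row[0] + 1, n + 1)]
    for i, wc in enumerate(word):
        cost = 0 if wc == ch else 1
        new.append(min(new[i] + 1, row[i] + cost, row[i + 1] + 1, n + 1))
    return tuple(new)


def build_lev_dfa(word, n):
    """Build a DFA accepting every string within edit distance n of word.

    Returns (start, transitions, accepting, word_chars). States are DP rows.
    """
    word_chars = set(word)
    alphabet = sorted(word_chars) + [OTHER]
    start = tuple(min(i, n + 1) for i in range(len(word) + 1))
    transitions, accepting = {}, set()
    queue, seen = deque([start]), {start}
    while queue:
        row = queue.popleft()
        if row[-1] <= n:  # last cell = distance to the whole word
            accepting.add(row)
        for ch in alphabet:
            nxt = step(word, n, row, ch)
            transitions[(row, ch)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return start, transitions, accepting, word_chars


def accepts(dfa, text):
    """Run the DFA over text, one table lookup per character."""
    start, transitions, accepting, word_chars = dfa
    state = start
    for ch in text:
        state = transitions[(state, ch if ch in word_chars else OTHER)]
    return state in accepting
```

With `dfa = build_lev_dfa("food", 1)`, the call `accepts(dfa, "fod")` is true and `accepts(dfa, "fo")` is false. The parametric approach gets to an equivalent automaton without computing DP rows per word, which is why precomputed tables are involved.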

    The parametric descriptions (the precomputed tables) for Lev1 and Lev2 are here:

    http://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/src/java/org/apache/lucene/util/automaton/Lev1ParametricDescription.java

    http://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/src/java/org/apache/lucene/util/automaton/Lev2ParametricDescription.java

    You might notice that these files are machine-generated. We generated them with this script, which uses Jean-Philippe Barrette's great moman implementation (Python):

    http://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/src/java/org/apache/lucene/util/automaton/createLevAutomata.py

    We generate the parametric descriptions as packed long[] arrays so that they don't make our jar file too large.
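As an illustration of what "packed" can mean here, the following Python sketch stores small table entries at a fixed number of bits each inside 64-bit words, and reads them back by index. The bit layout and the names `pack` and `unpack` are assumptions made for this example; the generated Lucene files may lay out their bits differently.

```python
MASK64 = (1 << 64) - 1


def pack(values, bits_per_value):
    """Pack non-negative ints, bits_per_value bits each (1..64), into 64-bit words."""
    total_bits = len(values) * bits_per_value
    words = [0] * ((total_bits + 63) // 64)
    for index, value in enumerate(values):
        if value >> bits_per_value:
            raise ValueError(f"{value} does not fit in {bits_per_value} bits")
        word, offset = divmod(index * bits_per_value, 64)
        words[word] |= (value << offset) & MASK64
        if offset + bits_per_value > 64:  # value straddles two words
            words[word + 1] |= value >> (64 - offset)
    return words


def unpack(words, index, bits_per_value):
    """Read back the value stored at position index."""
    word, offset = divmod(index * bits_per_value, 64)
    value = words[word] >> offset
    if offset + bits_per_value > 64:  # pick up the high bits from the next word
        value |= words[word + 1] << (64 - offset)
    return value & ((1 << bits_per_value) - 1)
```

Thirty-two 5-bit entries fit in three words this way, instead of thirty-two separate array elements, which is the kind of saving that matters for tables shipped inside a jar.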

    Just modify toAutomaton(int n) to fit your needs and your DFA package. In our case we are using a modified form of the brics automaton package, in which transitions are represented as Unicode code point ranges.
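For readers whose DFA package uses per-character transitions, here is a rough Python sketch of the range-based alternative. It is not the brics or Lucene API: `RangeState`, `add_transition`, `run` and `complement_ranges` are hypothetical names, and the sketch assumes the caller never adds overlapping ranges to one state.

```python
from bisect import bisect_right

MAX_CODE_POINT = 0x10FFFF


class RangeState:
    """A DFA state whose outgoing transitions are inclusive code point ranges."""

    def __init__(self, accept=False):
        self.accept = accept
        self.starts, self.ends, self.targets = [], [], []

    def add_transition(self, lo, hi, target):
        """Add [lo, hi] -> target, keeping ranges sorted (no overlap check)."""
        i = bisect_right(self.starts, lo)
        self.starts.insert(i, lo)
        self.ends.insert(i, hi)
        self.targets.insert(i, target)

    def step(self, code_point):
        """Binary-search the range containing code_point; None if there is none."""
        i = bisect_right(self.starts, code_point) - 1
        if i >= 0 and code_point <= self.ends[i]:
            return self.targets[i]
        return None


def run(start, text):
    state = start
    for ch in text:
        state = state.step(ord(ch))
        if state is None:
            return False
    return state.accept


def complement_ranges(chars):
    """Ranges covering every code point that is NOT in chars."""
    ranges, lo = [], 0
    for cp in sorted({ord(c) for c in chars}):
        if cp > lo:
            ranges.append((lo, cp - 1))
        lo = cp + 1
    if lo <= MAX_CODE_POINT:
        ranges.append((lo, MAX_CODE_POINT))
    return ranges


# Example automaton: "c", then any single code point, then "t".
s0, s1, s2, s3 = RangeState(), RangeState(), RangeState(), RangeState(accept=True)
s0.add_transition(ord("c"), ord("c"), s1)
s1.add_transition(0, MAX_CODE_POINT, s2)
s2.add_transition(ord("t"), ord("t"), s3)
```

Ranges matter for a Levenshtein DFA because most transitions mean "any character other than the few in the query word". `complement_ranges` turns that set into a handful of ranges rather than roughly a million single-character edges.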

    Efficient unit tests are difficult for this sort of thing, but here is what we came up with. It seems to be thorough, and it even found a bug in the moman implementation (which the author fixed immediately!).

    http://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/src/test/org/apache/lucene/util/automaton/TestLevenshteinAutomata.java
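One generic way to test a port (not necessarily what the linked test does) is to compare the automaton against a plain dynamic-programming edit distance on every string over a small alphabet up to some length. The Python sketch below assumes your automaton can be wrapped as a callable `accepts(text)`; `find_mismatches` and `hamming_only` are hypothetical names, and `hamming_only` is a deliberately incomplete acceptor included only to show the check catching a bug.

```python
from itertools import product


def edit_distance(a, b):
    """Reference Levenshtein distance, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def find_mismatches(accepts, word, n, alphabet, max_len):
    """Return every string (up to max_len) on which accepts disagrees with the reference."""
    mismatches = []
    for length in range(max_len + 1):
        for chars in product(alphabet, repeat=length):
            text = "".join(chars)
            expected = edit_distance(word, text) <= n
            if accepts(text) != expected:
                mismatches.append(text)
    return mismatches


def hamming_only(word, n):
    """Buggy acceptor: allows substitutions but ignores insertions and deletions."""
    def accepts(text):
        return len(text) == len(word) and sum(a != b for a, b in zip(word, text)) <= n
    return accepts
```

Calling `find_mismatches(hamming_only("ab", 1), "ab", 1, "ab", 3)` reports the strings the buggy acceptor gets wrong, such as "a" and "aab". The exhaustive loop grows exponentially with `max_len`, so this only works with tiny alphabets and short strings, which is part of why efficient tests are hard here.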
