问题
I am looking for a Java implementation of the Generalized Suffix Tree (GST) with the following features:
After the creation of the GST from say 1000 strings I would like find out how many of these 1000 strings contains some other string 's'.
The search must be quiet fast, as I need to apply the search on about 100'000 candidate strings of average length 10.
回答1:
Try The Semantic Discovery Toolkit. It has an implementation on text/src/java/org/sd/text/radixtree
回答2:
There is a Java implementation of a Non-General Suffix Tree is available at: http://illya-keeplearning.blogspot.com/2009/04/suffix-trees-java-ukkonens-algorithm.html
回答3:
I created a suffix tree in Java that allows you easily add your own search functionality and other matching algorithms. My blog post, Suffix Trees in Java, has an overview as well as instructions for downloaded the latest version. My Java implementation is based on Mark Nelson's Fast String Searching With Suffix Trees article.
Update 2016-06-18
- The library containing the suffix tree implementation described above is now available at https://bitbucket.org/globalmentor/globalmentor-core .
- The latest version of the library is available from Maven Central at http://search.maven.org/#search%7Cga%7C1%7Cg%3A%22com.globalmentor%22%20AND%20a%3A%22globalmentor-core%22 .
回答4:
You can find an implementation of a Generalized Suffix Tree in Java here. I tried to document it as much as I could, so you might find it useful.
回答5:
Here is my implementation of SuffixTree: https://github.com/losvald/sglj/blob/master/src/main/java/org/sglj/util/PATTrie.java
Among other things, it supports storing arbitrary data in nodes, and finding the set of values associated with the prefix.
来源:https://stackoverflow.com/questions/969448/generalized-suffix-tree-java-implementation