Are there any open source or commercial tools available that allow for text fragment indexing of database contents and can be queried from Java?
Background of the questi
A "real" full text index using parts of a word would be many times bigger than the source text and while the search may be faster any update or insert processing would be horibly slow.
You only hope is if there is some sort of pattern to the "mistakes' made. You could apply a set of "AI" type rules to the incoming text and produce cannonical form of the text which you could then apply a full text index to. An example for a rule could be to split a word ending in hammer into two words s/(\w?)(hammer)/\1 \2/g or to change "sledg" "sled" and "schledge" to "sledge". You would need to apply the same set of rules to the query text. In the way a product described as "sledgehammer" could be matched by a search for ' sledg hammer'.