I am writing a piece of code in which I have to find only complete words. For example, if I have
String str = "today is tuesday";
and I'm
String[] tokens = str.split(" ");
for (String s : tokens) {
    if ("t".equals(s)) {
        // "t" exists as a complete word
        break;
    }
}
String[] words = str.split(" ");
Arrays.sort(words);
// binarySearch returns a non-negative index only if searchedFor is present
boolean found = Arrays.binarySearch(words, searchedFor) >= 0;
String sentence = "Today is Tuesday";
Set<String> words = new HashSet<String>(
Arrays.asList(sentence.split(" "))
);
System.out.println(words.contains("Tue")); // prints "false"
System.out.println(words.contains("Tuesday")); // prints "true"
Each contains(word) query is O(1) on average, so, short of implementing your own sophisticated dictionary data structure, this is the fastest practical solution if you have many words to look for in a text.
This uses String.split to separate the words of the sentence on the " " delimiter. Another possible variation, depending on how the problem is defined, is to use \b, the word boundary anchor. The problem is considerably more difficult if you must take every grammatical feature of natural languages into consideration (e.g. "can't" is split by \b into "can" and "t").
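As a rough illustration of the \b variation, here is a minimal sketch that tests for a complete word with a regex instead of splitting. The class name WholeWordRegex and the helper containsWholeWord are made up for this example, and wrapping the search word in Pattern.quote is an extra precaution that the answer itself does not mention.

```java
import java.util.regex.Pattern;

class WholeWordRegex {
    // Hypothetical helper: true if `word` occurs in `text` with a
    // \b word boundary on both sides, i.e. as a complete word.
    static boolean containsWholeWord(String text, String word) {
        // Pattern.quote (an assumption, not from the answer) keeps regex
        // metacharacters in the search word from being interpreted.
        Pattern p = Pattern.compile("\\b" + Pattern.quote(word) + "\\b");
        return p.matcher(text).find();
    }

    public static void main(String[] args) {
        String str = "today is tuesday";
        System.out.println(containsWholeWord(str, "t"));       // prints "false"
        System.out.println(containsWholeWord(str, "tuesday")); // prints "true"
        // The apostrophe counts as a boundary, so "t" matches inside "can't".
        System.out.println(containsWholeWord("I can't", "t")); // prints "true"
    }
}
```

The last call shows the caveat about natural language: \b treats the apostrophe as a separator, so the "t" in "can't" is reported as a complete word.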
Case insensitivity can be easily introduced by using the traditional case normalization trick: split and hash sentence.toLowerCase() instead, and see if it contains(word.toLowerCase()).
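The case normalization trick could look roughly like the sketch below. The class and method names are invented for illustration, and passing Locale.ROOT to toLowerCase is an assumption added here to keep the result independent of the default locale; the plain toLowerCase() from the answer works the same way for ASCII text in most locales.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

class CaseInsensitiveWords {
    // Hypothetical helper: lower-case the sentence, split on " ", and hash the words.
    static Set<String> wordSet(String sentence) {
        String normalized = sentence.toLowerCase(Locale.ROOT);
        return new HashSet<>(Arrays.asList(normalized.split(" ")));
    }

    // Hypothetical helper: normalize the query the same way before the lookup.
    static boolean containsWord(Set<String> words, String word) {
        return words.contains(word.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        Set<String> words = wordSet("Today is Tuesday");
        System.out.println(containsWord(words, "TUESDAY")); // prints "true"
        System.out.println(containsWord(words, "tue"));     // prints "false"
    }
}
```

Building the set once and reusing it keeps each later lookup at the same average O(1) cost as the case-sensitive version.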