I am just learning that language and was wondering what a more experienced Java programmer would do in the following situation?
I would like to create a java program
The Java API does offer the java.util.Scanner class, which will allow you to scan across an input file.
Depending on how you intend to use this, however, it might not be the best idea. Is the file very large? Are you searching only one file, or are you trying to keep a database of many files and search within those? In the latter case, you might want to use a more fleshed-out search engine such as Lucene.
As others have pointed out, you could use the Scanner class.
I put your question in a file, data.txt, and ran the following program:
import java.io.*;
import java.util.Scanner;
import java.util.regex.MatchResult;

public class Test {
    public static void main(String[] args) throws FileNotFoundException {
        Scanner s = new Scanner(new File("data.txt"));
        // A horizon of 0 means "no bound": keep searching to the end of the input.
        while (null != s.findWithinHorizon("(?i)\\bjava\\b", 0)) {
            MatchResult mr = s.match();
            System.out.printf("Word found: %s at index %d to %d.%n", mr.group(),
                    mr.start(), mr.end());
        }
        s.close();
    }
}
The output is:
Word found: Java at index 74 to 78.
Word found: java at index 153 to 157.
Word found: Java at index 279 to 283.
The pattern searched for, (?i)\bjava\b, means the following:
(?i) turns on the case-insensitive switch
\b means a word boundary
java is the string searched for
\b is a word boundary again.
If the search term comes from the user, or if for some other reason it may contain special characters, I suggest you put \Q and \E around the string, as they quote all characters in between (and if you're really picky, make sure the input doesn't contain \E itself).
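Rather than adding \Q and \E by hand, you could let Pattern.quote do the quoting; it also copes with a term that itself contains \E. Here is a minimal sketch of that idea. The class and method names (QuotedSearch, findWord) are made up for illustration and are not part of the program above:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
class QuotedSearch {
    // Returns the start index of every whole-word, case-insensitive
    // occurrence of term in text, treating term as a literal string.
    static List<Integer> findWord(String text, String term) {
        // Pattern.quote escapes the term so regex metacharacters are matched literally.
        Pattern p = Pattern.compile("\\b" + Pattern.quote(term) + "\\b",
                Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(text);
        List<Integer> starts = new ArrayList<>();
        while (m.find()) {
            starts.add(m.start());
        }
        return starts;
    }

    public static void main(String[] args) {
        // "javascript" is skipped because of the trailing word boundary.
        System.out.println(findWord("Java and java.util, not javascript", "java")); // prints [0, 9]
        // The dot is matched literally, so "axb" is not a hit.
        System.out.println(findWord("axb a.b", "a.b")); // prints [4]
    }
}
```

One thing to keep in mind: \b only matches next to a word character, so a term that starts or ends with punctuation (say, c++) will probably need a different boundary check.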
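Unless the file is very large, I would read it into a string and search that (IOUtils here is presumably the one from Apache Commons IO):
String text = IOUtils.toString(new FileReader(filename));
// matches() only succeeds if the entire text is the word, so use find() to search within it.
// Pattern is java.util.regex.Pattern; quote() guards against special characters in word.
boolean foundWord = Pattern.compile("\\b" + Pattern.quote(word) + "\\b").matcher(text).find();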
To get the text between occurrences of your word, you can use split(), and then use the lengths of the resulting strings to work out the position of each occurrence.
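The split() idea could look roughly like this. It is only a sketch under a few assumptions: the search is case-sensitive, the word is matched as a whole word, and the names SplitPositions and positions are invented for the example:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
class SplitPositions {
    // Returns the start index of each whole-word occurrence of word in text,
    // computed from the lengths of the pieces that split() produces.
    static List<Integer> positions(String text, String word) {
        String regex = "\\b" + Pattern.quote(word) + "\\b";
        // A limit of -1 keeps trailing empty strings, so a match at the very end is not lost.
        String[] parts = text.split(regex, -1);
        List<Integer> result = new ArrayList<>();
        int offset = 0;
        // Every piece except the last one is followed by exactly one occurrence of the word.
        for (int i = 0; i < parts.length - 1; i++) {
            offset += parts[i].length(); // skip the text before the occurrence
            result.add(offset);          // the occurrence starts here
            offset += word.length();     // skip the occurrence itself
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(positions("java is not javascript, but java", "java")); // prints [0, 28]
    }
}
```

The pieces in parts are also the text between the occurrences, if that is what you are after. For a plain "where is the word" search, a Matcher loop with find() and start() is probably simpler than going through split().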