Question
I just had a homework assignment that asked me to add all the Java keywords to a HashSet, then read in a .java file and count how many times any keyword appeared in it.
The route I took was: I created a String[] array that contained all the keywords, created a HashSet, and used Collections.addAll to add the array to the HashSet. Then, as I iterated through the text file, I checked each word with HashSet.contains(currentWordFromFile).
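The approach described above might look roughly like the sketch below. It is an illustration only, not the pastebin code: the class name `KeywordSetSketch`, the deliberately abbreviated keyword list, and the `\\W+` tokenization are all assumptions made for the example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class KeywordSetSketch {

    // Abbreviated on purpose; a real run would list every Java keyword.
    static final String[] KEYWORDS = { "class", "int", "public", "return", "static", "void" };

    // Counts how many tokens in the given lines are members of the keyword set.
    static int countKeywords(List<String> lines) {
        Set<String> keywords = new HashSet<>();
        Collections.addAll(keywords, KEYWORDS);

        int count = 0;
        for (String line : lines) {
            // Split on runs of non-word characters to get candidate words.
            for (String word : line.split("\\W+")) {
                if (keywords.contains(word)) {
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "public class Demo {",
                "    static int twice(int x) { return 2 * x; }",
                "}");
        System.out.println(countKeywords(lines)); // prints 6
    }
}
```

Splitting on non-word characters is a crude tokenizer: it also counts keywords that appear inside comments and string literals.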
Someone recommended using a Hashtable for this. Then I saw a similar example using a TreeSet. I'm just curious: what's the recommended way to do this?
(Complete code here: http://pastebin.com/GdDmCWj0 )
Answer 1:
Try a Map<String, Integer>
where the String is the word and the Integer is the number of times the word has been seen.
One benefit of this is that you do not need to process the file twice.
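A minimal sketch of that idea, assuming the input is already available as a list of lines and that words are separated by non-word characters (the class name `WordCountSketch` and the method `countWords` are made up for this example):

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class WordCountSketch {

    // Builds a word -> occurrence count map in a single pass over the lines.
    static Map<String, Integer> countWords(List<String> lines) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lines) {
            for (String word : line.split("\\W+")) {
                if (!word.isEmpty()) {
                    // Insert 1 for a new word, otherwise add 1 to the old count.
                    counts.merge(word, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = countWords(Arrays.asList(
                "int a = 1;",
                "int b = a;"));
        // Any keyword can be looked up afterwards without rereading the input.
        System.out.println("int: " + counts.getOrDefault("int", 0));     // int: 2
        System.out.println("while: " + counts.getOrDefault("while", 0)); // while: 0
    }
}
```

Counting every word costs more memory than tracking only the keywords, but the per-keyword totals are then available from one pass.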
Answer 2:
You said "had a homework assignment" so I'm assuming you're done with this.
I would do it a bit differently. Firstly, I think some of the keywords in your String array were incorrect. According to Wikipedia and Oracle, Java has 50 keywords. Anyway, I've commented my code fairly well. Here's what I came up with...
import java.io.BufferedReader;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class CountKeywords {

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("Usage: java CountKeywords <file.java>");
            return;
        }

        // the 50 keywords, plus true, false, and null, which are reserved
        // words but technically literals rather than keywords
        String[] theKeywords = { "abstract", "assert", "boolean", "break", "byte", "case", "catch", "char", "class", "const", "continue", "default", "do", "double", "else", "enum", "extends", "false", "final", "finally", "float", "for", "goto", "if", "implements", "import", "instanceof", "int", "interface", "long", "native", "new", "null", "package", "private", "protected", "public", "return", "short", "static", "strictfp", "super", "switch", "synchronized", "this", "throw", "throws", "transient", "true", "try", "void", "volatile", "while" };

        // put each keyword in the map with value 0
        Map<String, Integer> theKeywordCount = new HashMap<String, Integer>();
        for (String str : theKeywords) {
            theKeywordCount.put(str, 0);
        }

        File file = new File(args[0]);

        // attempt to open and read the file; try-with-resources (Java 7+)
        // closes the reader even if an exception is thrown
        try (BufferedReader br = new BufferedReader(new FileReader(file))) {
            String sLine;
            // read lines until reaching the end of the file
            while ((sLine = br.readLine()) != null) {
                // extract the words from the current line by splitting on runs
                // of non-word characters; note that this also counts keywords
                // inside comments and string literals
                for (String word : sLine.split("\\W+")) {
                    if (theKeywordCount.containsKey(word)) {
                        theKeywordCount.put(word, theKeywordCount.get(word) + 1);
                    }
                }
            }
        } catch (FileNotFoundException exception) {
            // Unable to find file.
            exception.printStackTrace();
        } catch (IOException exception) {
            // Unable to read line.
            exception.printStackTrace();
        }

        // count how many times any keyword was encountered
        int occurrences = 0;
        for (Integer i : theKeywordCount.values()) {
            occurrences += i;
        }
        System.out.println("\n\nTotal occurrences in file: " + occurrences);
    }
}

Every time I encounter a word from the file, I first check whether it's in the Map. If it isn't, it's not a keyword; if it is, I update the value the keyword is associated with, i.e., I increment the associated Integer by 1 because we've seen this keyword once more.
Alternatively, you could get rid of that last for loop and just keep a running count (declared before the read loop), so you would instead have...

    if (theKeywordCount.containsKey(word)) {
        occurrences++;
    }

... and you print out the counter at the end.
I don't know if this is the most efficient way to do this, but I think it's a solid start.
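On the HashSet versus TreeSet part of the question: if only the total is needed, the membership check looks the same for either, as the sketch below tries to show. The class name `SetChoiceSketch`, the method `countMatches`, and the three-word keyword list are assumptions for illustration. `HashSet.contains` runs in expected constant time and `TreeSet.contains` in O(log n), so with only about 50 keywords the difference is unlikely to be measurable.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class SetChoiceSketch {

    // Counts the words that are members of the given set; works for any Set.
    static int countMatches(Set<String> keywords, List<String> words) {
        int count = 0;
        for (String word : words) {
            if (keywords.contains(word)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList("if", "else", "while"); // abbreviated list
        List<String> words = Arrays.asList("if", "x", "while", "y", "if");

        // HashSet: expected constant-time contains().
        System.out.println(countMatches(new HashSet<>(sample), words)); // prints 3
        // TreeSet: O(log n) contains(), but keeps the keywords sorted.
        System.out.println(countMatches(new TreeSet<>(sample), words)); // prints 3
    }
}
```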
Let me know if you have any questions. I hope this helps.
Hristo
Source: https://stackoverflow.com/questions/5799693/most-efficient-way-to-check-file-for-list-of-words