I'm trying to read a text file and split it into individual words using the StringTokenizer utility in Java.
The text file looks like this:
a 2000
4
b
You need to use the hasMoreTokens() method. I've also addressed various coding-standard issues in your code, as pointed out by JB Nizet:
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.StringTokenizer;

public class TestStringTokenizer {

    /**
     * @param args
     * @throws IOException
     */
    public static void main(String[] args) throws IOException {
        String fileSpecified = args[0];
        fileSpecified = fileSpecified.concat(".txt");
        String line;

        System.out.println("file specified = " + fileSpecified);

        ArrayList<String> words = new ArrayList<String>();
        BufferedReader br = new BufferedReader(new FileReader(fileSpecified));
        try {
            while ((line = br.readLine()) != null) {
                StringTokenizer token = new StringTokenizer(line);
                while (token.hasMoreTokens()) {
                    words.add(token.nextToken());
                }
            }
        } catch (IOException e) {
            System.out.println(e.getMessage());
            e.printStackTrace();
        } finally {
            br.close();
        }

        for (int i = 0; i < words.size(); i++) {
            System.out.println("words = " + words.get(i));
        }
    }
}
a) You always have to check StringTokenizer.hasMoreTokens() first; throwing a NoSuchElementException when no more tokens are available is the documented behaviour:
StringTokenizer token = new StringTokenizer(line);
while (token.hasMoreTokens()) {
    words.add(token.nextToken());
}
b) Don't create a new StringTokenizer for every line unless your file is too large to fit into memory. Read the entire file into a String and let a single tokenizer work on that.
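A minimal sketch of that whole-file approach (the class name and the "input.txt" filename are placeholders, and it assumes the file fits comfortably in memory):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class WholeFileTokenizer {

    public static List<String> tokenize(String text) {
        // One tokenizer for the whole text; the default delimiter set
        // includes \n, so line breaks are treated just like spaces.
        List<String> words = new ArrayList<String>();
        StringTokenizer tokenizer = new StringTokenizer(text);
        while (tokenizer.hasMoreTokens()) {
            words.add(tokenizer.nextToken());
        }
        return words;
    }

    public static void main(String[] args) throws IOException {
        // "input.txt" is a placeholder filename
        String text = new String(Files.readAllBytes(Paths.get("input.txt")));
        System.out.println(tokenize(text));
    }
}
```

With the sample input from the question ("a 2000", a blank line, "4", "b"), tokenize returns the four words in order; blank lines simply contribute no tokens.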
This problem is due to the fact that you don't test whether there is a next token before trying to get it. You should always check that hasMoreTokens() returns true before calling nextToken(). But you have other bugs, too.
Your general approach seems sound, but there is a basic problem in your code.
Your parser is most likely failing on the second line of your input file: that line is blank, so when you call words.add(token.nextToken());
you get an error, because there are no tokens. This also means you'll only ever get the first token on each line.
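As a quick standalone illustration of that failure mode (not the asker's code): a blank line produces a tokenizer with no tokens, and calling nextToken() on it throws the documented exception.

```java
import java.util.NoSuchElementException;
import java.util.StringTokenizer;

public class BlankLineDemo {
    public static void main(String[] args) {
        // A blank line from the input file yields an empty tokenizer
        StringTokenizer token = new StringTokenizer("");

        System.out.println(token.hasMoreTokens()); // prints "false"

        try {
            // Calling nextToken() without checking first is the bug
            token.nextToken();
        } catch (NoSuchElementException e) {
            System.out.println("NoSuchElementException, as documented");
        }
    }
}
```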
You should iterate over the tokens like this:
while (token.hasMoreTokens()) {
    words.add(token.nextToken());
}
You can find a more general example in the javadocs here:
http://download.oracle.com/javase/1.4.2/docs/api/java/util/StringTokenizer.html