Error whilst using StringTokenizer on text file with multiple lines

我在风中等你 2021-01-07 11:11

I'm trying to read a text file and split it into individual words using the StringTokenizer utility in Java.

The text file looks like this:

a 2000

4  
b         


        
4 Answers
  • 2021-01-07 11:41

    You need to use the hasMoreTokens() method. I've also addressed the various coding-standard issues in your code that JB Nizet pointed out:

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.StringTokenizer;

    public class TestStringTokenizer {

        /**
         * @param args the file name (without extension) to read
         * @throws IOException if the file cannot be opened
         */
        public static void main(String[] args) throws IOException {
            String fileSpecified = args[0];

            fileSpecified = fileSpecified.concat(".txt");
            String line;
            System.out.println("file Specified = " + fileSpecified);

            ArrayList<String> words = new ArrayList<String>();

            BufferedReader br = new BufferedReader(new FileReader(fileSpecified));
            try {
                // Tokenize every line; a blank line simply yields no tokens.
                while ((line = br.readLine()) != null) {
                    StringTokenizer token = new StringTokenizer(line);
                    while (token.hasMoreTokens()) {
                        words.add(token.nextToken());
                    }
                }
            } catch (IOException e) {
                System.out.println(e.getMessage());
                e.printStackTrace();
            } finally {
                br.close();
            }

            for (int i = 0; i < words.size(); i++) {
                System.out.println("words = " + words.get(i));
            }
        }
    }
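
    If the sample file from the question is saved as, say, input.txt, running java TestStringTokenizer input should print file Specified = input.txt followed by one words = ... line per token (a, 2000, 4 and b); the blank lines are simply skipped.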
    
  • 2021-01-07 11:52

    a) You always have to check StringTokenizer.hasMoreTokens() first. Throwing NoSuchElementException is the documented behaviour if no more tokens are available:

    token = new StringTokenizer (line);
    while(token.hasMoreTokens())
        words.add(token.nextToken());
    

    b) Don't create a new StringTokenizer for every line unless your file is too large to fit into memory. Read the entire file into a String and let a single tokenizer work on that, for example:
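
    A minimal sketch of that approach (variable names like fileSpecified and words are illustrative, following the code in the other answer):

    // Read the whole file into one String, then tokenize it once.
    StringBuilder contents = new StringBuilder();
    BufferedReader reader = new BufferedReader(new FileReader(fileSpecified));
    try {
        String line;
        while ((line = reader.readLine()) != null) {
            contents.append(line).append(' ');  // the space keeps words from adjacent lines apart
        }
    } finally {
        reader.close();
    }

    // One tokenizer for the whole file; blank lines simply contribute no tokens.
    StringTokenizer tokenizer = new StringTokenizer(contents.toString());
    while (tokenizer.hasMoreTokens()) {
        words.add(tokenizer.nextToken());
    }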

  • 2021-01-07 11:56

    This problem is due to the fact that you don't check whether there is a next token before trying to get the next token. You should always check that hasMoreTokens() returns true before calling nextToken().

    But you have other bugs:

    • The first line is read, but not tokenized
    • You only add the first word of each line to your list of words
    • Bad practice: the token variable should be declared inside the loop, not outside
    • You don't close your reader in a finally block (see the sketch after this list)
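
    A minimal sketch of the reading loop with those points applied (names such as br, words and fileSpecified are borrowed from the other answer; it uses try-with-resources, which closes the reader automatically):

    try (BufferedReader br = new BufferedReader(new FileReader(fileSpecified))) {
        String line;
        while ((line = br.readLine()) != null) {
            StringTokenizer token = new StringTokenizer(line);  // declared inside the loop
            while (token.hasMoreTokens()) {
                words.add(token.nextToken());                   // add every token, not just the first
            }
        }
    }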
  • 2021-01-07 12:03

    Your general approach seems sound, but you have a basic problem in your code.

    Your parser is most likely failing on the second line of your input file. That line is blank, so when you call words.add(token.nextToken()) you get an error because there are no tokens. Your current code also means you'll only ever get the first token of each line.

    You should iterate over the tokens like this:

    while(token.hasMoreTokens())
    {
        words.add(token.nextToken());
    }
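
    With that check in place, the blank second line simply contributes no tokens instead of throwing a NoSuchElementException, and every word on each line ends up in words.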
    

    You can find a more general example in the javadocs here:

    http://download.oracle.com/javase/1.4.2/docs/api/java/util/StringTokenizer.html
