My regex is causing a stack overflow in Java; what am I missing?

攒了一身酷 2021-01-18 13:32

I am attempting to use a regular expression with Scanner to match a string from a file. The regex works with all of the contents of the file except for this line:

         


        
4 Answers
  • 2021-01-18 13:41

    As the others have said, your regex is much less efficient than it should be. I'd take it a step further and use possessive quantifiers:

    "^([a-zA-Z]++) *+= *+\"([^\"]++)\"$"
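To illustrate what the possessive ++ changes, here is a minimal sketch. The class and method names are made up for this example and are not from the question. A possessive quantifier never gives back what it has consumed, so the engine has no backtracking positions to remember for it.

```java
import java.util.regex.Pattern;

class PossessiveDemo {
    // Greedy: a+ first grabs every 'a', then backtracks one so the final 'a' can match.
    static boolean greedyMatches(String s) {
        return Pattern.matches("a+a", s);
    }

    // Possessive: a++ grabs every 'a' and refuses to give one back, so the final 'a' fails.
    static boolean possessiveMatches(String s) {
        return Pattern.matches("a++a", s);
    }

    // The regex from this answer: possessive is safe here because each quantified
    // piece is followed by something it cannot itself match (e.g. [^"]++ then ").
    static boolean keyValueMatches(String s) {
        return Pattern.matches("^([a-zA-Z]++) *+= *+\"([^\"]++)\"$", s);
    }

    public static void main(String[] args) {
        System.out.println(greedyMatches("aaa"));               // true
        System.out.println(possessiveMatches("aaa"));           // false
        System.out.println(keyValueMatches("name = \"Rex\""));  // true
    }
}
```

The sample line name = "Rex" is an assumed input format, inferred from the regex itself.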
    

    But the way you're using the Scanner doesn't make much sense, either. There's no need to use findInLine(".*") to read the line; that's what nextLine() does. And you don't need to create another Scanner to apply your regex; just use a Matcher.

    static final Pattern ANIMAL_INFO_PATTERN = 
        Pattern.compile("^([a-zA-Z]++) *+= *+\"([^\"]++)\"$");
    

    ...

      Matcher lineMatcher = ANIMAL_INFO_PATTERN.matcher("");
      while (scanFile.hasNextLine()) {
        String currentLine = scanFile.nextLine();
        if (lineMatcher.reset(currentLine).matches()) {
          matches.put(lineMatcher.group(1), lineMatcher.group(2));
        }
      }
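
For anyone who wants to run the loop above end to end, here is a self-contained sketch. The class name, the parse method, the LinkedHashMap and the sample input are assumptions for illustration; the question does not show how matches or scanFile are declared.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AnimalInfoParser {
    static final Pattern ANIMAL_INFO_PATTERN =
        Pattern.compile("^([a-zA-Z]++) *+= *+\"([^\"]++)\"$");

    // Reads the input line by line and collects key/value pairs from matching lines.
    static Map<String, String> parse(Scanner scanFile) {
        Map<String, String> matches = new LinkedHashMap<>();
        // One Matcher is created up front and reset for every line.
        Matcher lineMatcher = ANIMAL_INFO_PATTERN.matcher("");
        while (scanFile.hasNextLine()) {
            String currentLine = scanFile.nextLine();
            if (lineMatcher.reset(currentLine).matches()) {
                matches.put(lineMatcher.group(1), lineMatcher.group(2));
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Hypothetical sample data standing in for the real file.
        String sample = "name = \"Rex\"\nnot a key-value line\nspecies=\"Canis lupus\"\n";
        try (Scanner scanFile = new Scanner(sample)) {
            System.out.println(parse(scanFile)); // {name=Rex, species=Canis lupus}
        }
    }
}
```

With a real file you would construct the Scanner from a File or Path instead of a String.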
    
  • 2021-01-18 13:44

    This looks like bug 5050507. I agree with Asaph that removing the alternation should help; the bug specifically says "Avoid alternation whenever possible". I think you can probably go even simpler:

    "^([a-zA-Z]+) *= *\"([^\"]+)"
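
One thing to watch with the regex above: it has no closing quote or end anchor, so it has to be applied with find() rather than matches(). A small sketch of the difference follows; the class name, helper method and sample line are made up for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UnanchoredRegexDemo {
    static final Pattern SIMPLE = Pattern.compile("^([a-zA-Z]+) *= *\"([^\"]+)");

    // Returns {key, value}, or null when the line does not start with key = "value.
    static String[] extract(String line) {
        Matcher m = SIMPLE.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String line = "name = \"Rex\"";
        // matches() must consume the whole line, but nothing in the regex
        // can consume the closing quote, so it reports false.
        System.out.println(SIMPLE.matcher(line).matches()); // false
        String[] kv = extract(line);
        System.out.println(kv[0] + " -> " + kv[1]);         // name -> Rex
    }
}
```

Because the closing quote is never checked, this version also accepts a line whose value is missing its final quote. Whether that is acceptable depends on how strict the parsing needs to be.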
    
  • 2021-01-18 13:51

    Try this simplified version of your regex. It removes some unnecessary | operators (which might have been causing the regex engine to do a lot of branching) and adds beginning-of-line and end-of-line anchors.

    static final String ANIMAL_INFO_REGEX = "^([a-zA-Z]+) *= *\"([a-zA-Z_. ]+)\"$";
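
To show what "removing the | operators" means in practice, here is a sketch that compares a character class with an alternation-based equivalent. The alternation pattern is a hypothetical reconstruction for illustration only, since the original regex is not shown in the question; the class name and sample lines are also made up.

```java
import java.util.regex.Pattern;

class AlternationVsCharClass {
    // Hypothetical "before": each allowed character is a separate alternative
    // inside a repeated group.
    static final Pattern WITH_ALTERNATION =
        Pattern.compile("^([a-zA-Z]+) *= *\"((?:[a-zA-Z]|_|\\.| )+)\"$");

    // The regex from this answer: the same characters folded into one class.
    static final Pattern WITH_CHAR_CLASS =
        Pattern.compile("^([a-zA-Z]+) *= *\"([a-zA-Z_. ]+)\"$");

    public static void main(String[] args) {
        String[] samples = {
            "name = \"Rex\"",
            "file=\"a_b.txt\"",
            "name = \"Rex!\"",
            "name = Rex"
        };
        for (String s : samples) {
            boolean alt = WITH_ALTERNATION.matcher(s).matches();
            boolean cls = WITH_CHAR_CLASS.matcher(s).matches();
            // Both patterns accept and reject the same short lines.
            System.out.println(s + " : alternation=" + alt + ", charClass=" + cls);
        }
    }
}
```

On short input the two behave the same. The difference only matters on long values, where a repeated group containing alternatives is the kind of construct that can drive the matcher into deep recursion.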
    
  • 2021-01-18 13:57

    Read http://www.regular-expressions.info/catastrophic.html to understand the problem, then use one of the other suggestions.
