Why does std::regex_iterator cause a stack overflow with this data?

猫巷女王i 2021-01-18 05:42

I've been using std::regex_iterator to parse log files. My program has been working quite nicely for some weeks and has parsed millions of log lines, until to

2 Answers
  •  暖寄归人
    2021-01-18 06:20

    Negative lookahead patterns that are tested on every character seem like a bad idea to me, and what you're trying to do is not complicated. You want to match (1) the rest of the line, followed by (2) any number of (3) lines that start with something other than L\d (this has a small bug; see below). Note that these are regexes; if you want to write them as string literals, you need to change each \ to \\.

     .*\n(?:(?:[^L]|L\D).*\n)*
     |   |  |
     +-1 |  +---------------3
         +---------------------2
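
As a rough illustration of the note about string literals, here is a minimal sketch of the same pattern written both as an ordinary literal (with each \ doubled) and as a C++11 raw string literal. The names kEscaped, kRaw, and match_rest_of_entry are hypothetical, chosen only for this example, and the sample input it is meant for deliberately contains no blank lines.

```cpp
#include <regex>
#include <string>

// The same pattern written two ways (names are illustrative only).
// Ordinary literal: every backslash in the regex is doubled.
const std::string kEscaped = ".*\\n(?:(?:[^L]|L\\D).*\\n)*";
// Raw string literal: the regex is written exactly as it appears in the answer.
const std::string kRaw = R"(.*\n(?:(?:[^L]|L\D).*\n)*)";

// Returns the text the pattern matches at the very start of `tail`:
// the rest of the current line plus any following lines that do not
// start with 'L' followed by a digit. Returns an empty string if there
// is no match at position 0.
std::string match_rest_of_entry(const std::string& tail) {
    static const std::regex re(kRaw);  // default grammar is ECMAScript
    std::smatch m;
    // match_continuous forces the match to begin at the first character.
    if (std::regex_search(tail, m, re, std::regex_constants::match_continuous))
        return m.str();
    return {};
}
```

Both strings hold identical characters, so either can be passed to the std::regex constructor; the raw form is usually easier to compare against a pattern quoted in prose.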
    

    In ECMAScript mode, . should not match \n, but you could always replace the two .s in that expression with [^\n].

    Edited to add: I realize that the pattern above may not work if there is a blank line just before the end of the log entry, but the following version should cover that case. I also changed . to [^\n] for extra precision:

     [^\n]*\n(?:(?:(?:[^L\n]|L\D)[^\n]*)?\n)*
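
To show how this pattern might be driven by std::regex_iterator, here is a minimal sketch. The question's own pattern is not visible here, so the L\d prefix that marks the start of an entry and the function name split_entries are assumptions made for illustration. The sketch only demonstrates what the pattern matches on a small input; it makes no claim about stack usage on very large entries.

```cpp
#include <regex>
#include <string>
#include <vector>

// Splits a log into entries. ASSUMPTION: an entry starts with 'L' followed
// by a digit; that prefix is hypothetical and is prepended to the pattern
// from the answer, which covers the rest of the entry.
std::vector<std::string> split_entries(const std::string& log) {
    static const std::regex entry(
        R"(L\d[^\n]*\n(?:(?:(?:[^L\n]|L\D)[^\n]*)?\n)*)");
    std::vector<std::string> out;
    // Each match ends right after a newline, so the next search begins
    // at the start of a line.
    for (std::sregex_iterator it(log.begin(), log.end(), entry), end;
         it != end; ++it) {
        out.push_back(it->str());
    }
    return out;
}
```

With the input "L1 first\n  cont\n\nL2 second\nLx not header\nL3 third\n", the blank line stays attached to the first entry and the Lx line stays attached to the second, giving three entries in total.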
    
