reluctant-quantifiers

Writing a better regular expression that avoids the lazy repeat quantifier

风格不统一 posted on 2019-12-12 10:53:15
Question: I have a regular expression: (<select([^>]*>))(.*?)(</select\s*>) Because it uses a lazy repeat quantifier, on longer strings (more than 500 options) it backtracks more than 100,000 times and fails. Please help me find a better regular expression that doesn't use a lazy repeat quantifier.

Answer 1: <select[^>]*>[^<]*(?:<(?!/select>)[^<]*)*</select> ...or in human-readable form:
<select[^>]*>    # start tag
[^<]*            # anything except an opening bracket
(?:              # if you find an open bracket
<(?!/select>)
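The two patterns above can be compared side by side. The following is a minimal sketch in Python's `re` module, not the asker's original environment (which the question does not name); the `build_select` helper and the sample markup are made up for illustration, and `re.DOTALL` is an assumption so that `.` also crosses line breaks.

```python
import re

# Lazy pattern from the question. DOTALL is an assumption of this sketch,
# so that '.' can also match the newlines between options.
LAZY = re.compile(r"(<select([^>]*>))(.*?)(</select\s*>)", re.DOTALL)

# Compact pattern from the answer: consume runs of non-'<' characters and
# step over any '<' that does not start the closing tag.
UNROLLED = re.compile(r"<select[^>]*>[^<]*(?:<(?!/select>)[^<]*)*</select>")


def build_select(n_options):
    """Hypothetical helper: build a <select> element with n_options options."""
    options = "".join(
        f'<option value="{i}">Item {i}</option>\n' for i in range(n_options)
    )
    return f'<select name="demo">\n{options}</select>'


html = "<p>before</p>" + build_select(600) + "<p>after</p>"

lazy_match = LAZY.search(html)
unrolled_match = UNROLLED.search(html)

# Both patterns should select exactly the same stretch of text.
same_span = lazy_match.span() == unrolled_match.span()
print(same_span)
```

On this input both patterns find the same element. The difference is in how they get there: the lazy version tests the closing tag at every character, while the answer's version only pauses at each `<`.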

Java Regexp: UNGREEDY flag

℡╲_俬逩灬. posted on 2019-12-12 04:54:15
Question: I'd like to port a generic text processing tool, Texy!, from PHP to Java. This tool does ungreedy matching everywhere, using preg_match_all("/.../U"). So I am looking for a library that has an UNGREEDY flag. I know I could use the .*? syntax, but there are a great many regular expressions I would have to rewrite, and then re-check with every updated version. I've checked:
ORO - seems to be abandoned
Jakarta Regexp - no support
java.util.regex - no support
Is there any such library? Thanks,
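PCRE's /U modifier inverts quantifier greediness: quantifiers become lazy by default and a trailing ? makes them greedy again. When no such flag is available, one alternative to rewriting patterns by hand is a mechanical conversion. The sketch below is a hypothetical helper (`invert_greediness` is not part of any library), written in Python for illustration only. It assumes patterns without possessive quantifiers, `\Q...\E` quoting, or `(?#...)` comments, and it only recognises brace quantifiers of the form {m}, {m,} and {m,n}.

```python
import re

# Brace quantifiers handled by this sketch: {m}, {m,}, {m,n}.
_BRACE_QUANTIFIER = re.compile(r"\{\d+(?:,\d*)?\}")


def invert_greediness(pattern):
    """Hypothetical helper: flip greedy quantifiers to lazy and lazy to greedy."""
    out = []
    i = 0
    n = len(pattern)
    in_class = False
    while i < n:
        ch = pattern[i]
        if ch == "\\" and i + 1 < n:
            out.append(pattern[i:i + 2])  # escaped character: copy verbatim
            i += 2
            continue
        if in_class:
            if ch == "]":
                in_class = False
            out.append(ch)  # quantifier characters are literals inside [...]
            i += 1
            continue
        if ch == "[":
            in_class = True
            out.append(ch)
            i += 1
            # A leading '^' and a leading ']' belong to the class itself.
            if i < n and pattern[i] == "^":
                out.append("^")
                i += 1
            if i < n and pattern[i] == "]":
                out.append("]")
                i += 1
            continue
        if ch == "(" and pattern.startswith("?", i + 1):
            out.append("(?")  # group syntax such as (?: or (?=, not a quantifier
            i += 2
            continue
        quantifier = None
        if ch in "*+?":
            quantifier = ch
        elif ch == "{":
            m = _BRACE_QUANTIFIER.match(pattern, i)
            if m:
                quantifier = m.group(0)
        if quantifier is None:
            out.append(ch)
            i += 1
            continue
        i += len(quantifier)
        if i < n and pattern[i] == "?":
            out.append(quantifier)  # was lazy: make it greedy
            i += 1
        else:
            out.append(quantifier + "?")  # was greedy: make it lazy
    return "".join(out)


converted = invert_greediness(r"<b>.*</b>")
print(converted)
```

A conversion like this could run once over the pattern list at build time, which addresses the concern about re-checking every pattern after each upstream update. It would still need testing against the real Texy! patterns before being relied on.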

Regex: Is Lazy Worse?

喜欢而已 posted on 2019-12-10 01:23:39
Question: I have always written regexes like this: <A HREF="([^"]*)" TARGET="_blank">([^<]*)</A> but I just learned about lazy quantifiers and that I can write it like this: <A HREF="(.*?)" TARGET="_blank">(.*?)</A> Is there any disadvantage to using the second approach? The regex is definitely more compact (even SO parses it better). Edit: There are two best answers here, which point out two important differences between the expressions. ysth's answer points to a weakness in the non-greedy/lazy one, in
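One difference between the two expressions can be checked directly; it is offered here as an illustration and is not necessarily the point made in the truncated answers. A lazy `.*?` is free to run past a closing quote when the rest of the pattern fails there, while `[^"]*` cannot. A minimal sketch in Python, with made-up sample markup:

```python
import re

# The two patterns from the question.
NEGATED = re.compile(r'<A HREF="([^"]*)" TARGET="_blank">([^<]*)</A>')
LAZY = re.compile(r'<A HREF="(.*?)" TARGET="_blank">(.*?)</A>')

# Sample input (an assumption for this sketch): the first link has no
# TARGET attribute, the second one does.
html = '<A HREF="a.html" CLASS="x">A</A> <A HREF="b.html" TARGET="_blank">B</A>'

negated_groups = NEGATED.search(html).groups()
lazy_groups = LAZY.search(html).groups()

print(negated_groups)
print(lazy_groups)
```

The negated-class pattern rejects the first link and reports only the second. The lazy pattern starts at the first link and keeps extending its first group across the quote, the tag boundary, and the next link until `" TARGET="_blank">` finally matches.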