non-greedy

Why is this simple .*? non-greedy regex being greedy?

China☆狼群 submitted on 2019-11-26 06:09:03
Question: I have a very simple regex similar to this:
HOHO.*?_HO_
With this test string:
fiwgu_HOHO_HOHO_HOHOrgh_HOHO_feh_HOHO___HO_fbguyev
I expect it to match just _HOHO___HO_ (the shortest match, non-greedy). Instead it matches _HOHO_HOHO_HOHOrgh_HOHO_feh_HOHO___HO_ (the longest match, which looks greedy). Why? How can I make it return the shortest match? Adding or removing the ? gives the same result.
Edit: a better test string that shows why [^HOHO] doesn't work:
fiwgu_HOHO_HOHO_HOHOrgh_HOHO_feh_HOHO_H_O_H_O
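The behavior described in this question can be reproduced with a minimal Python sketch. It assumes Python's re module behaves like the engine the asker used (the excerpt does not name it), and it uses the pattern exactly as printed here, which starts at HOHO, so the results lack the leading underscore quoted in the question. A lazy quantifier only limits how far a match extends to the right; the engine still starts at the leftmost position where a match is possible. The second pattern is one common workaround, a negative lookahead that forbids another HOHO inside the match. It is an illustration, not something taken from the original post.

```python
import re

text = "fiwgu_HOHO_HOHO_HOHOrgh_HOHO_feh_HOHO___HO_fbguyev"

# Lazy quantifier: the match still begins at the leftmost "HOHO",
# then grows one character at a time until "_HO_" can follow.
lazy = re.search(r"HOHO.*?_HO_", text).group(0)

# Assumed workaround (not from the post): each consumed character
# must not be the start of another "HOHO".
tempered = re.search(r"HOHO(?:(?!HOHO).)*?_HO_", text).group(0)

print(lazy)      # HOHO_HOHO_HOHOrgh_HOHO_feh_HOHO___HO_
print(tempered)  # HOHO___HO_
```

The first result is long not because .*? turned greedy, but because no shorter match exists from that leftmost starting point.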

How can I write a regex which matches non greedy? [duplicate]

早过忘川 submitted on 2019-11-26 03:16:50
Question: This question already has answers here: How do I match any character across multiple lines in a regular expression? (23 answers). Closed 3 months ago.
I need help with non-greedy regular expression matching. The match pattern is:
<img\s.*>
The text to match is:
<html> <img src="test"> abc <img src="a" src='a' a=b> </html>
I tested it on http://regexpal.com. This expression matches all text from <img to the last >. I need it to match up to the first > encountered after the initial
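As a hedged illustration of the pattern in this question, the Python sketch below compares the greedy pattern with two common alternatives: a lazy quantifier and a negated character class. The variable names are made up for the example, and the sample text is kept on a single line, as it appears in the excerpt.

```python
import re

html = '<html> <img src="test"> abc <img src="a" src=\'a\' a=b> </html>'

# Greedy: .* runs to the end of the line, then backs off to the last ">".
greedy = re.findall(r"<img\s.*>", html)

# Lazy: .*? stops at the first ">" that lets the pattern succeed.
lazy = re.findall(r"<img\s.*?>", html)

# Negated class: never steps over a ">", so it cannot overshoot.
negated = re.findall(r"<img\s[^>]*>", html)

print(greedy)   # one long match ending at </html>
print(lazy)     # two matches, one per <img ...> tag
print(negated)  # same two matches
```

In Python, . does not match a newline unless re.DOTALL is set, whereas [^>] does match newlines, so the negated class also handles tags that span several lines.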

What do 'lazy' and 'greedy' mean in the context of regular expressions?

这一生的挚爱 submitted on 2019-11-25 21:35:15
Question: Could someone explain these two terms in an understandable way?
Answer 1: Greedy will consume as much as possible. From http://www.regular-expressions.info/repeat.html we see the example of trying to match HTML tags with <.+>. Suppose you have the following:
<em>Hello World</em>
You may think that <.+> (. means any non-newline character and + means one or more) would only match the <em> and the </em>, when in reality it will be very greedy and go from the first < to the last >. This means it
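The <.+> example from this answer can be checked with a short Python sketch (Python is an assumption here; the answer does not name a regex engine). It contrasts the greedy + with its lazy form +?.

```python
import re

text = "<em>Hello World</em>"

# Greedy: .+ takes as much as it can, then backtracks to the last ">".
greedy = re.search(r"<.+>", text).group(0)

# Lazy: .+? takes as little as it can, stopping at the first ">".
lazy = re.search(r"<.+?>", text).group(0)

# With the lazy form, each tag is found separately.
tags = re.findall(r"<.+?>", text)

print(greedy)  # <em>Hello World</em>
print(lazy)    # <em>
print(tags)    # ['<em>', '</em>']
```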