Why does checking this string with Regex.IsMatch cause CPU to reach 100%?

执念已碎 2021-02-19 02:13

When using Regex.IsMatch (C#, .NET 4.5) on a specific string, CPU usage reaches 100%.

String:

https://www.facebook.com/CashKingPirates/ph         


        
3 answers
  • 2021-02-19 02:20

    I suggest testing your regular expression on http://regexr.com/.

    The corrected version of your regular expression is this:

    ^(https?://(?:[\w]+\.?[\w]+)+[\w]/?)([\w\./]+)(\?[\w-=&%]+)?$
    

    It also has 3 groups:

    1. group1 = main URL (for example: facebook.com)
    2. group2 = sub-URLs (for example: /CashKingPirates/photos/a.197028616990372.62904.196982426994991/1186500984709792/)
    3. group3 = variables (for example: ?type=1&permPage=1)

    Also remember that to match a literal dot (.) in your regular expression, you must use \. rather than .

  • 2021-02-19 02:23

    As nu11p01n73R pointed out, your regular expression causes a lot of backtracking. That's because several parts of the expression can match the same text, which gives the engine many alternatives to try before it can settle on a result.

    You can avoid this by making the individual sections of the regular expression more specific. In your case, the cause is that you wanted to match a literal dot but used the match-any character . instead. You should escape it as \.
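To illustrate the difference between the unescaped and the escaped dot, here is a minimal C# sketch. The class name `DotEscapeDemo`, the helper `Matches`, and the sample inputs are hypothetical and chosen only for illustration; they do not come from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DotEscapeDemo
{
    // Hypothetical helper: thin wrapper around Regex.IsMatch.
    public static bool Matches(string input, string pattern)
    {
        return Regex.IsMatch(input, pattern);
    }

    public static void Main()
    {
        // Hypothetical input with "X" where a dot would be expected.
        string input = "wwwXexample";

        // Unescaped dot: matches any character except a newline, so "X" is accepted.
        Console.WriteLine(Matches(input, @"^www.example$"));          // True

        // Escaped dot: matches only a literal ".", so "X" is rejected.
        Console.WriteLine(Matches(input, @"^www\.example$"));         // False

        // The escaped pattern still accepts a real dot.
        Console.WriteLine(Matches("www.example", @"^www\.example$")); // True
    }
}
```

Because the unescaped dot also matches the word characters and hyphens that the neighbouring character class accepts, the engine can split the same text between the two in many ways, which is what drives the backtracking.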

    Escaping the dot should already greatly reduce the amount of backtracking and make the match fast:

    ^http(s)?://([\w-]+\.)+[\w-]+(/[\w- ./?%&=])?$
    

    And if you want to actually match the original string, you need to add a quantifier to the character class at the end:

    ^http(s)?://([\w-]+\.)+[\w-]+(/[\w- ./?%&=]+)?$
                                               ↑
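As a rough check of the two patterns in this answer, the following C# sketch runs both against a sample URL. The URL, the class name `UrlPatternDemo`, the helper `IsMatchWithTimeout`, and the one-second budget are assumptions made for illustration. The match timeout is not part of the answer; it is an optional safety net that uses the `Regex.IsMatch` overload taking a `TimeSpan`, available since .NET 4.5.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UrlPatternDemo
{
    // Pattern from the answer, without a quantifier on the final character class.
    public const string WithoutQuantifier =
        @"^http(s)?://([\w-]+\.)+[\w-]+(/[\w- ./?%&=])?$";

    // Same pattern with "+" added after the final character class.
    public const string WithQuantifier =
        @"^http(s)?://([\w-]+\.)+[\w-]+(/[\w- ./?%&=]+)?$";

    // Hypothetical helper: treats a match that exceeds the (assumed)
    // one-second budget as "no match" instead of letting it spin the CPU.
    public static bool IsMatchWithTimeout(string input, string pattern)
    {
        try
        {
            return Regex.IsMatch(input, pattern, RegexOptions.None,
                                 TimeSpan.FromSeconds(1));
        }
        catch (RegexMatchTimeoutException)
        {
            return false;
        }
    }

    public static void Main()
    {
        // Hypothetical URL, not the string from the question.
        string url = "https://www.example.com/some/path?type=1";

        // Only one character is allowed after the slash, so the long path fails.
        Console.WriteLine(IsMatchWithTimeout(url, WithoutQuantifier)); // False

        // With the quantifier, the whole path and query string are accepted.
        Console.WriteLine(IsMatchWithTimeout(url, WithQuantifier));    // True
    }
}
```

Returning false on a timeout is only one possible policy; depending on the application, logging the input or rejecting the request explicitly may be more appropriate.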
    
  • 2021-02-19 02:29

    Your regex suffers from catastrophic backtracking. You can simply use:

    ^http(s)?://([\w.-])+(/[\w ./?%&=-]+)*$
    

    See demo.

    https://regex101.com/r/cK4iV0/15
