How to reduce complexity in a regex?

面向向阳花 2021-01-25 23:29

I have a regex that finds all kinds of money amounts denoted in dollars, such as $290, USD240, $234.45, 234.5$, and 234.6usd.

(\\$)[0-9]+\\.?([0-9]         


        
3 Answers
  •  生来不讨喜
    2021-01-25 23:55

    The regex can be made a bit shorter by collapsing the currency indicators:
    instead of matching "USD amount OR $ amount", match "(USD OR $) amount". This results in the following regex:

    ((\$|usd)[0-9]+\.?([0-9]*))|([0-9]+\.?[0-9]*(\$|usd))
    

    I'm not sure if you'll find this less complex, but at least it's shorter and therefore easier to read.
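
    As a quick check, the collapsed regex can be run against the sample amounts from the question. This is only a minimal sketch in Python: the helper name find_amounts and the samples string are made up for illustration, and the re.IGNORECASE flag is an assumption added here because the pattern spells the currency as lowercase usd while the question also lists USD240.

```python
import re

# Collapsed pattern from the answer: currency marker before OR after the amount.
# re.IGNORECASE is an assumption so that "USD240" matches the lowercase "usd".
PATTERN = re.compile(
    r"((\$|usd)[0-9]+\.?([0-9]*))|([0-9]+\.?[0-9]*(\$|usd))",
    re.IGNORECASE,
)


def find_amounts(text):
    """Return every substring of text matched by PATTERN, in order."""
    return [m.group(0) for m in PATTERN.finditer(text)]


samples = "$290, USD240, $234.45, 234.5$, 234.6usd"
print(find_amounts(samples))
# ['$290', 'USD240', '$234.45', '234.5$', '234.6usd']
```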

    The character set [0-9] can also be replaced by \d, the character class that matches any digit, which makes the regex even shorter.
    With that change, the regex looks as follows:

    ((\$|usd)\d+\.?\d*)|(\d+\.?\d*(\$|usd))
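
    One thing to verify before swapping [0-9] for \d is that both forms accept the same input in your engine. The sketch below assumes Python 3, where \d in a str pattern also matches non-ASCII decimal digits unless re.ASCII is set; other engines may behave differently. The names LONG, SHORT, SHORT_ASCII and matched are illustrative only.

```python
import re

LONG = re.compile(r"((\$|usd)[0-9]+\.?([0-9]*))|([0-9]+\.?[0-9]*(\$|usd))")
SHORT = re.compile(r"((\$|usd)\d+\.?\d*)|(\d+\.?\d*(\$|usd))")
# Same pattern, but \d restricted to ASCII 0-9 (Python-specific flag).
SHORT_ASCII = re.compile(SHORT.pattern, re.ASCII)


def matched(pattern, text):
    """Return every substring of text matched by pattern, in order."""
    return [m.group(0) for m in pattern.finditer(text)]


# On plain ASCII input both spellings find the same amounts.
text = "$290 $234.45 234.5$ 234.6usd"
print(matched(LONG, text) == matched(SHORT, text))  # True

# Arabic-Indic digits 3 and 4 after a dollar sign.
arabic = "$\u0663\u0664"
print(matched(LONG, arabic))         # []
print(matched(SHORT, arabic))        # one match: the whole string
print(matched(SHORT_ASCII, arabic))  # []
```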
    

    Update:

    • According to @Toto, this regex performs better with non-capturing groups (I also removed the unnecessary capture group, as pointed out by @Simon MᶜKenzie):

      (?:\$|usd)\d+\.?\d*|\d+\.?\d*(?:\$|usd)
      
    • Amounts like $.0 are not matched by the regex, as @Gangnus pointed out. I updated the regex to fix this:

      ((\$|usd)((\d+\.?\d*)|(\.\d+)))|(((\d+\.?\d*)|(\.\d+))(\$|usd))
      

      Note that I changed \d+\.?\d* into ((\d+\.?\d*)|(\.\d+)). It now matches either one or more digits, optionally followed by a dot and zero or more digits, or a dot followed by one or more digits.

      The same regex without the unnecessary capturing groups, using non-capturing groups instead:

      (?:\$|usd)(?:\d+\.?\d*|\.\d+)|(?:\d+\.?\d*|\.\d+)(?:\$|usd)
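
    To illustrate the two updates, the sketch below compares the capturing and non-capturing versions of the final regex with the earlier non-capturing one. It assumes Python's re module; the names CAPTURING, NON_CAPTURING, EARLIER and matched, as well as the test string, are made up for this example. It shows that both final versions find the same amounts, that only the capturing one allocates groups, and that the earlier regex misses or truncates amounts that start with a dot.

```python
import re

CAPTURING = re.compile(
    r"((\$|usd)((\d+\.?\d*)|(\.\d+)))|(((\d+\.?\d*)|(\.\d+))(\$|usd))"
)
NON_CAPTURING = re.compile(
    r"(?:\$|usd)(?:\d+\.?\d*|\.\d+)|(?:\d+\.?\d*|\.\d+)(?:\$|usd)"
)
# Earlier version, before amounts with a leading dot were handled.
EARLIER = re.compile(r"(?:\$|usd)\d+\.?\d*|\d+\.?\d*(?:\$|usd)")


def matched(pattern, text):
    """Return every substring of text matched by pattern, in order."""
    return [m.group(0) for m in pattern.finditer(text)]


text = "$.0 .75usd $290 234.5$"

print(matched(NON_CAPTURING, text))  # ['$.0', '.75usd', '$290', '234.5$']
print(matched(CAPTURING, text) == matched(NON_CAPTURING, text))  # True

# Number of capture groups each pattern defines.
print(CAPTURING.groups, NON_CAPTURING.groups)  # 10 0

# The earlier regex skips "$.0" and only matches "75usd" out of ".75usd".
print(matched(EARLIER, text))  # ['75usd', '$290', '234.5$']
```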
      
