Are regular expressions worth the hassle?

说谎 2020-12-03 13:58

It strikes me that regular expressions are not understood well by the majority of developers. It also strikes me that for a lot of problems where regular expressions are use

24 answers
  • 2020-12-03 14:11

    I would just like to add that unit testing is the ideal way to make your regular expressions maintainable. I consider Regex an essential developer skill that is always a practical alternative to writing many lines of string manipulation code.

  • 2020-12-03 14:12

    In my opinion, it makes more sense to enforce better practices around regular expressions than to forgo them altogether.

    • Always comment your regular expressions. You might know what one does now, but someone else might not, and even you might not remember in two weeks. The comments should be descriptive and state exactly what the regular expression is meant to do.
    • Use unit testing. Create unit tests for your regular expressions. That way you have a degree of assurance about the reliability and correctness of each expression, and if the regex is later modified, the tests help ensure that the change does not break existing functionality.
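
    As a rough sketch of both points, here is a commented pattern with a few tests next to it. The pattern, the names `ISO_DATE` and `looks_like_iso_date`, and the sample inputs are all made up for illustration; the pattern only checks the shape YYYY-MM-DD, not whether the date actually exists.

```python
import re
import unittest

# Hypothetical pattern: checks the *shape* YYYY-MM-DD only.
# It does not verify that the month or day is in a valid range.
ISO_DATE = re.compile(
    r"""
    (?P<year>\d{4})   # four-digit year
    -
    (?P<month>\d{2})  # two-digit month
    -
    (?P<day>\d{2})    # two-digit day
    """,
    re.VERBOSE,
)


def looks_like_iso_date(text: str) -> bool:
    """Return True if the whole string has the shape YYYY-MM-DD."""
    return ISO_DATE.fullmatch(text) is not None


class IsoDatePatternTest(unittest.TestCase):
    def test_accepts_well_formed_dates(self):
        for value in ("2020-12-03", "1999-01-31"):
            self.assertTrue(looks_like_iso_date(value), value)

    def test_rejects_malformed_dates(self):
        for value in ("", "2020-12-3", "20-12-03", "2020/12/03", "2020-12-03 13:58"):
            self.assertFalse(looks_like_iso_date(value), value)
```

    Saved as a module, the tests can be run with `python -m unittest`. If someone later edits the pattern, a failing case points at exactly which input changed behaviour.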

    Using regular expressions has some advantages:

    • Time. You don't have to write your own code to do exactly what is built in.
    • Maintainability. You have to maintain only a couple of lines as opposed to 30 or 300.
    • Performance. The matching code in the regex engine is already optimized.
    • Reliability. If your regex is correct, it should function correctly.
    • Flexibility. Regex gives you a lot of power, which is very useful if used properly.
  • 2020-12-03 14:13

    You raise a very good point with regard to maintainability. Regular expressions can require some deciphering to understand, but I doubt the code that would replace them would be any easier to maintain. Regular expressions are VERY powerful and a valuable tool. Use them, but use them carefully, and think about how to make the intent of each regular expression clear.

    Regards

  • 2020-12-03 14:15

    Regular expressions are a domain-specific language: no generic programming language is quite as expressive or quite as efficient at doing what regular expressions do with string matching. The sheer size of the lump of code you will have to write in a standard programming language (even one with a good string library) will make it harder to maintain. It is also good separation of concerns to make sure that the regular expression only does the matching. Having a code blob that basically does matching, but does something else in between, can produce some surprising bugs.
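
    To illustrate the separation-of-concerns point, here is a minimal sketch in which the pattern only recognises a line and a separate step converts the captured text. The `key = number` line format and the names `SETTING`, `Setting` and `parse_setting` are assumptions invented for this example.

```python
import re
from typing import NamedTuple, Optional

# Matching only: describes the shape of a line such as "timeout = 30".
# (Hypothetical format, chosen just for this sketch.)
SETTING = re.compile(r"\s*(?P<key>[A-Za-z_]\w*)\s*=\s*(?P<value>\d+)\s*")


class Setting(NamedTuple):
    key: str
    value: int


def parse_setting(line: str) -> Optional[Setting]:
    """Return a Setting for a well-formed line, or None otherwise."""
    match = SETTING.fullmatch(line)      # step 1: the regex does the matching
    if match is None:
        return None
    # step 2: conversion happens outside the pattern, in ordinary code
    return Setting(match["key"], int(match["value"]))


print(parse_setting("timeout = 30"))
print(parse_setting("timeout = soon"))
```

    Because the two steps are separate, the pattern can be tested on its own, and the conversion logic can change without touching the regex.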

    Also note that there are mechanisms to make regular expressions more readable. In Python you can enable verbose mode, which allows you to write things like this:

    import re

    a = re.compile(r"""\d +  # the integral part
                       \.    # the decimal point
                       \d *  # some fractional digits""", re.X)
    

    Another possibility is to build the regular expression up from strings, line by line, and comment each line, like this:

    a = re.compile(r"\d+"  # the integral part
                   r"\."   # the decimal point
                   r"\d*"  # some fractional digits
                   )
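
    One detail to keep in mind when moving a pattern between these two styles: whitespace inside the pattern is only ignored in verbose mode. Without `re.X`, a space is a literal character to match. A small sketch (the variable names are just for illustration):

```python
import re

# Verbose mode: the space is ignored, so this behaves like \d*
verbose = re.compile(r"\d *", re.VERBOSE)

# Default mode: the space is literal, so this means
# "one digit followed by zero or more spaces"
plain = re.compile(r"\d *")

print(verbose.fullmatch("123") is not None)  # True
print(plain.fullmatch("123") is not None)    # False
print(plain.fullmatch("1   ") is not None)   # True
```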
    

    This is possible in different ways in most programming languages. My advice is to keep using regular expressions where appropriate, but treat them as you would any other code: write them as clearly as possible, comment them, and test them.

  • 2020-12-03 14:15

    Surely all code needs to be optimized where possible!

    In the context where code need not be optimized, and the logic will need to be maintained then it is down to the skill set of the team.

    If the bulk of the team responsible for the code is regex-savvy, then do it with a regex. Otherwise, write it in the way the team is likely to be most comfortable with.

  • 2020-12-03 14:17

    I'm thinking in terms of maintenance of the code rather than straight-line execution time.

    Code size is the single most important factor in reducing maintainability.

    And while regexps can be very hard to decipher, so are 50-line string processing methods - and the latter are more likely to contain bugs in rare corner cases.
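
    For a feel of the size difference, here is a small sketch that checks the same thing twice: once by hand and once with a regexp. The task (digits, a dot, then optional digits) and both function names are just an example I picked; real string processing methods tend to be longer than this.

```python
import re

DIGITS = "0123456789"


def is_decimal_by_hand(text: str) -> bool:
    """Hand-written check: digits, a dot, then optional digits."""
    i = 0
    n = len(text)
    # integral part: at least one digit
    while i < n and text[i] in DIGITS:
        i += 1
    if i == 0:
        return False
    # the decimal point must come next
    if i == n or text[i] != ".":
        return False
    i += 1
    # fractional part: zero or more digits, with nothing after them
    while i < n and text[i] in DIGITS:
        i += 1
    return i == n


# The same rule as a pattern ([0-9] rather than \d, so that both
# versions accept ASCII digits only).
DECIMAL = re.compile(r"[0-9]+\.[0-9]*")


def is_decimal_by_regex(text: str) -> bool:
    return DECIMAL.fullmatch(text) is not None
```

    Each branch in the hand-written version (empty input, missing dot, trailing characters) is a corner case that has to be remembered and tested separately.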

    The thing is: any non-trivial regexp must be commented just as thoroughly as you'd comment a 50-line method.
