Why are people using regexp for email and other complex validation?

感情败类 2020-12-16 16:01

There are a number of email regexp questions popping up here, and I'm honestly baffled why people are using these insanely obtuse matching expressions rather than a very si

12 Answers
  • 2020-12-16 16:38

    Regexps are much faster to use, of course, and they only validate what's specified in the RFC. Write a custom parser? What? It takes 10 seconds to use a regexp.

  • 2020-12-16 16:41

    One factor: the set of people who understand how to write a regular expression is very much larger than the set of people who understand the formal constraints on regular languages. The same goes for non-regular "regular expressions".

  • 2020-12-16 16:45

    I don't believe correct email validation can be done with a single regular expression (now there's a challenge!). One of the issues is that comments can be nested to an arbitrary depth in both the local part and the domain.

    If you want to validate an address against RFCs 5322 and 5321 (the current standards) then you'll need a procedural function to do so.
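To illustrate why nesting pushes this toward procedural code, here is a minimal sketch of only the comment-handling step. The function name `strip_comments` is hypothetical, and the sketch assumes a simplification: it ignores quoted strings, which a real RFC 5322 validator would also have to track. It is not a validator, just a depth counter of the kind a classical regular expression cannot express.

```python
from typing import Optional


def strip_comments(text: str) -> Optional[str]:
    """Remove parenthesized comments, which may nest to any depth.

    Returns the text with comments removed, or None if the
    parentheses are unbalanced. Simplification (assumption):
    quoted strings are not treated specially.
    """
    depth = 0          # current comment nesting depth
    out = []           # characters that sit outside every comment
    i = 0
    while i < len(text):
        ch = text[i]
        if depth > 0 and ch == "\\" and i + 1 < len(text):
            # Inside a comment a backslash escapes the next character,
            # so an escaped parenthesis does not change the depth.
            i += 2
            continue
        if ch == "(":
            depth += 1
        elif ch == ")":
            if depth == 0:
                return None  # closing parenthesis with no opener
            depth -= 1
        elif depth == 0:
            out.append(ch)
        i += 1
    return "".join(out) if depth == 0 else None


print(strip_comments("john(a (nested) comment)@example.com"))
print(strip_comments("john(unclosed@example.com"))
```

The counter `depth` is the piece that needs memory beyond a fixed pattern; everything else a full validator does would be layered on top of a loop like this.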

    Fortunately, this is a commodity problem. Everybody wants the same result: RFC compliance. There's no need for anybody to write this code ever again once it's been solved by an open source function.

    Check out some of the alternatives here: http://www.dominicsayers.com/isemail/

    If you know of another function that I can add to the head-to-head, let me know.

  • 2020-12-16 16:47

    People do it because in most languages it is way easier to write a regexp than to write and use a parser in your code (or so it seems, at least).

    If you decide to eschew regexes, you will have to either write parsers by hand or resort to external tools (like yacc) for lexer/parser generation. This is way more complex than a single-line regex match.

    One needs a library that makes it easy to write parsers directly in language X (where X is C, C++, C#, Java) to be able to build custom parsers with the same ease as regular expression matchers.

    Such libraries originated in functional land (Haskell and ML), but nowadays "parser combinator libraries" exist for Java, C++, C#, Scala and other mainstream languages.
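As a rough idea of what the parser combinator style looks like, here is a hand-rolled sketch in Python. It does not use any real library; the names `satisfy`, `many1`, `seq`, `sep_by1` and `matches` are made up for this example, and the grammar is a toy (dot-separated alphanumeric words around an `@`), nowhere near RFC 5322.

```python
from typing import Callable, Optional, Tuple

# A parser takes (text, position) and returns (matched_text, new_position),
# or None when it fails.
Parser = Callable[[str, int], Optional[Tuple[str, int]]]


def satisfy(pred: Callable[[str], bool]) -> Parser:
    """Parse one character for which pred is true."""
    def parse(text: str, pos: int):
        if pos < len(text) and pred(text[pos]):
            return text[pos], pos + 1
        return None
    return parse


def char(c: str) -> Parser:
    """Parse exactly the character c."""
    return satisfy(lambda ch: ch == c)


def seq(*parsers: Parser) -> Parser:
    """Run parsers one after another; fail if any of them fails."""
    def parse(text: str, pos: int):
        parts = []
        for p in parsers:
            result = p(text, pos)
            if result is None:
                return None
            value, pos = result
            parts.append(value)
        return "".join(parts), pos
    return parse


def many1(p: Parser) -> Parser:
    """Apply p one or more times (p must consume input on success)."""
    def parse(text: str, pos: int):
        parts = []
        result = p(text, pos)
        while result is not None:
            value, pos = result
            parts.append(value)
            result = p(text, pos)
        return ("".join(parts), pos) if parts else None
    return parse


def sep_by1(item: Parser, sep: Parser) -> Parser:
    """Parse: item (sep item)*"""
    rest = seq(sep, item)
    def parse(text: str, pos: int):
        result = item(text, pos)
        if result is None:
            return None
        value, pos = result
        parts = [value]
        nxt = rest(text, pos)
        while nxt is not None:
            value, pos = nxt
            parts.append(value)
            nxt = rest(text, pos)
        return "".join(parts), pos
    return parse


# Toy grammar, built by composing the small parsers above.
word = many1(satisfy(str.isalnum))
dotted = sep_by1(word, char("."))
address = seq(dotted, char("@"), dotted)


def matches(text: str) -> bool:
    """True if the toy grammar consumes the whole input."""
    result = address(text, 0)
    return result is not None and result[1] == len(text)


print(matches("user.name@example.com"))
print(matches("user..name@example.com"))
```

The grammar at the bottom reads about as compactly as a regex, but each rule is an ordinary function, so recursive rules (nested comments, for instance) can be added without leaving the language.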

  • 2020-12-16 16:49

    People use regexes for email addresses, HTML, XML, etc. because:

    1. It looks like they should work and they often do work for the obvious cases.
    2. They "know" regular expressions. When all you have is a hammer all your problems look like nails.
    3. Writing a parser is harder (or seems harder) than writing a regular expression. In particular, writing a parser is harder than writing a regex that handles the obvious cases in #1.
    4. They don't understand the full complexity of the task.
    5. They don't understand the limitations of regular expressions.
    6. They start with a regex that handles the obvious cases and then try to extend it to handle others. They get locked into one approach.
    7. They aren't aware that there's (probably) a library available to do the work for them.
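Points 1 and 4 can be seen with a short experiment. The pattern below is an assumption, written for this example as a stand-in for the kind of "obvious cases" regex people tend to start with; it is not taken from any particular library or question.

```python
import re

# A typical "obvious cases" pattern (hypothetical, for illustration only).
SIMPLE_EMAIL = re.compile(r"[A-Za-z0-9._-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def looks_like_email(text: str) -> bool:
    """True if the whole string matches the simple pattern."""
    return SIMPLE_EMAIL.fullmatch(text) is not None


# The obvious case works.
print(looks_like_email("user@example.com"))

# Valid under RFC 5322, but rejected: '+' is allowed in the local part,
# and so is a quoted string containing a space.
print(looks_like_email("user+tag@example.com"))
print(looks_like_email('"john doe"@example.com'))

# Invalid under RFC 5322 (consecutive dots in an unquoted local part),
# but accepted by the pattern.
print(looks_like_email("a..b@example.com"))
```

Patching each of these cases into the pattern is how point 6 happens: the expression grows with every exception while still missing others.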
  • 2020-12-16 16:50

    They do it because they see "I want to test whether this text matches the spec" and immediately think "I know, I'll use a regex!" without fully understanding the complexity of the spec or the limitations of regexes. Regexes are a wonderful, powerful tool for handling a wide variety of text-matching tasks, but they are not the perfect tool for every such task and it seems that many people who use them lose sight of that fact.
