What characters are allowed in an email address?

天命终不由人 2020-11-22 00:29

I'm not asking about full email validation.

I just want to know which characters are allowed in the user-name and server parts of an email address.

17 answers
  • 2020-11-22 01:11

    A good read on the matter.

    Excerpt:

    These are all valid email addresses!
    
    "Abc\@def"@example.com
    "Fred Bloggs"@example.com
    "Joe\\Blow"@example.com
    "Abc@def"@example.com
    customer/department=shipping@example.com
    $A12345@example.com
    !def!xyz%abc@example.com
    _somename@example.com
    
  • 2020-11-22 01:17

    Google does an interesting thing with gmail.com addresses: they allow only letters (a-z), numbers, and periods (which are ignored).

    e.g., pikachu@gmail.com is the same as pi.kachu@gmail.com, and both email addresses will be sent to the same mailbox. PIKACHU@gmail.com is also delivered to the same mailbox.

    So, to answer the question: it sometimes depends on how much of the RFC standards the implementer chooses to follow. Google's gmail.com address style is compatible with the standards. They do it that way to avoid confusion where different people would take similar email addresses, e.g.

    *** gmail.com accepting rules ***
    d.oy.smith@gmail.com   (accepted)
    d_oy_smith@gmail.com   (bounce and account can never be created)
    doysmith@gmail.com     (accepted)
    D.Oy'Smith@gmail.com   (bounce and account can never be created)
    

    The Wikipedia article is a good reference on what email addresses generally allow: http://en.wikipedia.org/wiki/Email_address
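
The gmail.com behaviour described in this answer (periods and letter case are ignored) could be sketched as a small normalization step. This is only an illustration of the answer's claim, not Google's documented algorithm; the function name `canonical_gmail` is made up for the example.

```python
def canonical_gmail(address: str) -> str:
    """Map a gmail.com address to a single canonical mailbox key.

    Assumption taken from the answer above, not from Google documentation:
    periods and letter case in the local part do not matter.
    """
    # Split at the last "@" so the domain is whatever follows it.
    local, sep, domain = address.rpartition("@")
    if not sep or domain.lower() != "gmail.com":
        raise ValueError("not a gmail.com address: %r" % address)
    # Drop periods and fold case, as the answer describes.
    return local.replace(".", "").lower() + "@gmail.com"
```

With this sketch, `pikachu@gmail.com`, `pi.kachu@gmail.com` and `PIKACHU@gmail.com` all map to the same key, which matches the example given above.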

  • 2020-11-22 01:17

    The accepted answer refers to a Wikipedia article when discussing the valid local-part of an email address, but Wikipedia is not an authority on this.

    IETF RFC 3696 is an authority on this matter and should be consulted at section 3, "Restrictions on email addresses", on page 5:

    Contemporary email addresses consist of a "local part" separated from a "domain part" (a fully-qualified domain name) by an at-sign ("@"). The syntax of the domain part corresponds to that in the previous section. The concerns identified in that section about filtering and lists of names apply to the domain names used in an email context as well. The domain name can also be replaced by an IP address in square brackets, but that form is strongly discouraged except for testing and troubleshooting purposes.

    The local part may appear using the quoting conventions described below. The quoted forms are rarely used in practice, but are required for some legitimate purposes. Hence, they should not be rejected in filtering routines but, should instead be passed to the email system for evaluation by the destination host.

    The exact rule is that any ASCII character, including control characters, may appear quoted, or in a quoted string. When quoting is needed, the backslash character is used to quote the following character. For example

      Abc\@def@example.com
    

    is a valid form of an email address. Blank spaces may also appear, as in

      Fred\ Bloggs@example.com
    

    The backslash character may also be used to quote itself, e.g.,

      Joe.\\Blow@example.com
    

    In addition to quoting using the backslash character, conventional double-quote characters may be used to surround strings. For example

      "Abc@def"@example.com
    
      "Fred Bloggs"@example.com
    

    are alternate forms of the first two examples above. These quoted forms are rarely recommended, and are uncommon in practice, but, as discussed above, must be supported by applications that are processing email addresses. In particular, the quoted forms often appear in the context of addresses associated with transitions from other systems and contexts; those transitional requirements do still arise and, since a system that accepts a user-provided email address cannot "know" whether that address is associated with a legacy system, the address forms must be accepted and passed into the email environment.

    Without quotes, local-parts may consist of any combination of
    alphabetic characters, digits, or any of the special characters

      ! # $ % & ' * + - / = ?  ^ _ ` . { | } ~
    

    period (".") may also appear, but may not be used to start or end the local part, nor may two or more consecutive periods appear. Stated differently, any ASCII graphic (printing) character other than the at-sign ("@"), backslash, double quote, comma, or square brackets may appear without quoting. If any of that list of excluded characters are to appear, they must be quoted. Forms such as

      user+mailbox@example.com
    
      customer/department=shipping@example.com
    
      $A12345@example.com
    
      !def!xyz%abc@example.com
    
      _somename@example.com
    

    are valid and are seen fairly regularly, but any of the characters listed above are permitted.
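
The rule quoted above for unquoted local parts can be written down almost literally. The following is a minimal sketch that covers only the unquoted form and only the constraints stated in the excerpt (allowed characters, no leading or trailing period, no consecutive periods); the names `SPECIALS`, `ALLOWED` and `is_valid_unquoted_local_part` are invented for the example.

```python
import string

# Special characters permitted without quoting, as listed in the excerpt above.
SPECIALS = "!#$%&'*+-/=?^_`{|}~"
ALLOWED = set(string.ascii_letters + string.digits + SPECIALS + ".")


def is_valid_unquoted_local_part(local: str) -> bool:
    """Check an unquoted local part against the rules quoted from RFC 3696."""
    if not local:
        return False
    # Every character must be a letter, digit, listed special, or period.
    if any(ch not in ALLOWED for ch in local):
        return False
    # A period may not start or end the local part ...
    if local.startswith(".") or local.endswith("."):
        return False
    # ... and two or more consecutive periods are not allowed.
    return ".." not in local
```

Quoted forms such as `"Fred Bloggs"` are deliberately rejected by this check; handling them would need a separate branch for quoted strings.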

    As others have done, I submit a regex that works for both PHP and JavaScript to validate email addresses:

    /^[a-z0-9!'#$%&*+\/=?^_`{|}~-]+(?:\.[a-z0-9!'#$%&*+\/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-zA-Z]{2,}$/i
    
  • 2020-11-22 01:18

    I created this regex according to RFC guidelines:

    ^[\\w\\.\\!_\\%#\\$\\&\\'=\\?\\*\\+\\-\\/\\^\\`\\{\\|\\}\\~]+@(?:\\w+\\.(?:\\w+\\-?)*)+$
    
  • 2020-11-22 01:19

    See RFC 5322: Internet Message Format and, to a lesser extent, RFC 5321: Simple Mail Transfer Protocol.

    RFC 822 also covers email addresses, but it deals mostly with their structure:

     addr-spec   =  local-part "@" domain        ; global address     
     local-part  =  word *("." word)             ; uninterpreted
                                                 ; case-preserved
    
     domain      =  sub-domain *("." sub-domain)     
     sub-domain  =  domain-ref / domain-literal     
     domain-ref  =  atom                         ; symbolic reference
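
The top-level rule of this grammar, `addr-spec = local-part "@" domain`, suggests splitting an address into its two parts before checking each one separately. A rough sketch follows, assuming the domain itself contains no `@` (true for ordinary domain names; bracketed domain literals are not handled); `split_addr_spec` is a hypothetical helper name.

```python
from typing import Tuple


def split_addr_spec(addr: str) -> Tuple[str, str]:
    """Split an address into (local-part, domain) at the last "@".

    Splitting at the last "@" keeps quoted local parts that contain "@"
    in one piece, under the assumption that the domain has no "@".
    """
    local, sep, domain = addr.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError("expected local-part@domain, got %r" % addr)
    return local, domain
```

This only separates the two parts; it does not validate either of them.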
    

    And as usual, Wikipedia has a decent article on email addresses:

    The local-part of the email address may use any of these ASCII characters:

    • uppercase and lowercase Latin letters A to Z and a to z;
    • digits 0 to 9;
    • special characters !#$%&'*+-/=?^_`{|}~;
    • dot ., provided that it is not the first or last character unless quoted, and provided also that it does not appear consecutively unless quoted (e.g. John..Doe@example.com is not allowed but "John..Doe"@example.com is allowed);
    • space and "(),:;<>@[\] characters are allowed with restrictions (they are only allowed inside a quoted string, as described in the paragraph below, and in addition, a backslash or double-quote must be preceded by a backslash);
    • comments are allowed with parentheses at either end of the local-part; e.g. john.smith(comment)@example.com and (comment)john.smith@example.com are both equivalent to john.smith@example.com.

    In addition to ASCII characters, as of 2012 you can use international characters above U+007F, encoded as UTF-8 as described in the RFC 6532 spec and explained on Wikipedia. Note that as of 2019, these standards are still marked as Proposed, but are being rolled out slowly. The changes in this spec essentially added international characters as valid alphanumeric characters (atext) without affecting the rules on allowed & restricted special characters like !# and @:.

    For validation, see Using a regular expression to validate an email address.

    The domain part is defined as follows:

    The Internet standards (Request for Comments) for protocols mandate that component hostname labels may contain only the ASCII letters a through z (in a case-insensitive manner), the digits 0 through 9, and the hyphen (-). The original specification of hostnames in RFC 952, mandated that labels could not start with a digit or with a hyphen, and must not end with a hyphen. However, a subsequent specification (RFC 1123) permitted hostname labels to start with digits. No other symbols, punctuation characters, or blank spaces are permitted.
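
The hostname label rules quoted above translate into a short check. The sketch below assumes the RFC 1123 relaxation (a label may start with a digit) and covers only the character rules from the quote; length limits are not checked, and `LABEL_RE` and `is_valid_hostname` are names chosen for the example.

```python
import re

# One label: ASCII letters, digits and hyphens, with no hyphen at either end.
# A leading digit is accepted, following the RFC 1123 relaxation quoted above.
LABEL_RE = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?")


def is_valid_hostname(hostname: str) -> bool:
    """Check every dot-separated label against the character rules above."""
    labels = hostname.split(".")
    # An empty label (from "a..b", a trailing dot, or "") fails fullmatch.
    return all(LABEL_RE.fullmatch(label) is not None for label in labels)
```

Under these rules `3com.com` passes, while `-bad.com`, `bad-.com` and `under_score.com` are rejected.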
