After reading various posts I decided not to use REGEX to check if an email is valid and simply use PHP\'s inbuilt filter_var function. It seemed to work ok, until it starte
The regular expression used in the PHP 5.3.3 filter code is based on Michael Rushton's blog about Email Address Validation. It does seem to work for the case you mention.
You could also check out some of the options in Comparing E-mail Address Validating Regular Expressions (the regexp currently used in PHP is one of those tested).
Then you could choose a regexp you like better, and use it in a call to preg_match()
.
Or else you could take the regexp and replace the one in file PHP/ext/filter/logical_filter.c, function php_filter_validate_email()
, and rebuild PHP.
function isValidEmail($email, $checkDNS = false)
{
$valid = (
/* Preference for native version of function */
function_exists('filter_var') and filter_var($email, FILTER_VALIDATE_EMAIL)
) || (
/* The maximum length of an e-mail address is 320 octets, per RFC 2821. */
strlen($email) <= 320
/*
* The regex below is based on a regex by Michael Rushton.
* However, it is not identical. I changed it to only consider routeable
* addresses as valid. Michael's regex considers a@b a valid address
* which conflicts with section 2.3.5 of RFC 5321 which states that:
*
* Only resolvable, fully-qualified domain names (FQDNs) are permitted
* when domain names are used in SMTP. In other words, names that can
* be resolved to MX RRs or address (i.e., A or AAAA) RRs (as discussed
* in Section 5) are permitted, as are CNAME RRs whose targets can be
* resolved, in turn, to MX or address RRs. Local nicknames or
* unqualified names MUST NOT be used.
*
* This regex does not handle comments and folding whitespace. While
* this is technically valid in an email address, these parts aren't
* actually part of the address itself.
*/
and preg_match_all(
'/^(?!(?:(?:\\x22?\\x5C[\\x00-\\x7E]\\x22?)|(?:\\x22?[^\\x5C\\x22]\\x22?))'.
'{255,})(?!(?:(?:\\x22?\\x5C[\\x00-\\x7E]\\x22?)|(?:\\x22?[^\\x5C\\x22]\\x22?))'.
'{65,}@)(?:(?:[\\x21\\x23-\\x27\\x2A\\x2B\\x2D\\x2F-\\x39\\x3D\\x3F\\x5E-\\x7E]+)|'.
'(?:\\x22(?:[\\x01-\\x08\\x0B\\x0C\\x0E-\\x1F\\x21\\x23-\\x5B\\x5D-\\x7F]|(?:\\x5C[\\x00-\\x7F]))*\\x22))'.
'(?:\\.(?:(?:[\\x21\\x23-\\x27\\x2A\\x2B\\x2D\\x2F-\\x39\\x3D\\x3F\\x5E-\\x7E]+)|'.
'(?:\\x22(?:[\\x01-\\x08\\x0B\\x0C\\x0E-\\x1F\\x21\\x23-\\x5B\\x5D-\\x7F]|'.
'(?:\\x5C[\\x00-\\x7F]))*\\x22)))*@(?:(?:(?!.*[^.]{64,})'.
'(?:(?:(?:xn--)?[a-z0-9]+(?:-+[a-z0-9]+)*\\.){1,126})'.'{1,}'.
'(?:(?:[a-z][a-z0-9]*)|(?:(?:xn--)[a-z0-9]+))(?:-+[a-z0-9]+)*)|'.
'(?:\\[(?:(?:IPv6:(?:(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){7})|'.
'(?:(?!(?:.*[a-f0-9][:\\]]){7,})(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,5})?::'.
'(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,5})?)))|'.
'(?:(?:IPv6:(?:(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){5}:)|'.
'(?:(?!(?:.*[a-f0-9]:){5,})'.'(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,3})?::'.
'(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,3}:)?)))?(?:(?:25[0-5])|'.
'(?:2[0-4][0-9])|(?:1[0-9]{2})|(?:[1-9]?[0-9]))(?:\\.(?:(?:25[0-5])|'.
'(?:2[0-4][0-9])|(?:1[0-9]{2})|(?:[1-9]?[0-9]))){3}))\\]))$/iD',
$email)
);
if( $valid )
{
if( $checkDNS && ($domain = end(explode('@',$email, 2))) )
{
/*
Note:
Adding the dot enforces the root.
The dot is sometimes necessary if you are searching for a fully qualified domain
which has the same name as a host on your local domain.
Of course the dot does not alter results that were OK anyway.
*/
return checkdnsrr($domain . '.', 'MX');
}
return true;
}
return false;
}
//-----------------------------------------------------------------
var_dump(isValidEmail('nechtan@tagon8inc.com', true));
// bool(true)
that filter has been revamped recently. http://codepad.org/Lz5m2S2N - appears that in version used by codepad your case is filtered correctly
You can also look at http://bugs.php.net/49576 and http://svn.php.net/viewvc/php/php-src/trunk/ext/filter/logical_filters.c . Regexp is quite scary.
name2@domain.com seems to work fine: http://codepad.org/5HDgMW5i
But I've definitely seen people complaining it's got problems, even on SO. In all likelihood, it does have problems, but so will a regex solution. The email address specifications are very, very complicated (RFC XXXX).
That's why the only solution to verify emails you should rely on is sending an email to the address and demand action (eg: if it's a registration script ask them to click on a verification link).