Question
I have a regex email pattern and would like to strip everything except the pattern-matched characters from a string. In short, I want to sanitize the string.
I'm not a regex guru, so what am I missing in the regex?
<?php
$pattern = "/^([\w\!\#$\%\&\'\*\+\-\/\=\?\^\`{\|\}\~]+\.)*[\w\!\#$\%\&\'\*\+\-\/\=\?\^\`{\|\}\~]+@((((([a-z0-9]{1}[a-z0-9\-]{0,62}[a-z0-9]{1})|[a-z])\.)+[a-z]{2,6})|(\d{1,3}\.){3}\d{1,3}(\:\d{1,5})?)$/i";
$email = 'contact<>@domain.com'; // wrong email
$sanitized_email = preg_replace($pattern, NULL, $email);
echo $sanitized_email; // Should be contact@domain.com
?>
Pattern taken from: http://fightingforalostcause.net/misc/2006/compare-email-regex.php (the very first one...)
Answer 1:
You cannot filter and match at the same time. You'll need to break it up into a character class for stripping invalid characters and a matching regular expression which verifies a valid address.
$email = preg_replace($filter, "", $email);
if (preg_match($verify, $email)) {
    // ok, sanitized
    return $email;
}
For the first case, you want to use a negated character class, /[^allowedchars]/.
For the second part, you use the structure /^...@...$/.
Have a look at PHP's filter extension. For cleansing, it uses this allowed list:
const unsigned char allowed_list[] = LOWALPHA HIALPHA DIGIT "!#$%&'*+-=?^_`{|}~@.[]";
And there is the monster for validation: line 525 in http://gcov.php.net/PHP_5_3/lcov_html/filter/logical_filters.c.gcov.php - but check out http://www.regular-expressions.info/email.html for a more common and shorter variant.
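The two-step approach could be sketched roughly as follows. This is a minimal illustration, not the filter extension's actual code: the character class is assumed to mirror the allowed list quoted above, the `$verify` pattern is a deliberately simple stand-in for a full validation regex, and `sanitize_email()` is a hypothetical helper name.

```php
<?php
// Step 1: negated character class. Every character NOT in the allowed
// list (letters, digits and !#$%&'*+-=?^_`{|}~@.[]) gets stripped.
$filter = '/[^a-zA-Z0-9!#$%&\'*+\-=?^_`{|}~@.\[\]]/';

// Step 2: a simplified /^...@...$/ check (an assumption for this sketch,
// far looser than the pattern in the question).
$verify = '/^[^@\s]+@[^@\s]+\.[a-z]{2,}$/i';

// Hypothetical helper: returns the cleaned address, or null when the
// cleaned string still does not look like an email address.
function sanitize_email(string $email, string $filter, string $verify): ?string
{
    $email = preg_replace($filter, '', $email);
    return preg_match($verify, $email) === 1 ? $email : null;
}

echo sanitize_email('contact<>@domain.com', $filter, $verify), "\n"; // contact@domain.com
```

Keeping the filter and the verification as separate patterns means each one can be swapped out independently, for example replacing `$verify` with the longer pattern from the question.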
Answer 2:
I think PHP's filter_var function can also do this, and in a cleaner way. Have a look at: http://www.php.net/manual/en/function.filter-var.php
Example:
$email = "chris@exam\\ple.com";
$cleanEmail = filter_var($email, FILTER_SANITIZE_EMAIL); // chris@example.com
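Applied to the address from the question, the same idea might look like the sketch below. Sanitizing only removes disallowed characters and does not guarantee a well-formed address, so this example assumes you also want a follow-up check with FILTER_VALIDATE_EMAIL.

```php
<?php
// The malformed address from the question.
$email = 'contact<>@domain.com';

// FILTER_SANITIZE_EMAIL strips every character outside the allowed set.
$clean = filter_var($email, FILTER_SANITIZE_EMAIL);

// FILTER_VALIDATE_EMAIL returns the address when it is valid, false otherwise.
$valid = filter_var($clean, FILTER_VALIDATE_EMAIL);

echo $clean, "\n";                                  // contact@domain.com
echo $valid !== false ? "valid" : "invalid", "\n";  // valid
```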
Source: https://stackoverflow.com/questions/4914062/php-preg-replace-pattern-string-sanitization