How to remove email addresses and links from a string in PHP?

终归单人心 2020-12-05 22:17

How do I remove all email addresses and links from a string and replace them with "[removed]"?

5 Answers
  • 2020-12-05 22:47

    Try this:

    // Each pattern uses < and > as its regex delimiters.
    $patterns = array('<[\w.]+@[\w.]+>', '<\w{3,6}:(?:(?://)|(?:\\\\))[^\s]+>');
    $replacements = array('[email removed]', '[link removed]');
    $newString = preg_replace($patterns, $replacements, $stringToBeMatched);
    

    Note: preg_replace accepts an array of patterns and an array of replacements, so you don't need to call it twice.
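
    As a minimal sketch of that array form in use, the snippet below wraps the two patterns from this answer in a helper. The function name redactContacts and the sample sentence are made up for illustration.

    ```php
    <?php
    // Hypothetical helper name; the patterns are the ones from the answer above.
    function redactContacts(string $text): string
    {
        $patterns = array(
            '<[\w.]+@[\w.]+>',                      // email-like tokens
            '<\w{3,6}:(?:(?://)|(?:\\\\))[^\s]+>',  // scheme:// or scheme:\ links
        );
        $replacements = array('[email removed]', '[link removed]');

        // Patterns are applied in order, each with the replacement at the same index.
        return preg_replace($patterns, $replacements, $text);
    }

    echo redactContacts('Mail bob@example.com or visit http://example.com/page today.');
    // Mail [email removed] or visit [link removed] today.
    ```

    Because [\w.]+ also accepts dots, an address at the very end of a sentence would take the closing full stop with it.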

  • 2020-12-05 22:49

    You can use preg_replace to do it.

    for emails:

    $pattern = "/[^@\s]*@[^@\s]*\.[^@\s]*/";
    $replacement = "[removed]";
    // preg_replace returns the new string; it does not modify $string in place.
    $string = preg_replace($pattern, $replacement, $string);
    

    for urls:

    // The scheme (e.g. http://) is matched as an optional group.
    $pattern = "/(?:[a-zA-Z]+:\/\/)?[A-Za-z0-9\-_]+\.+[A-Za-z0-9\.\/%&=\?\-_]+/i";
    $replacement = "[removed]";
    $string = preg_replace($pattern, $replacement, $string);
    

    Resources

    PHP manual entry: http://php.net/manual/en/function.preg-replace.php

    Credit where credit is due: email regex taken from preg_match manpage, and URL regex taken from: http://www.weberdev.com/get_example-4227.html
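
    A rough sketch of how the two passes might be combined, emails first so the URL pass does not pick up the domain half of an address. The helper name removeEmailsAndLinks and the sample text are assumptions, and the URL pattern here writes the scheme as an optional group.

    ```php
    <?php
    // Hypothetical helper; runs the email pass first, then the URL pass.
    function removeEmailsAndLinks(string $string): string
    {
        $email = "/[^@\s]*@[^@\s]*\.[^@\s]*/";
        $url   = "/(?:[a-zA-Z]+:\/\/)?[A-Za-z0-9\-_]+\.+[A-Za-z0-9\.\/%&=\?\-_]+/i";

        $string = preg_replace($email, "[removed]", $string);
        return preg_replace($url, "[removed]", $string);
    }

    echo removeEmailsAndLinks("Mail me@example.com or see http://example.com/a?b=1 and notes.txt");
    // Mail [removed] or see [removed] and [removed]
    ```

    Note that notes.txt is removed as well: since the scheme is optional, the URL pattern treats any dotted token as a link.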

  • 2020-12-05 22:52

    Many characters are valid in the local part of an email address (see What characters are allowed in an email address?), so these lines would replace most valid email addresses:

    <?php
    $c = 'a-zA-Z0-9_\-'; // characters allowed in the domain part
    $la = preg_quote('!#$%&\'*+-/=?^_`{|}~', "/"); // additional characters allowed in the local part
    $email = "[$c$la][$c$la\.]*[^.]@[$c]+(?:\.[$c]+)+"; // domain may have several dot-separated labels
    $t = preg_replace("/\b($email)\b/", '[removed]', $t);
    // or, to turn each address into a mailto link instead:
    // $t = preg_replace("/\b($email)\b/", '<a href="mailto:\1">\1</a>', $t);
    
    // replace URLs:
    $a = 'A-Za-z0-9\-_';
    $t = preg_replace("/(?:https?|ftp):\/\/[$a]+\.+[$a\.\/%&=\?]+/i", '[removed]', $t);
    

    This covers most valid email addresses. Note that matching all valid email addresses, and only those, is considerably more complex (see How to validate an email address using a regular expression?).
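
    One way to sidestep a hand-written validation regex, sketched here as an alternative rather than as part of the code above, is to match loose candidates and let PHP's built-in filter_var with FILTER_VALIDATE_EMAIL decide which ones to replace. The helper name removeValidEmails and the sample sentence are made up for illustration.

    ```php
    <?php
    // Hypothetical helper: find loose "something@something" candidates, then let
    // PHP's built-in validator decide which ones are really email addresses.
    function removeValidEmails(string $text): string
    {
        return preg_replace_callback(
            '/[^\s@]+@[^\s@]+/',
            function (array $m): string {
                // Set trailing sentence punctuation aside before validating.
                $candidate = rtrim($m[0], '.,;:!?');
                $tail = substr($m[0], strlen($candidate));

                if (filter_var($candidate, FILTER_VALIDATE_EMAIL) !== false) {
                    return '[removed]' . $tail;
                }
                return $m[0]; // not a valid address, leave it untouched
            },
            $text
        );
    }

    echo removeValidEmails('Write to jane.doe+news@example.com, or ping twitter:@handle.');
    // Write to [removed], or ping twitter:@handle.
    ```

    filter_var is itself stricter than the full address grammar, so some unusual but technically valid addresses may be left in place.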

  • 2020-12-05 22:59

    My answer is a variation of Josiah's /[^@\s]*@[^@\s]*\.[^@\s]*/ for emails, which works fine but also matches any punctuation directly after the email address: demo 1

    Adapt the regex as follows /[^@\s]*@[^@\s\.]*\.[^@\s\.,!?]*/ to exclude . , ! and ?: demo 2
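
    The difference between the two patterns can be checked with a short script; the sample sentence is an assumption chosen to put punctuation right after each address.

    ```php
    <?php
    $text = 'Ask alice@example.com, or bob@example.org!';

    // First pattern: the "," and "!" after each address are swallowed by the match.
    $loose = preg_replace('/[^@\s]*@[^@\s]*\.[^@\s]*/', '[removed]', $text);
    echo $loose, "\n"; // Ask [removed] or [removed]

    // Adapted pattern: ".", ",", "!" and "?" end the match, so punctuation survives.
    $tight = preg_replace('/[^@\s]*@[^@\s\.]*\.[^@\s\.,!?]*/', '[removed]', $text);
    echo $tight, "\n"; // Ask [removed], or [removed]!
    ```

    One trade-off of excluding the dot: the adapted pattern stops after the second domain label, so an address such as carol@mail.example.com would come out as "[removed].com".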

  • 2020-12-05 23:04

    The answer I was going to upvote was deleted. It linked to a Linux Journal article Validate an E-Mail Address with PHP, the Right Way that points out what's wrong with almost every email regex anyone proposes.

    The range of valid forms of an email address is much broader than most people think.
