Detecting specific words in a textarea submission

Posted by 半城伤御伤魂 on 2019-12-02 20:33:23

Question


I have a new feature on my site, where users can submit any text (I stopped all HTML entries) via a textarea. The main problem I still have though is that they could type "http://somewhere.com" which is something I want to stop. I also want to blacklist specific words. This is what I had before:

if (strpos($entry, "http://" or ".com" or ".net" or "www." or ".org" or ".co.uk" or "https://") !== true) {
            die ('Entries cannot contain links!');

However, that didn't work: it stopped users from submitting any text at all. So my question is simple: how can I do it?


Answer 1:


This is a job for Regular Expressions.

What you need to do is something like this:

// A list of words you don't allow
$disallowedWords = array(
  'these',
  'words',
  'are',
  'not',
  'allowed'
);
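// Search for disallowed words.
// The regex matches whole words only: it matches 'are', but not 'care' or 'stare'.
// \b is a word boundary, so a word at the start or end of the entry, or next to
// punctuation, is still caught; preg_quote() escapes any regex metacharacters.
foreach ($disallowedWords as $word) {
  if (preg_match('/\b' . preg_quote($word, '/') . '\b/i', $entry)) {
    die("The word '$word' is not allowed...");
  }
}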

// This variable should contain a regex that will match URLs
// there are thousands out there, take your pick. I have just
// used an arbitrary one I found with Google
$urlRegex = '(http|https|ftp)\://([a-zA-Z0-9\.\-]+(\:[a-zA-Z0-9\.&%\$\-]+)*@)*((25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9])\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9]|0)\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9]|0)\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[0-9])|localhost|([a-zA-Z0-9\-]+\.)*[a-zA-Z0-9\-]+\.(com|edu|gov|int|mil|net|org|biz|arpa|info|name|pro|aero|coop|museum|[a-zA-Z]{2}))(\:[0-9]+)*(/($|[a-zA-Z0-9\.\,\?\'\\\+&%\$#\=~_\-]+))*';

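// Search for URLs.
// preg_match() needs pattern delimiters; '!' is used because it does not occur in $urlRegex.
if (preg_match('!' . $urlRegex . '!i', $entry)) {
  die("URLs are not allowed...");
}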
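
As a variation on the word check above, the list can be folded into one pattern so the entry is scanned once. This is only a sketch: the function name findDisallowedWord is hypothetical, and it assumes every listed word starts and ends with a letter, digit, or underscore (a \b word boundary does not behave usefully around entries such as '.com').

```php
<?php
// Hypothetical helper: returns the disallowed word that appears earliest
// in $entry (as written in the entry), or null if none is found.
function findDisallowedWord(string $entry, array $disallowedWords): ?string
{
    if ($disallowedWords === []) {
        return null; // an empty alternation would match everywhere
    }

    // Escape each word so regex metacharacters in it are taken literally.
    $escaped = array_map(function ($word) {
        return preg_quote($word, '/');
    }, $disallowedWords);

    // \b anchors the match at word boundaries; the i flag ignores case.
    $pattern = '/\b(' . implode('|', $escaped) . ')\b/i';

    if (preg_match($pattern, $entry, $matches) === 1) {
        return $matches[1];
    }
    return null;
}
```

A caller could then write something like: if (($word = findDisallowedWord($entry, $disallowedWords)) !== null) { die(...); }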



Answer 2:


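You must call strpos more than once. In your version, the or expression is evaluated first; it yields true or false, and that single boolean is what gets passed to strpos. On top of that, strpos returns either an integer position or false, never true, so the comparison !== true always succeeds and every entry is rejected.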

This way it should work:

if (strpos($entry, "http://") !== false || strpos($entry, "https://") !== false || strpos($entry, ".com") !== false)
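
To make the failure of the original condition concrete, here is a small sketch that evaluates its two halves separately. The variable names and the sample text are made up for illustration.

```php
<?php
// The or expression is evaluated before strpos() is ever called.
// Non-empty strings are truthy, so the whole expression is just true.
$needle = ("http://" or ".com" or ".net");
var_dump($needle); // bool(true)

// strpos() returns an integer offset or false, never true.
$entry = 'just some harmless text';
$result = strpos($entry, 'http://');
var_dump($result); // bool(false)

// So "!== true" holds for every possible return value,
// and the die() branch runs even for harmless text.
var_dump($result !== true); // bool(true)
```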



Answer 3:


A simple way to do this is to put all the words not allowed into an array and loop through them to check each one.

$banned = array('http://', '.com', '.net', 'www.', '.org'); // Add more
foreach ($banned as $word):
    if (strpos($entry, $word) !== false) die('Contains banned word');
endforeach;

The problem with this is that if you get carried away and start banning something like 'com', perfectly legitimate words and phrases that happen to contain the letters 'com' will trigger false positives. You could use regular expressions to search for strings that look like URLs, but then people can easily just break the links up, like I did above. There is no effective way to completely stop people from posting links in a comment. If you don't want them there, you'll ultimately have to rely on moderation. Community moderation works very well; look at Stack Overflow, for instance.
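
Both weaknesses are easy to reproduce. The sketch below wraps the loop in a hypothetical containsBanned function (using stripos so the check ignores case) and feeds it two made-up inputs: one that is rejected although it is harmless, and one that advertises a site yet passes.

```php
<?php
// Hypothetical helper: case-insensitive substring check against a banned list.
function containsBanned(string $entry, array $banned): bool
{
    foreach ($banned as $word) {
        if (stripos($entry, $word) !== false) {
            return true;
        }
    }
    return false;
}

// False positive: 'com' is part of the ordinary word 'welcome'.
var_dump(containsBanned('You are welcome here', ['com'])); // bool(true)

// Evasion: a link spelled out in words slips past the substring check.
var_dump(containsBanned('visit example dot com', ['.com', 'http://', 'www.'])); // bool(false)
```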



Source: https://stackoverflow.com/questions/7780631/detecting-specific-words-in-a-textarea-submission
