Detecting a (naughty or nice) URL or link in a text string
How can I detect (with regular expressions or heuristics) a web site link in a string of text such as a comment? The purpose is to prevent spam. HTML is stripped, so what I need to detect are invitations to copy and paste. The goal is to make posting links uneconomical for a spammer, because most users would not be able to get to the page.

I would like suggestions, references, or discussion on best practices.

Some objectives:

- The low-hanging fruit, like well-formed URLs (http://some-fqdn/some/valid/path.ext)
- URLs without the http:// prefix (i.e. a valid FQDN + valid HTTP path)
- Any other funny business
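To make the three objectives concrete, here is a minimal Python sketch of one possible heuristic. It is a starting point under stated assumptions, not a complete spam filter: the names `KNOWN_TLDS`, `normalize`, and `contains_link` are hypothetical, the TLD set is a tiny illustrative sample (a real filter would presumably load the full IANA list), and the "funny business" handling covers only the common "[dot]" and " dot " spellings.

```python
import re

# Assumption: a tiny illustrative sample, not the real list of TLDs.
KNOWN_TLDS = {"com", "net", "org", "info", "biz", "io", "ru", "cn", "uk", "de"}

# Objective 1: well-formed URLs with an explicit scheme.
SCHEME_URL = re.compile(r"\b(?:https?|ftp)://[^\s<>\"']+", re.IGNORECASE)

# Objective 2: bare host names such as "example.com/path".
# Group 1 captures the last label so it can be checked against KNOWN_TLDS.
BARE_DOMAIN = re.compile(
    r"\b(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+([a-z]{2,24})\b",
    re.IGNORECASE,
)

# Objective 3 (partial): undo two common ways of hiding the dot.
OBFUSCATIONS = [
    # "example [dot] com", "example (dot) com", "example [.] com"
    (re.compile(r"\s*[\[({]\s*(?:dot|\.)\s*[\])}]\s*", re.IGNORECASE), "."),
    # "example dot com"; this can misfire on ordinary prose containing "dot"
    (re.compile(r"\s+dot\s+", re.IGNORECASE), "."),
]


def normalize(text: str) -> str:
    """Rewrite known dot obfuscations back into a literal '.'."""
    for pattern, replacement in OBFUSCATIONS:
        text = pattern.sub(replacement, text)
    return text


def contains_link(text: str) -> bool:
    """Return True if the text appears to contain a URL or a bare domain."""
    cleaned = normalize(text)
    if SCHEME_URL.search(cleaned):
        return True
    # Only count dotted names whose final label is a known TLD, so that
    # things like "file.txt" or "1.2.3" are not flagged.
    return any(
        match.group(1).lower() in KNOWN_TLDS
        for match in BARE_DOMAIN.finditer(cleaned)
    )
```

With this sketch, `contains_link("go to example.com/deals")` and `contains_link("pills-shop [dot] ru")` are flagged, while `contains_link("e.g. version 1.2.3 of file.txt")` is not. Checking the last label against a TLD list is what keeps file names and version numbers from triggering the bare-domain rule. The trade-off is that the list has to be kept current, and e-mail addresses will also be flagged because they contain a domain.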