Does anyone know of a regular expression I could use to find URLs within a string? I've found a lot of regular expressions on Google for determining if an entire string is
Short and simple. I have not tested it in JavaScript code yet, but it looks like it will work:
((http|ftp|https):\/\/)?(([\w.-]*)\.([\w]*))
Code on regex101.com
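As a quick check outside regex101, the pattern can be tried in Python with re.finditer. This is only a minimal sketch: the sample text and the helper name find_candidates are made up for illustration. Because the domain part only asks for some characters, a dot, and optionally more characters, the path is dropped and an ordinary word followed by a period is reported as well.

```python
import re

# Pattern copied from the answer above: an optional scheme, then
# "word-like characters, a dot, then more word characters".
PATTERN = re.compile(r"((http|ftp|https):\/\/)?(([\w.-]*)\.([\w]*))")

def find_candidates(text):
    """Return every substring the pattern matches (hypothetical helper)."""
    return [m.group(0) for m in PATTERN.finditer(text)]

sample = "See https://example.com/path and ftp://files.example.org today."
print(find_candidates(sample))
# ['https://example.com', 'ftp://files.example.org', 'today.']
```

The last entry shows why this pattern is better treated as a first filter than as a final answer.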
The regex provided by @JustinLevene did not have the proper escape sequences on the backslashes. This version is corrected and also matches the FTP protocol. It will match all URLs with or without a protocol, and with or without "www."
Code: ^((http|ftp|https):\/\/)?([\w_-]+(?:(?:\.[\w_-]+)+))([\w.,@?^=%&:\/~+#-]*[\w@?^=%&\/~+#-])?
Example: https://regex101.com/r/uQ9aL4/65
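One thing to watch when using this to find URLs inside a longer string: the pattern starts with ^, so it only matches at the beginning of the input. The sketch below, in Python, compares the pattern as posted with a variant that drops the anchor. Dropping the anchor is my assumption about how it would be adapted, and the sample text is made up.

```python
import re

# Pattern from the answer above, unchanged, including the leading ^ anchor.
ANCHORED = r"^((http|ftp|https):\/\/)?([\w_-]+(?:(?:\.[\w_-]+)+))([\w.,@?^=%&:\/~+#-]*[\w@?^=%&\/~+#-])?"

# Assumption: to search inside a longer string, the ^ anchor is removed.
UNANCHORED = re.compile(ANCHORED[1:])

text = "Docs at https://regex101.com/r/uQ9aL4/65, mirror at www.example.org."

# With the anchor nothing is found, because the text does not start with a URL.
print(re.search(ANCHORED, text))  # None

# Without it, both URLs are found and the trailing "," and "." are left out.
urls = [m.group(0) for m in UNANCHORED.finditer(text)]
print(urls)
# ['https://regex101.com/r/uQ9aL4/65', 'www.example.org']
```

The final character class excludes "." and ",", which is what keeps sentence punctuation out of the match.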
This is the simplest one, and it works fine for me.
%(http|ftp|https|www)(://|\.)[A-Za-z0-9-_\.]*(\.)[a-z]*%
A probably too simplistic, but working method might be:
[localhost|http|https|ftp|file]+://[\w\S(\.|:|/)]+
I tested it in Python. As long as the string being parsed has a space before and after the URL and none inside it (which I have never seen), it should be fine.
Here is an online ide demonstrating it
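For readers without access to the online IDE, here is a small Python sketch of the same idea. The helper name extract and the sample strings are made up. Note that the square brackets form a character class, so the first part matches any run of the listed letters rather than the whole words, and the second part behaves like "any non-space characters".

```python
import re

# Pattern from the answer above. [...] is a character class, so the first
# part matches any run of the listed letters (and "|"), not whole words.
PATTERN = re.compile(r"[localhost|http|https|ftp|file]+://[\w\S(\.|:|/)]+")

def extract(text):
    """Return all matches; the pattern has no groups, so findall is enough."""
    return PATTERN.findall(text)

print(extract("open file:///tmp/a.txt or http://localhost:8080/x#y-z now"))
# ['file:///tmp/a.txt', 'http://localhost:8080/x#y-z']

# A scheme that was never listed also matches, and trailing punctuation is kept:
print(extract("clone ssh://example.com."))
# ['ssh://example.com.']
```

So the space-delimited assumption matters: anything up to the next whitespace becomes part of the URL.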
However, here are some benefits of using it: it recognizes file: and localhost as well as IP addresses, and it copes with characters such as # or - (see the URL of this post).
None of the above answers match Unicode characters in a URL, for example: http://google.com?query=đức+filan+đã+search
For the solution, this one should work:
(ftp:\/\/|www\.|https?:\/\/){1}[a-zA-Z0-9\u00a1-\uffff0-]{2,}\.[a-zA-Z0-9\u00a1-\uffff0-]{2,}(\S*)
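A minimal Python sketch of this pattern follows. It assumes the character range is meant to be \u00a1-\uffff with a backslash on both ends, and the helper name find_urls and the sample sentences are made up for illustration.

```python
import re

# Assumption: the range is meant to be \u00a1-\uffff (non-ASCII characters),
# written here with the backslash on both ends.
UNICODE_URL = re.compile(
    r"(ftp:\/\/|www\.|https?:\/\/){1}"
    r"[a-zA-Z0-9\u00a1-\uffff0-]{2,}\.[a-zA-Z0-9\u00a1-\uffff0-]{2,}(\S*)"
)

def find_urls(text):
    # finditer + group(0) returns whole matches rather than the capture groups.
    return [m.group(0) for m in UNICODE_URL.finditer(text)]

query_sample = "try http://google.com?query=đức+filan+đã+search today"
host_sample = "visit https://münchen.example for details"

print(find_urls(query_sample))  # the whole URL, including the Unicode query
print(find_urls(host_sample))   # ['https://münchen.example']
```

The host part accepts non-ASCII letters through the \u00a1-\uffff range, while the query string is picked up by the trailing \S*. A scheme or "www." prefix is required, so bare domains are not matched.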
IMPROVED
Detects URLs like these:
Regex:
/^(?:http(s)?:\/\/)?[\w.-]+(?:\.[\w\.-]+)+[\w\-\._~:/?#[\]@!\$&'\(\)\*\+,;=.]+$/gm
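The pattern above is written as a JavaScript literal with the g and m flags. A rough Python equivalent is sketched below, using re.MULTILINE in place of the m flag; the sample text is made up. Because of the ^ and $ anchors, it accepts or rejects whole lines rather than picking URLs out of running text.

```python
import re

# Pattern from the answer above with the JavaScript /.../gm delimiters removed;
# re.MULTILINE plays the role of the m flag.
LINE_URL = re.compile(
    r"^(?:http(s)?:\/\/)?[\w.-]+(?:\.[\w\.-]+)+[\w\-\._~:/?#[\]@!\$&'\(\)\*\+,;=.]+$",
    re.MULTILINE,
)

text = "https://example.com/a?b=1\nsee example.com here\nexample.org\nlocalhost"

# Each match is a whole line, because of the ^ and $ anchors.
matches = [m.group(0) for m in LINE_URL.finditer(text)]
print(matches)
# ['https://example.com/a?b=1', 'example.org']
```

The second line is skipped because it contains spaces, and "localhost" is skipped because the pattern requires at least one dot.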