Regular expression to find URLs within a string

被撕碎了的回忆 2020-11-22 14:18

Does anyone know of a regular expression I could use to find URLs within a string? I've found a lot of regular expressions on Google for determining if an entire string is

27 Answers
  • 2020-11-22 14:39

    Short and simple. I have not tested it in JavaScript code yet, but it looks like it will work:

    ((http|ftp|https):\/\/)?(([\w.-]*)\.([\w]*))
    

    Code on regex101.com
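
    A minimal Python sketch of trying this pattern on a sentence, in case it helps to see what it actually picks up. The function name and the sample text are made up for illustration. Under these assumptions the path after the host is not part of the match, and any token containing a dot (such as a version number) is matched too:

```python
import re

# The pattern from this answer, unchanged.
SIMPLE_URL = re.compile(r"((http|ftp|https):\/\/)?(([\w.-]*)\.([\w]*))")


def find_urls_simple(text):
    """Return the full text of every match (hypothetical helper)."""
    # finditer + group(0) gives whole matches; findall would return
    # tuples of the capture groups instead.
    return [m.group(0) for m in SIMPLE_URL.finditer(text)]


sample = "See https://example.com/docs or ftp://files.example.org, version 1.2"
print(find_urls_simple(sample))
# ['https://example.com', 'ftp://files.example.org', '1.2']
```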

  • 2020-11-22 14:39

    The regex provided by @JustinLevene did not have the proper escape sequences on the backslashes. This version is corrected and also adds a condition to match the FTP protocol. It will match all URLs with or without protocols, and with or without "www."

    Code: ^((http|ftp|https):\/\/)?([\w_-]+(?:(?:\.[\w_-]+)+))([\w.,@?^=%&:\/~+#-]*[\w@?^=%&\/~+#-])?

    Example: https://regex101.com/r/uQ9aL4/65
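
    One thing to watch when using this to find URLs inside a longer string: the leading ^ ties the match to the start of the string. A rough Python sketch of the difference, assuming you simply drop the ^ for searching (the names BODY, ANCHORED and UNANCHORED are mine, not from the answer):

```python
import re

# The pattern from this answer without its leading ^.
BODY = (
    r"((http|ftp|https):\/\/)?([\w_-]+(?:(?:\.[\w_-]+)+))"
    r"([\w.,@?^=%&:\/~+#-]*[\w@?^=%&\/~+#-])?"
)

ANCHORED = re.compile("^" + BODY)  # as posted
UNANCHORED = re.compile(BODY)      # assumption: ^ removed for searching

text = "Visit https://example.com/a?b=1 now"

# With ^ the URL in the middle of the sentence is not found.
print(ANCHORED.search(text))
# None

# Without ^ the URL is found wherever it appears.
found = [m.group(0) for m in UNANCHORED.finditer(text)]
print(found)
# ['https://example.com/a?b=1']
```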

  • 2020-11-22 14:40

    This is the simplest one, and it works fine for me.

    %(http|ftp|https|www)(://|\.)[A-Za-z0-9-_\.]*(\.)[a-z]*%
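
    A small Python sketch of the same pattern, assuming the surrounding % characters are pattern delimiters (as used with PHP's preg functions) rather than part of the expression, so they are dropped here. The helper name and sample text are illustrative only:

```python
import re

# Same pattern with the % delimiters removed (assumption: they are delimiters).
URL_PATTERN = re.compile(r"(http|ftp|https|www)(://|\.)[A-Za-z0-9-_\.]*(\.)[a-z]*")


def find_urls(text):
    """Return whole matches; the pattern has capture groups, so avoid findall."""
    return [m.group(0) for m in URL_PATTERN.finditer(text)]


print(find_urls("go to www.example.com or http://sub.example.org/path"))
# ['www.example.com', 'http://sub.example.org']
```

    Note that the match stops at the top-level domain, so paths and query strings are left out.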
    
  • 2020-11-22 14:41

    A probably too simplistic, but working method might be:

    (?:localhost|http|https|ftp|file)://\S+
    

    I tested it in Python. As long as the URL has a space before and after it in the string being parsed, and contains no spaces itself (which I have never seen), it should be fine.

    Here is an online IDE demonstrating it

    However, here are some benefits of using it:

    • It recognises file: and localhost as well as IP addresses
    • It will never match without one of those prefixes
    • It does not mind unusual characters such as # or - (see the URL of this post)
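
    A rough Python sketch of this approach, assuming the scheme names are meant as whole words and are therefore written as a non-capturing group. The second pattern is only there for contrast: square brackets define a character class (any run of those letters), which also accepts schemes that were never listed. Both constant names and the sample text are made up:

```python
import re

# Assumption: schemes are alternatives in a group, followed by non-space text.
URL_GROUP = re.compile(r"(?:localhost|https?|ftp|file)://\S+")

# Contrast: a character class built from the same letters.
URL_CHAR_CLASS = re.compile(r"[localhost|http|https|ftp|file]+://\S+")

text = "open file:///tmp/a.txt then http://127.0.0.1:8000/x or ssh://host"

print(URL_GROUP.findall(text))
# ['file:///tmp/a.txt', 'http://127.0.0.1:8000/x']

print(URL_CHAR_CLASS.findall(text))
# ['file:///tmp/a.txt', 'http://127.0.0.1:8000/x', 'ssh://host']
```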
  • 2020-11-22 14:45

    None of the above answers match Unicode characters in a URL, for example: http://google.com?query=đức+filan+đã+search

    For the solution, this one should work:

    (ftp:\/\/|www\.|https?:\/\/){1}[a-zA-Z0-9\u00a1-\uffff0-]{2,}\.[a-zA-Z0-9\u00a1-\uffff0-]{2,}(\S*)
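
    A short Python sketch of this pattern on text with non-ASCII characters in both the host and the query. It assumes the intended range is \u00a1-\uffff with a backslash before the first u, which Python's re module understands in string patterns. The helper name and sample text are illustrative:

```python
import re

# Assumption: the Unicode range is written \u00a1-\uffff in both classes.
UNICODE_URL = re.compile(
    r"(ftp://|www\.|https?://){1}"
    r"[a-zA-Z0-9\u00a1-\uffff0-]{2,}\.[a-zA-Z0-9\u00a1-\uffff0-]{2,}(\S*)"
)


def find_unicode_urls(text):
    """Return whole matches (hypothetical helper)."""
    return [m.group(0) for m in UNICODE_URL.finditer(text)]


sample = "see https://müller.example/straße and http://google.com?query=đức+filan"
print(find_unicode_urls(sample))
# ['https://müller.example/straße', 'http://google.com?query=đức+filan']
```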
    
  • 2020-11-22 14:45

    IMPROVED

    Detects URLs like these:

    • https://www.example.pl
    • http://www.example.com
    • www.example.pl
    • example.com
    • http://blog.example.com
    • http://www.example.com/product
    • http://www.example.com/products?id=1&page=2
    • http://www.example.com#up
    • http://255.255.255.255
    • 255.255.255.255
    • http://www.site.com:8008

    Regex:

    /^(?:http(s)?:\/\/)?[\w.-]+(?:\.[\w\.-]+)+[\w\-\._~:/?#[\]@!\$&'\(\)\*\+,;=.]+$/gm
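
    Because of the ^ and $ anchors together with the m flag, this pattern matches lines that consist entirely of a URL, not URLs embedded in a sentence. A minimal Python sketch of that behaviour, assuming re.MULTILINE as the counterpart of the m flag (the g flag has no direct equivalent and is covered by finditer). The constant name and sample lines are made up:

```python
import re

# The pattern from this answer without the /.../gm wrapper.
LINE_URL = re.compile(
    r"^(?:http(s)?:\/\/)?[\w.-]+(?:\.[\w\.-]+)+"
    r"[\w\-\._~:/?#[\]@!\$&'\(\)\*\+,;=.]+$",
    re.MULTILINE,
)

lines = "\n".join([
    "https://www.example.pl",
    "not a url",
    "example.com",
    "visit example.com now",
    "http://www.example.com/products?id=1&page=2",
])

matches = [m.group(0) for m in LINE_URL.finditer(lines)]
print(matches)
# ['https://www.example.pl', 'example.com',
#  'http://www.example.com/products?id=1&page=2']
```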
    