Regular expression to remove hostname and port from URL?

别跟我提以往 2020-12-15 08:55

I need to write some JavaScript to strip the hostname:port part from a URL, meaning I want to extract only the path part.

That is, I want to write a function getPath(url) that returns only the path.

6 answers
  • 2020-12-15 09:43

    RFC 3986 (http://www.ietf.org/rfc/rfc3986.txt) says in Appendix B:

    The following line is the regular expression for breaking-down a well-formed URI reference into its components.

      ^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?
       12            3  4          5       6  7        8 9
    

    The numbers in the second line above are only to assist readability; they indicate the reference points for each subexpression (i.e., each paired parenthesis). We refer to the value matched for subexpression <n> as $<n>. For example, matching the above expression to

      http://www.ics.uci.edu/pub/ietf/uri/#Related
    

    results in the following subexpression matches:

      $1 = http:
      $2 = http
      $3 = //www.ics.uci.edu
      $4 = www.ics.uci.edu
      $5 = /pub/ietf/uri/
      $6 = <undefined>
      $7 = <undefined>
      $8 = #Related
      $9 = Related
    

    where <undefined> indicates that the component is not present, as is the case for the query component in the above example. Therefore, we can determine the value of the five components as

      scheme    = $2
      authority = $4
      path      = $5
      query     = $7
      fragment  = $9
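
    Applied to the question, group 5 of that expression is the path. A minimal sketch in JavaScript, assuming the input is a well-formed URI reference; the name getPath comes from the question, and URI_PARTS is a name chosen here for illustration:

```javascript
// RFC 3986, Appendix B expression; slashes are escaped for a JS regex literal.
const URI_PARTS = /^(([^:\/?#]+):)?(\/\/([^\/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?/;

// Returns the path component ($5 in the RFC's numbering).
// Every part of the expression is optional, so exec() matches any string.
function getPath(url) {
  const match = URI_PARTS.exec(url);
  return match[5];
}

console.log(getPath('http://host:8081/path/to/something'));
```

    The query and fragment are excluded from the result; they are available as groups 7 and 9 if needed.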
    
  • 2020-12-15 09:43

    The window.location object has pathname, search and hash properties which contain what you require.

    For this page:

    location.pathname = '/questions/441755/regular-expression-to-remove-hostname-and-port-from-url'  
    location.search = '' //because there is no query string
    location.hash = ''
    

    So you could use:

    var fullpath = location.pathname+location.search+location.hash
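
    Note that location only describes the current page. For an arbitrary URL string, the standard URL constructor exposes the same three properties. This is not part of the answer above, only a sketch that assumes an environment providing URL (modern browsers, Node.js) and an absolute input URL; getFullPath is a hypothetical name:

```javascript
// Assumption: `url` is an absolute URL; new URL() throws a TypeError otherwise.
function getFullPath(url) {
  const parsed = new URL(url);
  // Same three properties as window.location, joined in the same order.
  return parsed.pathname + parsed.search + parsed.hash;
}

console.log(getFullPath('http://host:8081/path/to/something?x=1#top'));
```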
    
  • 2020-12-15 09:47

    Regular expressions are useful, but they are not necessary in this situation. Anchor elements in the DOM expose the same URL properties as the Location object, including pathname.

    So, to read that property for an arbitrary URL, you can create a new anchor element, assign the URL to it, and return its pathname.

    An example that should work in any browser (the check adds the leading slash that some older browsers, such as Internet Explorer, omit from pathname):

    function getPath(url) {
        var a = document.createElement('a');
        a.href = url;
        return a.pathname.charAt(0) === '/' ? a.pathname : '/' + a.pathname;
    }
    

    jQuery version: (uses regex to add leading slash if needed)

    function getPath(url) {
        // Capture the first character so it is kept when the slash is prepended.
        return $('<a/>').attr('href',url)[0].pathname.replace(/^([^\/])/,'/$1');
    }
    
  • 2020-12-15 09:48

    This regular expression seems to work: (http://[^/]*)(/.*)

    As a test I ran this search and replace in a text editor:

     Search: (http://[^/]*)(/.*)
    Replace: Part #1: \1\nPart #2: \2  
    

    It converted this text:

    http://host:8081/path/to/something
    

    into this:

    Part #1: http://host:8081
    Part #2: /path/to/something
    

    and converted this:

    http://stackoverflow.com/questions/441755/regular-expression-to-remove-hostname-and-port-from-url
    

    into this:

    Part #1: http://stackoverflow.com
    Part #2: /questions/441755/regular-expression-to-remove-hostname-and-port-from-url
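
    The same search can be sketched in JavaScript. This assumes the pattern is used unchanged, so only http:// URLs match; splitUrl and HOST_AND_PATH are names chosen here for illustration:

```javascript
// Group 1: scheme, host and port. Group 2: everything from the first "/" after them.
const HOST_AND_PATH = /(http:\/\/[^\/]*)(\/.*)/;

// Returns both parts, or null when the URL does not match (e.g. https, or no path).
function splitUrl(url) {
  const match = HOST_AND_PATH.exec(url);
  if (match === null) {
    return null;
  }
  return { hostPart: match[1], path: match[2] };
}

console.log(splitUrl('http://host:8081/path/to/something'));
```

    Changing http to https? in the pattern would also cover https URLs.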
    
  • 2020-12-15 09:49

    It's very simple:

    ^\w+:.*?(:)\d*
    

    The pattern looks for the second occurrence of ":" followed by digits (the port), after the scheme (http or https), and matches everything up to the end of the port.

    It works for the following two cases:

    http://localhost:8080/myapplication

    https://localhost:8080/myapplication

    Hope this helps.
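
    A sketch of how the pattern might be used in JavaScript: replacing the match with an empty string leaves the path. It assumes the URL contains an explicit port; without one the pattern does not match and the URL comes back unchanged. stripHostAndPort is a hypothetical name:

```javascript
// Matches the scheme, the host, the second ":" and the port digits.
const SCHEME_HOST_PORT = /^\w+:.*?(:)\d*/;

// Removes the matched prefix; returns the input unchanged when there is no port.
function stripHostAndPort(url) {
  return url.replace(SCHEME_HOST_PORT, '');
}

console.log(stripHostAndPort('http://localhost:8080/myapplication'));
console.log(stripHostAndPort('https://localhost:8080/myapplication'));
```

    Because the pattern stops at the first ":" after the scheme, a port-less URL with a colon later in its path would be cut at that colon instead.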

  • 2020-12-15 09:55

    Quick 'n' dirty:

    ^[^#]*?://.*?(/.*)$

    Everything after the hostname and port (including the initial /) is captured in the first group.
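
    As a sketch in JavaScript, assuming the URL has a "://" separator and at least one "/" after the host (otherwise there is no match); pathAfterHost is a hypothetical name:

```javascript
// Group 1 captures from the first "/" after "://" to the end of the string.
const AFTER_HOST = /^[^#]*?:\/\/.*?(\/.*)$/;

// Returns the captured group, or null when the pattern does not match.
function pathAfterHost(url) {
  const match = AFTER_HOST.exec(url);
  return match === null ? null : match[1];
}

console.log(pathAfterHost('http://host:8081/path/to/something'));
```

    The captured group includes any query string and fragment that follow the path.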
