Extract URL From a String

南笙 2021-01-26 15:46

I have a URL:

url = "http://timesofindia.feedsportal.com/fy/8at2EuL0ihSIb3s7/story01.htmA"

There are some unwanted characters like A,TRE, at

1 Answer
  • 2021-01-26 16:41

    If your URL always ends with .htm, .aspx or .php, you can solve it with a simple regex:

    url = url[/^(.+\.(htm|aspx|php))(:?.*)$/, 1]
    

    Tests here at Rubular.
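
As a quick check, here is a minimal sketch that applies the regex above to the URL from the question. The helper name `clean_url`, the constant `URL_PATTERN`, and the two `example.com` inputs are illustrative assumptions, not part of the original answer.

```ruby
# Regex from the answer: keep everything up to the last .htm, .aspx or .php.
URL_PATTERN = /^(.+\.(htm|aspx|php))(:?.*)$/

# Hypothetical helper (not from the original answer): returns the cleaned
# URL, or nil when none of the three extensions occurs in the string.
def clean_url(url)
  url[URL_PATTERN, 1]
end

puts clean_url("http://timesofindia.feedsportal.com/fy/8at2EuL0ihSIb3s7/story01.htmA")
puts clean_url("http://example.com/page.aspxTRE")  # made-up input for illustration
p clean_url("http://example.com/no-extension")     # no match, so String#[] returns nil
```

Because `.+` is greedy, the match extends to the last occurrence of one of the extensions, so anything after it is dropped.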

    First I use String#[] to get a substring; it works like slice. Then comes the regex, read from left to right:

    ^                   # Start of line
      (                   # Capture everything wanted enclosed
        .+                  # 1 or more of any character
        \.                  # With a dot after it
        (htm|aspx|php)      # htm or aspx or php
      )                   # Close url asked in question
      (                   # Capture undesirable part
        :?                  # Optional literal colon
        .*                  # 0 or more any character
      )                   # Close undesirable part
    $                   # End of line
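
The breakdown above can also be written directly in code with Ruby's `/x` (extended) flag, which ignores whitespace inside the pattern and allows inline comments. This is only a sketch of the same pattern; the constant name `URL_PATTERN_X` is an assumption made for illustration.

```ruby
# Same pattern as in the answer, written in extended (/x) mode so the
# explanation lives next to each piece of the regex.
URL_PATTERN_X = /
  ^                   # start of line
  (                   # group 1: the part to keep
    .+                #   one or more of any character (greedy)
    \.                #   a literal dot
    (htm|aspx|php)    #   group 2: one of the three extensions
  )
  (                   # group 3: the unwanted tail
    :?                #   an optional literal colon
    .*                #   zero or more of any character
  )
  $                   # end of line
/x

url = "http://timesofindia.feedsportal.com/fy/8at2EuL0ihSIb3s7/story01.htmA"
match = URL_PATTERN_X.match(url)
puts match[1]  # the cleaned URL (group 1)
puts match[3]  # the trailing characters that were cut off (group 3)
```

Using `match` instead of `url[regex, 1]` exposes all capture groups, which helps when checking what exactly was stripped.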
    