Clojure Regex: If string is a URL, return string

Submitted by 浪子不回头ぞ on 2019-12-24 04:42:10

Question


Given a string in Clojure, how can I return it if it is a valid URL, and nil otherwise?

 (re-matches #"????" "www.example.com")
 (re-matches #"????" "http://example.com")
 (re-matches #"????" "http://example.org")  ; returns "http://example.org"
 (re-matches #"????" "htasdtp:/something")  ; returns nil

Answer 1:


Validating a URL is not simple, and it is probably too complex to do well with a regexp. Fortunately, the Apache Commons Validator library provides a UrlValidator class.

Since Clojure can use Java libraries, you can use Apache Commons' UrlValidator to validate URLs in your program.

First, add the dependency to your project.clj by putting the following line in your :dependencies vector.

[commons-validator "1.4.1"]

Then you can define a function, valid-url?, which returns a boolean.

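;; The routines package holds the non-deprecated UrlValidator.
(import 'org.apache.commons.validator.routines.UrlValidator)

(defn valid-url?
  "Returns true if url-str is valid under UrlValidator's default rules."
  [url-str]
  (let [validator (UrlValidator.)]
    (.isValid validator url-str)))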

Now you can do what you want with this function. Alternatively, you can modify it to return the URL string itself when its argument is a valid URL.
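A minimal sketch of that modification, assuming the commons-validator dependency is on the classpath. The name url-or-nil is hypothetical, and the sketch uses the UrlValidator class from the routines package with its default settings, which accept only the http, https and ftp schemes.

```clojure
(import 'org.apache.commons.validator.routines.UrlValidator)

;; url-or-nil is a hypothetical name, not part of any library.
(defn url-or-nil
  "Returns url-str when UrlValidator considers it valid, otherwise nil."
  [url-str]
  (when (.isValid (UrlValidator.) url-str)
    url-str))

(url-or-nil "http://example.org")   ; the input string comes back unchanged
(url-or-nil "htasdtp:/something")   ; nil, because the scheme is not accepted
```

Because when returns nil if its test is false, the function gives the string-or-nil behavior the question asks for without an explicit if.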




Answer 2:


Asking how to validate URLs in ClojureScript is essentially asking how to do it in JavaScript, because ClojureScript regular expressions compile to native JavaScript regular expressions.

This is a page with lots of variants on how to validate URLs using Regular Expressions: https://mathiasbynens.be/demo/url-regex

This is Diego Perini's JavaScript solution:

/^(?:(?:https?|ftp):\/\/)(?:\S+(?::\S*)?@)?(?:(?!(?:10|127)(?:\.\d{1,3}){3})(?!(?:169\.254|192\.168)(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:(?:[a-z\u00a1-\uffff0-9]-*)*[a-z\u00a1-\uffff0-9]+)(?:\.(?:[a-z\u00a1-\uffff0-9]-*)*[a-z\u00a1-\uffff0-9]+)*(?:\.(?:[a-z\u00a1-\uffff]{2,}))\.?)(?::\d{2,5})?(?:[/?#]\S*)?$/i

In ClojureScript:

(def url-pattern #"(?i)^(?:(?:https?|ftp)://)(?:\S+(?::\S*)?@)?(?:(?!(?:10|127)(?:\.\d{1,3}){3})(?!(?:169\.254|192\.168)(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:(?:[a-z\u00a1-\uffff0-9]-*)*[a-z\u00a1-\uffff0-9]+)(?:\.(?:[a-z\u00a1-\uffff0-9]-*)*[a-z\u00a1-\uffff0-9]+)*(?:\.(?:[a-z\u00a1-\uffff]{2,}))\.?)(?::\d{2,5})?(?:[/?#]\S*)?$")

(re-matches url-pattern "http://www.google.com")
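To connect this back to the question: re-matches returns the whole matched string when the pattern has no capturing groups, and nil when the string does not match, so it can be wrapped directly. The sketch below is only an illustration. It uses a much shorter and far more permissive pattern as a stand-in (it only requires an http, https or ftp scheme followed by non-whitespace characters), and the names simple-url-pattern and url-or-nil are hypothetical.

```clojure
;; A deliberately permissive stand-in pattern, NOT the full regex above.
;; The group is non-capturing, so re-matches returns a string rather than a vector.
(def simple-url-pattern
  #"(?i)^(?:https?|ftp)://[^\s/$.?#].[^\s]*$")

(defn url-or-nil
  "Returns s when it matches simple-url-pattern, otherwise nil."
  [s]
  (when (string? s)                    ; guard: re-matches throws on nil input
    (re-matches simple-url-pattern s)))

(url-or-nil "www.example.com")      ; nil, no scheme
(url-or-nil "http://example.org")   ; "http://example.org"
(url-or-nil "htasdtp:/something")   ; nil
```

Swapping url-pattern from above in place of simple-url-pattern gives the stricter check; it also contains only non-capturing groups, so the return shape stays the same.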


Source: https://stackoverflow.com/questions/28269117/clojure-regex-if-string-is-a-url-return-string
