I'm trying to validate the content of a <textarea> using JavaScript, so I created a validate() function, which returns
The answers saying to not use regex are perfectly fine, but I like regex so:
^\s*(?:(?:\w+(?:-+\w+)*\.)+[a-z]+)\s*(?:,\s*(?:(?:\w+(?:-+\w+)*\.)+[a-z]+)\s*)*$
Yeah, it's not so pretty. But it works: tested on your sample cases at http://regex101.com
Edit: OK, let's break it down. And only allow sub-domain-01.com and a--b.com, and not -.com
Each subdomain thingo: \w+(?:-+\w+)*
matches a string of word characters, optionally followed by more strings of word characters, each preceded by one or more dashes.
Each hostname: \s*(?:(?:\w+(?:-+\w+)*\.)+[a-z]+)\s*
a bunch of subdomain thingos, each followed by a dot, then finally a string of letters only (the TLD). And of course the optional spaces around the sides.
Whole thing: ^\s*(?:(?:\w+(?:-+\w+)*\.)+[a-z]+)\s*(?:,\s*(?:(?:\w+(?:-+\w+)*\.)+[a-z]+)\s*)*$
a single hostname, followed by 0 or more ",hostname"s for our comma-separated list.
Pretty simple really.
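To make the breakdown easier to follow, here is a minimal sketch that assembles the same pattern from its pieces. The variable names (label, hostname, list) and the isHostnameList helper are only illustrative and are not part of the answer above. Note that \w also accepts digits and underscores, and [a-z]+ is case-sensitive, so an upper-case TLD would be rejected unless the i flag were added.

```javascript
// One dot-separated piece, e.g. "sub-domain-01" or "a--b".
var label = "\\w+(?:-+\\w+)*";
// One or more pieces, each followed by a dot, then a letters-only TLD.
var hostname = "(?:" + label + "\\.)+[a-z]+";
// A hostname, then zero or more ",hostname" items, with optional spaces around each.
var list = "^\\s*" + hostname + "\\s*(?:,\\s*" + hostname + "\\s*)*$";

var hostnameList = new RegExp(list);

// Hypothetical helper: true when the whole string is a valid comma-separated list.
function isHostnameList(text) {
    return hostnameList.test(text);
}

console.log(isHostnameList("sub-domain-01.com, a--b.com")); // true
console.log(isHostnameList("-.com"));                        // false
console.log(isHostnameList("example.com, example"));         // false
```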
function validate() {
    // Get the user input
    var hostnames = document.getElementById('yourtextarea').value;
    // Regex to validate a single hostname
    var re = /^([a-zA-Z0-9]([a-zA-Z0-9\-]{0,61}[a-zA-Z0-9])?\.)+[a-zA-Z]{2,6}$/;
    // Trim whitespace
    hostnames = hostnames.trim();
    // Explode into an array
    hostnames = hostnames.split(",");
    // Loop through the array and test each hostname with the regex
    var is_valid = true;
    for (var i = 0; i < hostnames.length; i++) {
        var hostname = hostnames[i].trim();
        if (re.test(hostname)) {
            is_valid = true; // if valid, continue loop
        } else {
            is_valid = false; // if invalid, break loop and return false
            break;
        }
    } // end for loop
    return is_valid;
} // end function validate()
Matches every example you indicated except "dom-ain.it, domain.com, domain.eu.org.something" because "something" is not valid.
JSFiddle: http://jsfiddle.net/nesutqjf/2/
I would not use regexps for this, because you have a lot of different rules you want to check. Regexps are good when you only have a couple of rules that are very simple to express but a pain to write out as "parsing code".
I'd simply do hostnames.split(',').forEach(validateHostname), as most of the comments suggest, and inside validateHostname reject any hostname that has spaces in the middle, two adjacent dots, no dots, ends in a dot, has non-ASCII characters, has digits in the last dot-separated token, and so on and so forth.
A function like this will be much easier to add new rules to than a regexp would be.
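As a rough sketch of that approach, the rules listed above could be written one per line, so adding a new rule is a one-line change. The helper names hostnameProblem and validateHostnames are hypothetical, and every() is used here instead of forEach() only so that the overall result can be returned; forEach() discards the callback's return value.

```javascript
// Returns a short reason when the hostname breaks a rule, or null when it passes.
// Hypothetical helper; the rules mirror the list in the paragraph above.
function hostnameProblem(hostname) {
    if (/\s/.test(hostname)) return "contains whitespace";
    if (/[^\x00-\x7F]/.test(hostname)) return "contains non-ASCII characters";
    if (hostname.indexOf(".") === -1) return "has no dots";
    if (hostname.indexOf("..") !== -1) return "has two adjacent dots";
    if (hostname.charAt(hostname.length - 1) === ".") return "ends in a dot";
    var lastToken = hostname.slice(hostname.lastIndexOf(".") + 1);
    if (/\d/.test(lastToken)) return "has digits in the last dot-separated token";
    return null;
}

// Surrounding whitespace is trimmed first, so only spaces in the middle count.
function validateHostname(hostname) {
    return hostnameProblem(hostname.trim()) === null;
}

// Hypothetical wrapper: true only when every comma-separated item passes.
function validateHostnames(text) {
    return text.split(",").every(validateHostname);
}

console.log(validateHostnames("www.example.com, example.ca"));   // true
console.log(validateHostnames("example.com, www.exam ple.com")); // false
console.log(hostnameProblem("example"));                         // "has no dots"
```

Returning a reason string rather than a bare boolean also makes it straightforward to show the user which rule failed.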
An example with validator.js, which has well-tested routines for checking a valid FQDN. Alternatively, look through the source and grab what you need.
function validate (e) {
    // Accept either an event or the element itself
    var target = e.target || e;
    // some() stops at the first invalid item
    target.value.split(',').some(function (item) {
        var notValid = !validator.isFQDN(item.trim());
        if (notValid) {
            target.classList.add('bad');
        } else {
            target.classList.remove('bad');
        }
        return notValid;
    });
}
var domains = document.getElementById('domains');
domains.addEventListener('change', validate);
validate(domains);
#domains {
    width: 300px;
    height: 100px;
}
.bad {
    background-color: red;
}
<script src="http://rawgit.com/chriso/validator.js/master/validator.js"></script>
<textarea id="domains">www.example.com, example.com, example.ca, example, example.com example.nl www.example, www.exam ple.com</textarea>
I have been using this pattern for a while, and it seems to work for your case, too:
/^[a-zA-Z0-9][a-zA-Z0-9\-_]*\.([a-zA-Z0-9]+|[a-zA-Z0-9\-_]+\.[a-zA-Z]+)+$/gi
The logic is simple:
^[a-zA-Z0-9]: The URL must start with an alphanumeric character.
[a-zA-Z0-9\-_]*: The first alphanumeric character can be followed by zero or more of: an alphanumeric character, an underscore or a dash.
\.: The first piece must be followed by a period.
[a-zA-Z0-9]+: One or more alphanumeric characters, OR
[a-zA-Z0-9\-_]+\.[a-zA-Z]+: One or more alphanumeric characters, underscores or dashes, followed by a period and one or more letters.
You can check this pattern working for most of your URLs in the following code snippet. How I do it is similar to the strategy described by others:
Split the input on the , character.
Use $.trim() to remove flanking whitespace.
Optional, done for visual output:
$(function() {
    $('textarea').keyup(function() {
        var urls = $(this).val().split(',');
        $('ul').empty();
        $.each(urls, function(i,v) {
            // Trim URL
            var url = $.trim(v);
            // RegEx
            var pat = /^[a-zA-Z0-9][a-zA-Z0-9\-_]*\.([a-zA-Z0-9]+|[a-zA-Z0-9\-_]+\.[a-zA-Z]+)+$/gi,
                test = pat.test(url);
            // Append
            $('ul').append('<li>'+url+' <span>'+test+'</span></li>');
        });
    });
});
textarea {
    width: 100%;
    height: 100px;
}
ul span {
    background-color: #eee;
    display: inline-block;
    margin-left: .25em;
    padding: 0 .25em;
    text-transform: uppercase;
}
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<textarea placeholder="Paste URLs here"></textarea>
<ul></ul>
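One detail worth knowing if you adapt the snippet: it creates the regex literal inside the callback, so every URL is tested against a fresh object. If the pattern were hoisted out of the loop and reused, the g flag would make test() carry lastIndex over from one call to the next, which can produce alternating results. The sketch below is a DOM-free variant that drops the g flag for that reason; checkUrls is a hypothetical helper name, not part of the answer.

```javascript
// Same pattern as above without the g flag, so it is safe to reuse across calls.
var pat = /^[a-zA-Z0-9][a-zA-Z0-9\-_]*\.([a-zA-Z0-9]+|[a-zA-Z0-9\-_]+\.[a-zA-Z]+)+$/i;

// Hypothetical helper: split on commas, trim each piece, and test it.
function checkUrls(text) {
    return text.split(",").map(function (piece) {
        var url = piece.trim();
        return { url: url, valid: pat.test(url) };
    });
}

var results = checkUrls("www.example.com, example.com, example, www.exam ple.com");
results.forEach(function (r) {
    console.log(r.url + " " + r.valid);
});

// Why the g flag matters on a reused regex: lastIndex persists between test() calls.
var shared = /^[a-z]+\.[a-z]+$/gi;
var first = shared.test("example.com");  // true, lastIndex now points past the match
var second = shared.test("example.com"); // false, the search resumed from lastIndex
console.log(first, second);
```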
While @dandavis's answer/comment is impressive, let's break it down into steps.
trim() leading and ending spaces.
Replace using /\s+/g, meaning find every white space occurring one or more times, and collapse each run into a single space.
Split on ,<space> or <space>,<space>. Split returns an array.
var domains = document.querySelector("textarea").value;
domains = domains.trim().replace(/\s+/g, " ").split(/\s?,\s/);
var domainsTested = domains.filter(function(element){
    if (element.match(/^[a-zA-Z0-9][a-zA-Z0-9-_]{0,61}[a-zA-Z0-9]{0,1}\.([a-zA-Z]{1,6}|[a-zA-Z0-9-]{1,30}\.[a-zA-Z]{2,3})$/))
    {
        return element;
    }
});
document.write(domainsTested.join(" | ")); // this is just here to show the results.
document.write("<br />Domainstring is ok: " + (domainsTested.length == domains.length)); // If it's valid then this should be equal.
<textarea style="width: 300px; height: 100px">www.example.com , example.com, example.ca, example, example.com example.nl www.example, www.exam ple.com, sub.sub.sub.domain.tv, do main.it, sub.domain.tv</textarea>