I'm writing a Chrome extension, and I need to split a string that contains only text and img tags, so that every element of the resulting array is either a single letter or an img tag. For example,
You can use exec instead of split to obtain the separated elements:
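var str = 'abc<img src="jkhjhk" />d';
// Match either a whole <img ...> tag or a single letter.
var myRe = /<img[^>]*>|[a-z]/gi;
var match;
var res = [];
// With the g flag, each exec call resumes where the previous match ended.
while ((match = myRe.exec(str)) !== null) {
    res.push(match[0]);
}
console.log(res); // ["a", "b", "c", '<img src="jkhjhk" />', "d"]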
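A shorter variant of the same idea, offered as a sketch rather than as part of the answer above: because the pattern uses the g flag and needs no capture groups, String.prototype.match can return all matches in one call. The helper name splitLettersAndImgs is made up for illustration.

```javascript
// Hypothetical helper: returns every <img ...> tag and every single letter, in order.
function splitLettersAndImgs(str) {
    // With the g flag, match() returns an array of all full matches,
    // or null when nothing matches, so fall back to an empty array.
    return str.match(/<img[^>]*>|[a-z]/gi) || [];
}

console.log(splitLettersAndImgs('abc<img src="jkhjhk" />d'));
```

As with the exec loop, characters that are not letters (digits, spaces, punctuation) are silently skipped, because [a-z] only matches letters.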
The reason you get empty elements is the same reason you get <img...> in your results. When you use capturing parentheses in a split pattern, the result will contain the captures in the places where the delimiters were found. Since you have (<img.*?>|), you match (and capture) an empty string whenever the second alternative is used. Unfortunately, (<img.*?>)| alone doesn't help, because the capture group then doesn't participate in those matches and you'll get undefined instead of empty strings. However, you can easily filter those out:
str.split(/(<img[^>]*>)|/).filter(function(el) { return el !== undefined; });
This will still leave empty strings where the string starts or ends with an <img> tag, as well as between adjacent <img> tags, though. So splitting <img><img> would result in
["", "<img>", "", "<img>", ""]
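To make the intermediate result visible, here is a small sketch that prints the raw split output before and after the undefined filter. The sample string 'ab<img>' and the variable names are chosen only for illustration.

```javascript
var pattern = /(<img[^>]*>)|/;

// Raw split: an undefined entry appears wherever the empty alternative
// matched, because the capture group did not participate in that match.
var raw = 'ab<img>'.split(pattern);
console.log(raw); // ["a", undefined, "b", "<img>", ""]

// Removing only undefined keeps the trailing empty string after the tag.
var filtered = raw.filter(function (el) { return el !== undefined; });
console.log(filtered); // ["a", "b", "<img>", ""]
```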
If you don't want those empty strings either, the filter function becomes even simpler, because undefined and the empty string are both falsy:
str.split(/(<img[^>]*>)|/).filter(function(el) { return el; });
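Wrapped up as a function, the split-and-filter approach might look like the sketch below. The name splitKeepingImgs is hypothetical, and the inputs are the sample strings used earlier.

```javascript
// Hypothetical helper wrapping the split-and-filter approach.
function splitKeepingImgs(str) {
    return str.split(/(<img[^>]*>)|/).filter(function (el) {
        // Drops undefined (empty alternative) and "" (around tags): both are falsy.
        return el;
    });
}

console.log(splitKeepingImgs('abc<img src="jkhjhk" />d'));
// ["a", "b", "c", '<img src="jkhjhk" />', "d"]
console.log(splitKeepingImgs('<img><img>'));
// ["<img>", "<img>"]
```

Unlike a pattern that matches only [a-z], this keeps every character outside the tags, including digits and spaces, since the split happens between all characters rather than selecting letters.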