Question
I am writing a set of RegExps to translate a CSS selector into arrays of ids and classes.
For example, I would like '#foo#bar' to return ['foo', 'bar'].
I have been trying to achieve this with
"#foo#bar".match(/((?:#)[a-zA-Z0-9\-_]*)/g)
but it returns ['#foo', '#bar'], even though I expected the non-capturing group (?:#) to leave the # character out of the result.
Is there a better solution than slicing each one of the returned strings?
Answer 1:
You could use .replace() or .exec() in a loop to build an Array.
With .replace():
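var arr = [];
// replace() is used here only to visit each match; its return value is discarded.
"#foo#bar".replace(/#([a-zA-Z0-9\-_]*)/g, function(s, g1) {
    arr.push(g1); // g1 is the first capture group, without the leading #
});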
With .exec():
var arr = [],
    s = "#foo#bar",
    re = /#([a-zA-Z0-9\-_]*)/g,
    item;
while (item = re.exec(s))
    arr.push(item[1]);
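Since the question asks for both ids and classes, the .exec() loop above could be wrapped in a small helper. This is only a sketch: the name collectGroup, the sample selector, and the use of + instead of * (to skip empty names) are assumptions for illustration, not part of the original answer.

```javascript
// Hypothetical helper: collect capture group 1 of every match of `re` in `s`.
// `re` is assumed to carry the g flag; without it the loop would never end.
function collectGroup(s, re) {
    var out = [], item;
    re.lastIndex = 0; // a global regex keeps its position between exec() calls
    while ((item = re.exec(s)) !== null) {
        out.push(item[1]);
    }
    return out;
}

var selector = "div#foo.bar.baz"; // sample input, assumed for illustration
var ids = collectGroup(selector, /#([a-zA-Z0-9\-_]+)/g);
var classes = collectGroup(selector, /\.([a-zA-Z0-9\-_]+)/g);
console.log(ids);     // ["foo"]
console.log(classes); // ["bar", "baz"]
```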
Answer 2:
It matches #foo and #bar because the outer group (#1) is capturing. The inner group (#2) is not, but that's probably not what you are checking.
If you were not using global matching mode, an immediate fix would be to use /(?:#)([a-zA-Z0-9\-_]*)/ instead.
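To illustrate the difference the g flag makes, here is a minimal sketch using the same pattern both ways (the variable names single and all are just for this example):

```javascript
// Without the g flag, match() returns the full match followed by the capture groups.
const single = "#foo#bar".match(/(?:#)([a-zA-Z0-9\-_]*)/);
console.log(single[0]); // "#foo" (whole match; the non-capturing group still consumes "#")
console.log(single[1]); // "foo"  (capture group 1)

// With the g flag, match() returns only the full matches and drops the groups.
const all = "#foo#bar".match(/(?:#)([a-zA-Z0-9\-_]*)/g);
console.log(all); // ["#foo", "#bar"]
```

In other words, a non-capturing group only avoids creating a numbered group; whatever it matches is still part of the overall match.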
With global matching mode the result cannot be had in just one line, because match() with the g flag returns only the full matches and discards the capture groups. Using regular expressions only (i.e. no string operations) you would need to do it this way:
var re = /(?:#)([a-zA-Z0-9\-_]*)/g;
var matches = [], match;
while (match = re.exec("#foo#bar")) {
    matches.push(match[1]);
}
Answer 3:
I'm not sure if you can do that using match(), but you can do it by using the RegExp's exec() method:
// In a string literal the backslash must be doubled, otherwise '\-' silently becomes '-'.
var pattern = new RegExp('#([a-zA-Z0-9\\-_]+)', 'g');
var matches, ids = [];
while (matches = pattern.exec('#foo#bar')) {
    ids.push( matches[1] ); // -> 'foo' and then 'bar'
}
Answer 4:
Unfortunately, at the time of this answer there was no lookbehind assertion in JavaScript RegExp; otherwise you could do this:
/(?<=#)[a-zA-Z0-9\-_]*/g
Unless it gets added in a newer version of JavaScript, I think post-processing with split is your best bet.
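The answer does not spell out the split approach, so the following is only one possible reading of it. It assumes the selector consists solely of ids, as in the question's example; the variable names are made up for the sketch.

```javascript
// Splitting on "#" leaves an empty first element because the string starts with "#".
var parts = "#foo#bar".split("#");
console.log(parts); // ["", "foo", "bar"]

// Drop the empty pieces to get the bare ids.
var idList = parts.filter(function (p) { return p.length > 0; });
console.log(idList); // ["foo", "bar"]
```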
Answer 5:
You can use a negative lookahead assertion:
"#foo#bar".match(/(?!#)[a-zA-Z0-9\-_]+/g); // ["foo", "bar"]
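One caveat worth checking before relying on this: the character class already excludes #, so the lookahead does not appear to add anything, and the pattern matches any run of name characters whether or not a # precedes it. A quick sketch (variable names and the ".foo#bar" input are assumptions for the example):

```javascript
var withLookahead = "#foo#bar".match(/(?!#)[a-zA-Z0-9\-_]+/g);
var withoutLookahead = "#foo#bar".match(/[a-zA-Z0-9\-_]+/g);
console.log(withLookahead);    // ["foo", "bar"]
console.log(withoutLookahead); // ["foo", "bar"] (same result without the lookahead)

// A class name is picked up as well, because nothing requires a preceding "#".
var mixed = ".foo#bar".match(/(?!#)[a-zA-Z0-9\-_]+/g);
console.log(mixed); // ["foo", "bar"]
```

So this works for strings made up only of ids, but it cannot tell ids from classes in a mixed selector.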
Answer 6:
The lookbehind assertion mentioned some years ago by mVChr was added in ECMAScript 2018. This allows you to do this:
'#foo#bar'.match(/(?<=#)[a-zA-Z0-9\-_]*/g)
(returns ["foo", "bar"])
(A negative lookbehind is also possible: use (?<!#) to assert that the match is not immediately preceded by #. Like any lookaround, it checks the preceding character without consuming or capturing it.)
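Applied to the original goal of separate id and class arrays, the lookbehind approach might look like the sketch below. The sample selector, the variable names, and the switch from * to + (to avoid empty matches) are assumptions; an engine with ES2018 lookbehind support is required.

```javascript
const selector = "div#foo.bar.baz"; // sample input, assumed for illustration

// Positive lookbehind: match name characters only where "#" (or ".") comes right before.
const ids = selector.match(/(?<=#)[a-zA-Z0-9\-_]+/g) || [];
const classes = selector.match(/(?<=\.)[a-zA-Z0-9\-_]+/g) || [];
console.log(ids);     // ["foo"]
console.log(classes); // ["bar", "baz"]

// Negative lookbehind: a match may not START right after "#",
// so the first letter of each id is skipped rather than the id itself.
const notAfterHash = "#foo#bar".match(/(?<!#)[a-z]+/g);
console.log(notAfterHash); // ["oo", "ar"]
```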
Answer 7:
MDN does document that "Capture groups are ignored when using match() with the global /g flag", and recommends using matchAll(). When this answer was written, matchAll() wasn't available on Edge or Safari iOS (it was later standardized in ECMAScript 2020), and you still need to skip the complete match (including the #).
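For reference, a minimal matchAll() sketch, assuming an environment that supports it. Each item yielded by the iterator is a match array, so index 1 is the capture group and index 0 (the complete match, with the #) is simply not used. The name idsViaMatchAll is made up for the example.

```javascript
// matchAll() requires the g flag and returns an iterator of match arrays.
const idsViaMatchAll = Array.from(
    "#foo#bar".matchAll(/#([a-zA-Z0-9\-_]*)/g),
    m => m[1] // keep only capture group 1
);
console.log(idsViaMatchAll); // ["foo", "bar"]
```

Unlike match(), matchAll() yields an empty iterator when nothing matches, so no null check is needed here.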
A simpler solution is to slice off the leading prefix, if you know its length - here, 1 for #.
const results = ('#foo#bar'.match(/#\w+/g) || []).map(s => s.slice(1));
console.log(results);
The || [] part is necessary in case there is no match: match then returns null, and calling null.map would throw a TypeError.
const results = ('nothing matches'.match(/#\w+/g) || []).map(s => s.slice(1));
console.log(results);
Source: https://stackoverflow.com/questions/10864783/javascript-regexp-non-capturing-groups