Javascript RegExp non-capturing groups

[亡魂溺海] submitted on 2019-12-10 12:36:56

Question


I am writing a set of RegExps to translate a CSS selector into arrays of ids and classes.

For example, I would like '#foo#bar' to return ['foo', 'bar'].

I have been trying to achieve this with

"#foo#bar".match(/((?:#)[a-zA-Z0-9\-_]*)/g)

but it returns ['#foo', '#bar'], when the non-capturing prefix ?: should ignore the # character.

Is there a better solution than slicing each one of the returned strings?


Answer 1:


You could use .replace() or .exec() in a loop to build an Array.

With .replace():

var arr = [];
"#foo#bar".replace(/#([a-zA-Z0-9\-_]*)/g, function(s, g1) {
    arr.push(g1); // g1 is the first capture group: the name without "#"
});

With .exec():

var arr = [],
    s = "#foo#bar",
    re = /#([a-zA-Z0-9\-_]*)/g,
    item;

while (item = re.exec(s))
    arr.push(item[1]);



Answer 2:


It matches #foo and #bar because the outer group (#1) is capturing. The inner group (#2) is not, but that's probably not what you are checking.

If you were not using global matching mode, an immediate fix would be to use /(?:#)([a-zA-Z0-9\-_]*)/ instead and read the first capture group.
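To make the difference concrete, here is a small sketch of what match() returns with and without the g flag. The variable names are only for illustration.

```javascript
// Without the g flag, match() returns the full match followed by each capture group.
const single = '#foo#bar'.match(/(?:#)([a-zA-Z0-9\-_]*)/);
console.log(single[0]); // '#foo' (the whole match, including '#')
console.log(single[1]); // 'foo'  (capture group 1)

// With the g flag, match() returns only the full matches; capture groups are dropped.
const all = '#foo#bar'.match(/(?:#)([a-zA-Z0-9\-_]*)/g);
console.log(all); // ['#foo', '#bar']
```

Note that the non-global form only finds the first id, which is why the loop is needed for the rest.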

With global matching mode the result cannot be had in just one line because match behaves differently. Using regular expression only (i.e. no string operations) you would need to do it this way:

var re = /(?:#)([a-zA-Z0-9\-_]*)/g;
var matches = [], match;
while (match = re.exec("#foo#bar")) {
    matches.push(match[1]);
}





Answer 3:


I'm not sure if you can do that using match(), but you can do it by using the RegExp's exec() method:

// A regex literal keeps the \- escape; inside a string literal '\-' silently becomes '-'.
var pattern = /#([a-zA-Z0-9\-_]+)/g;
var matches, ids = [];

while (matches = pattern.exec('#foo#bar')) {
    ids.push( matches[1] ); // -> 'foo' and then 'bar'
}



Answer 4:


Unfortunately, JavaScript RegExp had no lookbehind assertion when this answer was written (it was only added in ECMAScript 2018); otherwise you could do this:

/(?<=#)[a-zA-Z0-9\-_]*/g

Where lookbehind is not available, I think post-processing the matches (splitting on the #, or slicing it off) is your best bet.
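Taking "split" literally, a minimal sketch could look like the following. The helper name idsBySplit is made up for this example, and it assumes the input consists only of id selectors such as '#foo#bar'.

```javascript
// Hypothetical helper: split on '#' and drop the empty piece before the first '#'.
function idsBySplit(selector) {
  return selector.split('#').filter(function (part) {
    return part.length > 0;
  });
}

console.log(idsBySplit('#foo#bar')); // ['foo', 'bar']
```

This is only safe for strings made purely of ids: for 'div#foo' it would also return 'div', because everything before the first # is kept as well.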




Answer 5:


You can use a negative lookahead assertion:

"#foo#bar".match(/(?!#)[a-zA-Z0-9\-_]+/g);  // ["foo", "bar"]
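One caveat worth checking before relying on this: the character class already excludes #, so the lookahead does not seem to change the result, and the pattern matches any run of name characters whether or not a # precedes it. A quick sketch (the 'div#foo.bar' input is just an illustrative selector):

```javascript
// Same result with and without the negative lookahead.
const withLookahead = '#foo#bar'.match(/(?!#)[a-zA-Z0-9\-_]+/g);
const withoutLookahead = '#foo#bar'.match(/[a-zA-Z0-9\-_]+/g);
console.log(withLookahead);    // ['foo', 'bar']
console.log(withoutLookahead); // ['foo', 'bar']

// On a mixed selector, tag and class names are returned too, not only ids.
const mixed = 'div#foo.bar'.match(/(?!#)[a-zA-Z0-9\-_]+/g);
console.log(mixed); // ['div', 'foo', 'bar']
```

So this works for strings made up only of ids, but it cannot tell ids apart from classes or tag names.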



Answer 6:


The lookbehind assertion mentioned some years ago by mVChr is added in ECMAScript 2018. This will allow you to do this:

'#foo#bar'.match(/(?<=#)[a-zA-Z0-9\-_]*/g)  // returns ["foo", "bar"]

(A negative lookbehind is also possible: (?<!#) asserts that the current position is not immediately preceded by #. Like every lookaround, it consumes no characters, so nothing it checks ends up in the match.)
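A tiny sketch contrasting the two lookbehinds, assuming an engine with ES2018 lookbehind support. The input 'a#b' is chosen only to keep the example short.

```javascript
// Positive lookbehind: match a letter only if '#' comes right before it.
const afterHash = 'a#b'.match(/(?<=#)[a-z]/g);
console.log(afterHash); // ['b']

// Negative lookbehind: match a letter only if '#' does NOT come right before it.
const notAfterHash = 'a#b'.match(/(?<!#)[a-z]/g);
console.log(notAfterHash); // ['a']
```

In both cases the # itself is never part of the result, because lookbehinds only test the preceding text.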




Answer 7:


MDN does document that "Capture groups are ignored when using match() with the global /g flag", and recommends using matchAll(). When this answer was written, matchAll() was not yet available in Edge or in Safari on iOS, and you would still need to skip the complete match (which includes the #) and read the capture group.
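For engines that do support it, a matchAll() version might look like this sketch. It reuses the character class from the question; the variable names are illustrative.

```javascript
// matchAll() requires the g flag and returns an iterator of match arrays.
// Array.from's mapping callback picks capture group 1 from each match.
const ids = Array.from(
  '#foo#bar'.matchAll(/#([a-zA-Z0-9\-_]+)/g),
  function (m) { return m[1]; }
);
console.log(ids); // ['foo', 'bar']

// With no matches the iterator is simply empty, so no null check is needed.
const none = Array.from(
  'nothing matches'.matchAll(/#([a-zA-Z0-9\-_]+)/g),
  function (m) { return m[1]; }
);
console.log(none); // []
```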

A simpler solution is to slice off the leading prefix, if you know its length - here, 1 for #.

const results = ('#foo#bar'.match(/#\w+/g) || []).map(s => s.slice(1));
console.log(results);

The ... || [] part is necessary in case there is no match: match then returns null, and null.map would throw.

const results = ('nothing matches'.match(/#\w+/g) || []).map(s => s.slice(1));
console.log(results);


Source: https://stackoverflow.com/questions/10864783/javascript-regexp-non-capturing-groups
