regex if capture group matches string

Posted by 一笑奈何 on 2019-12-11 00:50:24

Question


I need to build a simple script to hyphenate Romanian words. I've seen several such scripts, but they don't implement the rules correctly.

var words = "arta codru";

Rule: if two consonants sit between two vowels, they are split across the two syllables, unless the pair is in this array, in which case both consonants move to the second syllable:

var exceptions_to_regex2 = ["bl","cl","dl","fl","gl","hl","pl","tl","vl","br","cr","dr","fr","gr","hr","pr","tr","vr"];

Expected result: ar-ta co-dru

The code so far: https://playcode.io/156923?tabs=console&script.js&output

var words = "arta codru";
var exceptions_to_regex2 = ["bl","cl","dl","fl","gl","hl","pl","tl","vl","br","cr","dr","fr","gr","hr","pr","tr","vr"];

var regex2 = /([aeiou])([bcdfghjklmnprstvwxy]{1})(?=[bcdfghjklmnprstvwxy]{1})([aeiou])/gi;

console.log(words.replace(regex2, '$1$2-'));
console.log("desired result: ar-ta co-dru");

Now I would need to do something like this:

if (exceptions_to_regex2.includes($2+$3)){
  words.replace(regex2, '$1-');
}
else {
  words.replace(regex2, '$1$2-');
}

Obviously it doesn't work because I can't just use the capture groups as I would a regular variable. Please help.


Answer 1:


You can encode the exceptions as a pattern that is checked with a lookahead right after a vowel, stopping the match there. Otherwise, consume one consonant when it is followed by another consonant and a vowel. Then replace with a backreference to the whole match, followed by a hyphen:

.replace(/[aeiou](?:(?=[bcdfghptv][lr])|[bcdfghj-nprstvwxy](?=[bcdfghj-nprstvwxy][aeiou]))/g, '$&-')

Add the i modifier after g if you need case-insensitive matching.

See the regex demo.
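
As a quick check, the replacement can be run against the sample words from the question. This is only a minimal sketch: the `hyphenate` function name is an assumption made for illustration and is not part of the original answer, and the `i` flag is included so that uppercase input is handled too.

```javascript
// Hypothetical helper (the name is illustrative): wraps the replacement shown above.
function hyphenate(text) {
  // A vowel, then either
  //   - an exception cluster ahead (lookahead only, so the hyphen lands right after the vowel), or
  //   - one consonant that is itself followed by a consonant and a vowel.
  const pattern = /[aeiou](?:(?=[bcdfghptv][lr])|[bcdfghj-nprstvwxy](?=[bcdfghj-nprstvwxy][aeiou]))/gi;
  // $& stands for the whole match; the hyphen is appended after it.
  return text.replace(pattern, '$&-');
}

console.log(hyphenate('arta codru')); // ar-ta co-dru
```

Note that `replace` returns a new string and leaves the original untouched, so the result has to be assigned or used directly.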

Details

  • [aeiou] - a vowel
  • (?: - start of a non-capturing group:
    • (?=[bcdfghptv][lr]) - a positive lookahead that requires the exception letter clusters to appear immediately to the right of the current position
    • | - or
    • [bcdfghj-nprstvwxy] - a consonant
    • (?=[bcdfghj-nprstvwxy][aeiou]) - a positive lookahead that requires any consonant followed by a vowel immediately to the right
  • ) - end of the non-capturing group.

The $& in the replacement pattern is the placeholder for the whole match value (at regex101, only $0 can be used at the moment, since the site does not support language-specific replacement patterns).
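
If you would rather keep the exceptions in the array from the question, `String.prototype.replace` also accepts a function as its second argument, and the capture groups are passed to that function as ordinary arguments. The following is a sketch of that alternative, not the approach used in the answer above. The function name `hyphenateWithExceptions` and the three-group pattern are assumptions made for illustration.

```javascript
const exceptions = ["bl","cl","dl","fl","gl","hl","pl","tl","vl","br","cr","dr","fr","gr","hr","pr","tr","vr"];

// Hypothetical helper: matches a vowel, two consonants (captured separately),
// and requires a vowel ahead without consuming it.
function hyphenateWithExceptions(text) {
  const pattern = /([aeiou])([bcdfghj-nprstvwxy])([bcdfghj-nprstvwxy])(?=[aeiou])/gi;
  return text.replace(pattern, (match, vowel, c1, c2) =>
    exceptions.includes((c1 + c2).toLowerCase())
      ? `${vowel}-${c1}${c2}`   // exception pair: both consonants move to the next syllable
      : `${vowel}${c1}-${c2}`   // otherwise: split the pair across the two syllables
  );
}

console.log(hyphenateWithExceptions('arta codru')); // ar-ta co-dru
```

Unlike the lookahead version, this sketch also requires a vowel after an exception pair, so the two can behave differently on inputs outside the sample words.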



Source: https://stackoverflow.com/questions/53360766/regex-if-capture-group-matches-string
