I have my Regexp example here: https://regex101.com/r/kE9mZ7/1
For the following string:
key_1: some text, maybe a comma, ending in a semicolon; key_2: text
Here is a regex way to extract those values:
/(\w+):\s*([^;]*)/gi
or (as identifiers should begin with _ or a letter):
/([_a-z]\w*):\s*([^;]*)/gi
Here is a regex demo
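var re = /([_a-z]\w*):\s*([^;]*)/gi;
var str = 'key_1: some text, maybe a comma, ending in a semicolon; key_2: text with no ending semicolon';
var m; // declared explicitly so the loop does not create an implicit global
while ((m = re.exec(str)) !== null) {
  // m[1] is the key, m[2] is the value
  document.body.innerHTML += m[1] + ": " + m[2] + "<br/>";
}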
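If the pairs are needed as data rather than as page output, the same exec loop can fill an object instead. This is only a minimal sketch: parsePairs is a hypothetical helper name, not something from the demo above, and it assumes keys are unique (a repeated key would overwrite the earlier value).

```javascript
// Hypothetical helper (an assumption for illustration): collect key/value pairs into an object.
function parsePairs(str) {
  var re = /([_a-z]\w*):\s*([^;]*)/gi;
  var result = {};
  var m;
  while ((m = re.exec(str)) !== null) {
    result[m[1]] = m[2]; // m[1] is the key, m[2] is the value
  }
  return result;
}

var pairs = parsePairs('key_1: some text, maybe a comma, ending in a semicolon; key_2: text with no ending semicolon');
console.log(pairs.key_1);
console.log(pairs.key_2);
```

Because it does not touch document.body, this version also runs outside a browser, for example in Node.js.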
Pattern details:
- ([_a-z]\w*) - Group 1 matching an identifier that starts with _ or a letter and is followed by 0+ alphanumeric/underscore characters
- : - a colon
- \s* - 0+ whitespace characters
- ([^;]*) - Group 2 matching 0+ characters other than ;

The use of a negated character class eliminates the need for lazy dot matching with a (?:$|;) group after it. NOTE that the * quantifier makes the value optional. If the value is required, use + instead.
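To see what the quantifier choice changes in practice, here is a small sketch. The sample string and the keysFound helper are assumptions made up for this illustration; key_a deliberately has an empty value.

```javascript
// Hypothetical input: key_a has an empty value, key_b does not.
var sample = 'key_a:; key_b: text';

// Helper for this sketch only: list the keys a given pattern finds.
function keysFound(re, str) {
  var keys = [];
  var m;
  while ((m = re.exec(str)) !== null) {
    keys.push(m[1]);
  }
  return keys;
}

var withStar = keysFound(/([_a-z]\w*):\s*([^;]*)/gi, sample); // value may be empty
var withPlus = keysFound(/([_a-z]\w*):\s*([^;]+)/gi, sample); // value must have 1+ characters
console.log(withStar); // ["key_a", "key_b"]
console.log(withPlus); // ["key_b"]
```

One caveat: if there is whitespace between the colon and the semicolon (as in 'key_a: ;'), the + version still matches, because \s* can backtrack and let [^;]+ take the whitespace as the value.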