Can someone help me understand why using \d* returns an array containing an empty string, whereas using \d+ returns ["100"] (as expected)? I get why the \d+ works, b
The * quantifier means 0 or more, so it's matching 0 times. You need to use + for 1 or more. Quantifiers are greedy by default, so \d+ will match all of 100:
var str = 'one to 100';
var regex = /\d+/;
console.log(str.match(regex));
// ["100"]
/\d*/ means "match 0 or more digits, starting from the beginning of the string".
When the engine starts at the beginning of your string, it immediately hits a non-digit and can't go any further. Yet this still counts as a successful match, because "0 or more" allows a match of zero digits.
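To make the "try each position, stop at the first success" idea concrete, here is a rough sketch that imitates it with the sticky (y) flag. The helper name firstMatch is made up for illustration, and this is a simplified model of the search, not how any particular engine is implemented.

```javascript
// Simplified model of an unanchored search: try each start index in turn
// and stop at the first index where the pattern matches.
// firstMatch is a hypothetical helper, not a built-in.
function firstMatch(pattern, str) {
  // The sticky flag (y) forces a match to begin exactly at lastIndex.
  const sticky = new RegExp(pattern, 'y');
  for (let start = 0; start <= str.length; start++) {
    sticky.lastIndex = start;
    const m = sticky.exec(str);
    if (m !== null) {
      return { start, text: m[0] };
    }
  }
  return null;
}

const zeroOrMore = firstMatch('\\d*', 'one to 100');
const oneOrMore = firstMatch('\\d+', 'one to 100');
console.log(zeroOrMore); // { start: 0, text: '' }
console.log(oneOrMore);  // { start: 7, text: '100' }
```

With \d* the very first position already succeeds with zero digits, so the search never gets as far as the 100.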
You can either require "1 or more" digits with /\d+/, or tell it to match "0 or more" digits at the end of the string:
/\d*$/
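A quick sketch of both options on the string from the question. The variable names are just for illustration. One thing worth noting: \d*$ still allows zero digits, so on a string that doesn't end in a digit it matches the empty string at the very end instead of failing.

```javascript
const input = 'one to 100';

const plusMatch = input.match(/\d+/);
const endMatch = input.match(/\d*$/);
console.log(plusMatch[0], plusMatch.index); // 100 7
console.log(endMatch[0], endMatch.index);   // 100 7

// Caveat: with no trailing digits, \d*$ succeeds with an empty match
// at the end of the string, while \d+ on a digit-free string gives null.
const noTrailingDigits = '100 to one'.match(/\d*$/);
console.log(JSON.stringify(noTrailingDigits[0]), noTrailingDigits.index); // "" 10

const noDigits = 'one to ten'.match(/\d+/);
console.log(noDigits); // null
```

So if "no digits" should count as "no match", /\d+/ is probably the safer choice of the two.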
In Python, there is the findall() method, which returns every part of the string that your regular expression matched.
re.findall(r'\d*', 'one to 100')
# => ['', '', '', '', '', '', '', '100', '']
.match() in JavaScript (without the g flag) returns only the first match, which would be the first element in the above array.
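For comparison, JavaScript can produce the same list if you add the global (g) flag, in which case match() returns every match instead of just the first. A small sketch, with variable names chosen only for illustration:

```javascript
const text = 'one to 100';

// Without g: only the first match (the empty string at index 0).
const firstOnly = text.match(/\d*/);
console.log(JSON.stringify(firstOnly[0])); // ""

// With g: every match, including the empty ones before and after "100".
const allMatches = text.match(/\d*/g);
console.log(allMatches); // [ '', '', '', '', '', '', '', '100', '' ]

// With + instead of *, the empty matches disappear.
const digitRuns = text.match(/\d+/g);
console.log(digitRuns); // [ '100' ]
```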
Remember that match is looking for the first substring it can find that matches the given regex. * means that there may be zero or more of something, so \d* means you're looking for a substring made up of zero or more digits.
If your input string started with a number, that entire number would be matched.
"5 to 100".match(/\d*/); // ["5"]
"5 to 100".match(/\d+/); // ["5"]
But since the first character is a non-digit, match() figures that the beginning of the string (with no characters) matches the regex.
Since your string doesn't begin with any digits, an empty string is the first substring of your input which matches that regex.
As @StriplingWarrior said below, the empty string is the first match, hence it is being returned. I would like to add that you can tell what the regex is matching by looking at the 'index' field of the result that the match function returns. For example, this is what I get when I run your code in Chrome:
["", index: 0, input: "one to 100"]
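If you want to check this without reading the console dump, something along these lines prints the matched text together with its index. The helper describeFirstMatch is a made-up name for this sketch, not part of any API.

```javascript
// Hypothetical helper: report what the first match was and where it started.
function describeFirstMatch(str, regex) {
  const m = str.match(regex);
  if (m === null) {
    return 'no match';
  }
  return `matched ${JSON.stringify(m[0])} at index ${m.index} (length ${m[0].length})`;
}

console.log(describeFirstMatch('one to 100', /\d*/)); // matched "" at index 0 (length 0)
console.log(describeFirstMatch('one to 100', /\d+/)); // matched "100" at index 7 (length 3)
console.log(describeFirstMatch('5 to 100', /\d*/));   // matched "5" at index 0 (length 1)
```

An index of 0 with a length of 0 is the giveaway that the regex succeeded before it ever reached the digits.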