The regular definition for recognizing identifiers in the C programming language is given by
letter -> a|b|...|z|A|B|...|Z|_
digit -> 0|1|...|9
identifier -&g
Update: The regex has been updated so that an identifier cannot start with a digit.
To limit the length, the {} quantifier is usually used.
For example, your regex was [a-zA-Z0-9]+, which allows any alphanumeric string of length 1 or more. To cap the length at 31 characters, the regex can be rewritten as:
[a-zA-Z0-9]{1,31}
{1,31} means that alphanumeric strings of length at least 1 and at most 31 are accepted.
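As a quick check of the length bound, here is a minimal sketch using Python's `re` module. The names `BOUNDED` and `within_limit` are made up for illustration, and `fullmatch` is used so that the whole string has to fit the pattern rather than just a prefix of it.

```python
import re

# Same character class as above, limited to between 1 and 31 characters.
BOUNDED = re.compile(r"[a-zA-Z0-9]{1,31}")

def within_limit(text: str) -> bool:
    """Return True if the whole string is 1 to 31 alphanumeric characters."""
    return BOUNDED.fullmatch(text) is not None

print(within_limit("a" * 31))  # True
print(within_limit("a" * 32))  # False: one character too long
print(within_limit(""))        # False: at least one character is required
print(within_limit("9lives"))  # True: a leading digit is still accepted here
```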
However, the regex above still allows an identifier to start with a digit, because its single character class covers three ranges: a-z, A-Z, and 0-9. To require that the identifier start with a letter, followed by letters or digits, the following regex can be used:
[a-zA-Z][a-zA-Z0-9]{0,30}
The first part, [a-zA-Z], forces the identifier to start with a letter and also ensures that it is not empty. The remaining part, [a-zA-Z0-9]{0,30}, ensures that only letters and digits follow and that up to 30 more characters can be added after the first one.
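The letter-first rule can be tried out the same way. This is only a sketch in Python, with `LETTER_FIRST` and `starts_with_letter` as hypothetical names. Note that the bounded quantifier is written with a comma, `{0,30}`, which is the form regex engines read as "between 0 and 30 repetitions".

```python
import re

# One letter first, then 0 to 30 letters or digits (31 characters at most).
LETTER_FIRST = re.compile(r"[a-zA-Z][a-zA-Z0-9]{0,30}")

def starts_with_letter(text: str) -> bool:
    """Return True if text is a letter followed by up to 30 letters or digits."""
    return LETTER_FIRST.fullmatch(text) is not None

for candidate in ["x", "count2", "2count", "", "a" * 31, "a" * 32]:
    print(repr(candidate), starts_with_letter(candidate))
```

Only the first two candidates and the 31-character one should be accepted: `2count` fails on the leading digit, the empty string fails because the first character class is mandatory, and the 32-character string exceeds the limit.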
You can adjust your own regex accordingly.
The regular expression you are looking for is:
[_a-zA-Z][_a-zA-Z0-9]{0,30}
It matches an underscore or a letter followed by X underscores, letters, or digits, where 0 <= X <= 30.
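For anyone who wants to test this pattern against a few inputs, here is a small sketch in Python. The helper name `is_identifier` and the sample strings are assumptions chosen for illustration. `fullmatch` is used on purpose: with `search` or `match`, a 32-character name would still be reported as a hit because its first 31 characters match.

```python
import re

# Underscore or letter first, then up to 30 underscores, letters, or digits.
C_IDENTIFIER = re.compile(r"[_a-zA-Z][_a-zA-Z0-9]{0,30}")

def is_identifier(text: str) -> bool:
    """Return True if the entire string matches the identifier pattern."""
    return C_IDENTIFIER.fullmatch(text) is not None

samples = ["_tmp", "value_1", "1value", "my-var", "_" * 31, "_" * 32]
for s in samples:
    print(f"{s!r}: {is_identifier(s)}")
```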