Representing identifiers using Regular Expression

鱼传尺愫 2021-02-18 17:33

The regular definition for recognizing identifiers in the C programming language is given by

letter -> a|b|...|z|A|B|...|Z|_
digit -> 0|1|...|9
identifier -> ...

2 Answers
  •  南旧 (original poster)
    2021-02-18 17:55

    Update: the regex now ensures that the identifier does not start with a digit.

    To limit the length, use a bounded quantifier written with braces: {min,max}.
    For example, your regex was [a-zA-Z0-9]+, which accepts any alphanumeric string of length at least 1. To cap the length at 31 characters, rewrite it as:

    [a-zA-Z0-9]{1,31}
    

    {1,31} means the regex accepts alphanumeric strings whose length is at least 1 and at most 31.
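
    As a rough illustration of the length bound, here is a minimal Python sketch. The names BOUNDED and within_length are hypothetical, and using fullmatch is an assumption about how the pattern would be applied: it makes the bound cover the whole input instead of a substring.

```python
import re

# Hypothetical name; the pattern itself is the one quoted in the answer.
BOUNDED = re.compile(r"[a-zA-Z0-9]{1,31}")

def within_length(s: str) -> bool:
    # fullmatch requires the entire string to match, so the {1,31}
    # bound applies to the whole input rather than to a substring.
    return BOUNDED.fullmatch(s) is not None

print(within_length("a" * 31))   # True: exactly at the upper bound
print(within_length("a" * 32))   # False: one character too long
print(within_length(""))         # False: at least one character is required
print(within_length("9lives"))   # True: this pattern still allows a leading digit
```

    With search or match instead of fullmatch, a 32-character string would still be reported as a match on its first 31 characters, so the cap would not really be enforced.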

    However, the regex above still allows the identifier to start with a digit, because its single character class covers three ranges: a-z, A-Z, and 0-9. To require a letter first, followed by letters or digits, use the following regex:

    [a-zA-Z][a-zA-Z0-9]{0,30}

    The first part, [a-zA-Z], forces the identifier to start with a letter and also guarantees that it is not empty. The rest, [a-zA-Z0-9]{0,30}, accepts only letters and digits and allows up to 30 more characters after the first, for a maximum total length of 31.
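
    The letter-first rule can be checked with a short Python sketch, written with the comma form {0,30} for the bounded repetition. The helper is_identifier and the constant names are hypothetical. The second pattern is an assumed extension, not part of the answer: it also admits "_", since the letter definition in the question includes it.

```python
import re

# Pattern from the answer: one letter, then up to 30 letters or digits.
IDENT = re.compile(r"[a-zA-Z][a-zA-Z0-9]{0,30}")

# Assumed variant: also allows "_", as in the question's `letter` definition.
IDENT_UNDERSCORE = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]{0,30}")

def is_identifier(s: str, allow_underscore: bool = False) -> bool:
    # Pick the pattern, then require the whole string to match it.
    pattern = IDENT_UNDERSCORE if allow_underscore else IDENT
    return pattern.fullmatch(s) is not None

# Print each candidate with the verdict of both patterns.
for candidate in ["x", "count1", "1count", "", "_tmp", "a" * 32]:
    print(repr(candidate), is_identifier(candidate), is_identifier(candidate, True))
```

    Keeping the two patterns separate makes it easy to see that the underscore only changes which characters are allowed, not the length limit.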

    You can adapt your own regex accordingly.
