Why are special characters not allowed in variable names?

问题

Why special character( except underscore) are not allowed in variable name of programming language ? Is there are any reason related to computer architecture or organisation.

回答1:

Most languages have long histories, using ASCII character sets. Those languages tend to have simple identifier descriptions (e.g., starts with A-Z, followed by A-Z,0-9, maybe underscore; COBOL allows "-" as part of a name). When all you had was an 029 keypunch or a teletype, you didn't have many other characters, and most of them got used as operator syntax or punctuation.

On older machines, this did have the advantage that you could encode an identifier as a radix 37 (A-Z,0-9, null) [6 characters in 32 bits] or radix 64 (A-Z,a-z,0-9,underscore and null) numbers [6 characters in 36 bits, a common word size in earlier generations of machines) for small symbol tables. A consequence: many older languages had 6 character limits on identifier sizes (e.g., FORTRAN).

LISP languages have long been much more permissive; names can be anything but characters with special meaning to LISP, e.g., ( ) [ ] ' ` #, and usually there are ways to insert these characters into names using some kind of escape convention. Our PARLANSE language is like LISP; it uses "~" as an escape, so you can write ~(begin+~)end as a single identifier whose actual spelling is "(begin+end)".

More modern languages (Java, C#, Scala, ...., uh, even PARLANSE) grew up in an era of Unicode, and tend to allow most of unicode in identifiers (actually, they tend to allow named Unicode subsets as parts of identifiers). An identifier made of chinese characters is perfectly legal in such languages.

Its kind of a matter of taste in the Western hemisphere: most identifier names still tend to use just letters and digits (sometimes, Western European letters). I don't know what the Japanese and Chinese really use for identifier names when they have Unicode capable character sets; what little Asian code I have seen tends to follow western identifier conventions but the comments tend to use much more of the Unicode character set.

回答2:

Fundamentally it is because they're mostly used as operators or separators, so it would introduce ambiguity.

Is there any reason relate to computer architecture or organization.

No. The computer can't see the variable names. Only the compiler can. But it has to be able to distinguish a variable name from two variable names separated by an operator, and most language designers have adopted the principle that the meaning of a computer program should not be affected by white space.

来源：https://stackoverflow.com/questions/24250231/why-are-special-characters-not-allowed-in-variable-names

标签

programming-languages