Words
Table of contents
Overview
A word is a lexical element formed from a sequence of letters or letter-like
characters, such as fn or Foo or Int.
The exact lexical form of words has not yet been settled. However, Carbon will
follow lexical conventions for identifiers based on
Unicode Annex #31. TODO: Update this once
the precise rules are decided; see the
Unicode source files proposal.
Alternatives
We could restrict words to ASCII.
Advantages:
- Reduced implementation complexity.
- Avoids all problems relating to normalization, homoglyphs, text
directionality, and so on.
- We have no intention of using non-ASCII characters in the language syntax or
in any library name.
- Provides assurance that all names in libraries can reliably be typed by all
developers -- we already require that keywords, and thus all ASCII letters,
can be typed.
Disadvantages:
- An overarching goal of the Carbon project is to provide a language that is
inclusive and welcoming. A language that does not permit names in programs
to be expressed in the developer's native language will not meet that goal
for at least some of our developers.