var statement

The var statement is noted in the language overview, but is provisional: no
justification has been provided. Variable declarations are fundamental, and it
should be clear to what degree the current syntax is adopted.
It's expected that after the adoption of this proposal, var syntax will still
not be finalized: the proposal is an experiment.
Although constants are naturally related to variables, this proposal does not include any syntax for constants. This is expected to be revisited later.
In this proposal, "variable" is defined as an identifier referring to a mutable value.
Questions have come up about:
All of these are important features. However, in the interest of small proposals, they are out of scope of this proposal.
Variables are standard in many languages. Some forms to consider are:
C++:
int x;
int y = 0;
bool a = true, *b = nullptr, c;
Python:
x = None
y = 0
z: int = 7 # Added by PEP 526.
Swift:
var x = 0
var y: Int = 0
var z: Int
TypeScript:
let y: number = 0;
var x = 0; // Legacy from JavaScript.
Rust:
let mut x = 0;
let mut y: i32 = 0;
let mut z: i32;
Go:
var x = 0
y := 0
var z int
var a, b = true, false
Visual Basic:
Dim x As Integer = 3
Carbon should adopt var <type> <identifier> [ = <value> ]; syntax for variable
statements.

Considerations for this syntax are:

- Placing <type> before <identifier> reflects the typical C++ ordering syntax.
  - This does not carry over C++ forms such as int x[6];. Carbon should be expected to place that as part of the type.
- var introducer keyword: The use of var makes it clearer for readers to skim and see where variables are being declared. It also reduces complexity and potential ambiguity in language parsing.
- One declaration per statement: tuple syntax may eventually allow something like var (Int x, String y) = (0, "foo");, so limiting to one declaration is not fundamentally restrictive. However, by breaking with C++ and requiring the full type to be specified with each identifier, we achieve two important things:
Experiment: The ordering of type and identifier will be researched. For more information, see the alternative.
Example bison syntax for executable semantics is:
statement:
  "var" expression identifier optional_assignment ";"
| /* preexisting statements elided */
;

optional_assignment:
  /* empty */
| "=" expression
;
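The grammar above can be approximated with a minimal Python sketch. This is only an illustration under simplifying assumptions, not the executable semantics implementation: the type is assumed to be a single name rather than an arbitrary expression, the value is taken to be any text up to the semicolon, and the names VarStatement and parse_var are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class VarStatement(NamedTuple):
    """Parsed pieces of one `var <type> <identifier> [= <value>];` statement."""
    type_name: str
    identifier: str
    value: Optional[str]


# Assumption: the type is a single name and the value is any text before ';'.
_VAR_RE = re.compile(
    r"""^\s*var\s+
        (?P<type>[A-Za-z_]\w*)\s+
        (?P<ident>[A-Za-z_]\w*)\s*
        (?:=\s*(?P<value>[^;]+?)\s*)?
        ;\s*$""",
    re.VERBOSE,
)


def parse_var(source: str) -> Optional[VarStatement]:
    """Return the parsed statement, or None if it does not fit the sketch."""
    match = _VAR_RE.match(source)
    if match is None:
        return None
    return VarStatement(match["type"], match["ident"], match["value"])
```

In this sketch, a statement that omits the type, such as var x = 0;, is rejected because two names are required before the optional initializer.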
Carbon needs variables in order to be writable by developers. That functionality needs a syntax.
Relevant goals are:
3. Code that is easy to read, understand, and write:
5. Fast and scalable development:
7. Interoperability with and migration from existing C++ code:
var name may change

The name var could still change. However, it's used with similar meaning in
other languages including Swift, Go, and TypeScript, and so it's reasonable to
expect it will not.
let mut

The idea that var may change includes the possibility that var may become
something like let mut in Rust. However, this is not assumed by this proposal:
let mut
appropriately
focuses on encouraging appropriate usage of features rather than restricting misuse).
let.

Although var (Int x, String y) = (0, "foo"); syntax is mentioned, this
proposal does not propose such a syntax. It is noted primarily to explain that
limiting var to one declaration does not rule out an abbreviated syntax like
this one. That should probably be covered as part of tuples.
Pattern matching syntax in the overview uses syntax
similar to Int: x. As part of removing the
colon between type and identifier from the
provisional var syntax, that syntax should be changed to remove the :.
Details should be resolved as part of the eventual pattern matching proposal,
but if changes are needed to add a separator, the var syntax should be updated
to remain consistent. The precise form of that implementation will be part of
normal Carbon evolution.
For example, replacing fn Sum(Int: a, Int: b) -> Int; with
fn Sum(Int a, Int b) -> Int; and
case (Int: p, (Float: x, Float: _)) if (p < 13) => { with
case (Int p, (Float x, Float _)) if (p < 13) => {.
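The replacements above are mechanical, so they could be scripted. The following is a rough sketch only, not an actual Carbon migration tool: the function name drop_type_colon is hypothetical, and it assumes type names are single capitalized words followed by a colon and whitespace.

```python
import re

# Assumption: a type is one capitalized word, followed by ':' and whitespace,
# and then the start of an identifier (including '_').
_TYPE_COLON_RE = re.compile(r"\b([A-Z]\w*):\s+(?=[A-Za-z_])")


def drop_type_colon(source: str) -> str:
    """Rewrite `Type: name` as `Type name` wherever the pattern appears."""
    return _TYPE_COLON_RE.sub(r"\1 ", source)
```

Applied to the two examples above, this sketch yields the colon-free forms shown. Types that are more complex than a single word would need a real parser rather than a regular expression.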
$ syntax

Variables using Type:$ and similar should drop the :, as in Type$.
Noted alternatives are key differences from C++ variable declaration syntax.
var introducer keyword

The intent of the var statement is to improve readability and parsability, and
it's related to fn for functions. Although code is more succinct without
introducers, the noted benefits are expected to be significant. Most other
modern languages use similar introducers, and so this break from C++ is adopting
a different norm.
var statement introducer

var is used with a similar meaning in several other languages, including
Swift, Go, and JavaScript. let is used by TypeScript. let mut is used by
Rust, with let used for constants (this use of let alone is consistent with
other languages). In general var appears to be a more common choice.
The use of a colon (:) between the type and identifier is intended to reduce
potential parsing ambiguity, and to make reading code easier. As proposed, there
is no colon between the type and identifier.
Using a colon or other separator could make it easier to avoid certain kinds of
ambiguities. For example, suppose we decided to use a postfix * operator to
form pointer types, as in C++. In such a setup, we could have code like the
following:
var T * x = 3;
var T * x = 3 y;
In the first statement, * is a unary operator and so T* is the type and x
is the identifier. However, in the second statement, * is a binary operator
and so T * x = 3 is the type, and y is the identifier; the resulting
compiler errors may be confusing to users. Furthermore, the place of 3 could
be taken by an arbitrarily complex expression; this could cause resolving the
ambiguity between unary and binary * to require unbounded look-ahead,
adversely impacting
code compilation time goals.
Consider instead the code:
var T *: x = 3;
var T * x = 3: y;
The colon makes it unambiguous whether the * in each case is unary or binary
with only one token of look-ahead. More importantly, this syntax immediately
calls the reader's attention to the fact that the second declaration has a
highly unusual type.
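The look-ahead difference can be illustrated with a toy Python model. This is a sketch under stated assumptions, not a real Carbon parser: identifiers and integer literals are treated as operands, two adjacent operands are taken to mark the boundary between the type and the declared identifier, and all function names are hypothetical.

```python
import re


def tokenize(source):
    """Split a statement into identifiers, integer literals, and punctuation."""
    return re.findall(r"[A-Za-z_]\w*|\d+|[^\s\w]", source)


def is_operand(token):
    """Identifiers and integer literals count as operands in this toy model."""
    return token[0].isalnum() or token[0] == "_"


def classify_with_colon(tokens):
    """Classify the first '*' by looking at exactly one following token."""
    star = tokens.index("*")
    kind = "postfix" if tokens[star + 1] == ":" else "binary"
    return kind, 1


def classify_without_colon(tokens):
    """Classify the first '*' by scanning until the answer is known.

    Returns the classification and how many tokens after '*' were inspected.
    """
    star = tokens.index("*")
    for j in range(star + 1, len(tokens)):
        inspected = j - star
        # Two adjacent operands: the type ended late, so '*' was binary.
        if is_operand(tokens[j - 1]) and is_operand(tokens[j]):
            return "binary", inspected
        # Reached the end without such a pair: '*' was a postfix operator.
        if tokens[j] == ";":
            return "postfix", inspected
    raise ValueError("statement is missing its terminating ';'")
```

In this model, both colon-free statements above need four tokens of look-ahead past the *, and the count grows with the length of the expression in place of 3. With the colon, a single token always suffices.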
There are other ways of resolving ambiguities like this. For example, we could
avoid allowing the same operator to have both postfix and infix forms, or we
could distinguish them by the presence or absence of whitespace. However, even
if we avoid formal ambiguity by such means, a separator like : may be useful
for reducing visual ambiguity for human readers.
One of the disadvantages of : is that with var Int: x, ordering is
inconsistent with other languages using :, such as Rust and Swift, which would
say var x: Int.
It may be worth considering other syntax options. A few to consider are:
var(Int) x;
var Int# x;
var Int @x;
var Int -> x;
These aren't part of the proposed syntax mainly because it's not clear any would
gain as much support as :. However, this is an opportunity to make suggestions
and see if there's a good compromise.
The
old draft pattern matching proposal
used : as a separator. In pattern matching, the : may be particularly
important to distinguish between value matching and type name matching. However,
the pattern matching proposal should examine these choices and alternatives
before we reach a conclusion that : is necessary for pattern matching. Per
syntax ambiguity, it is expected that : has some
advantages, but may not turn out to make a compiler difference due to prevailing
constraints on type expression syntax.
This proposal suggests we
update provisional pattern matching syntax
to match the proposed var syntax.
Advantages:
Disadvantages:
- Other languages that use : in variable statements, including Swift, put the type after the identifier.
- :.
- : separator will be adopted.

Right now the proposal is to not have anything between the type and identifier in order to avoid cross-language ambiguity, and to retain syntax that is closer to C++. However, the ultimate decision may hinge on type and identifier ordering, as well as related future evolution.
There are many languages that put the type after the identifier. A common format
used by Swift and Rust is var x: Int.
It's worth considering the sentence-like readings:
- var x Int (or var x: Int) may be read as "declare x as an int" or "make a variable x and give it int storage".
- var Int x may be read as "declare an int called x" or "make a variable with int storage called x".

These readings might be of similar quality, and are presented to offer different perspectives on how to read the possible statement orderings.
Ordering is essentially a question of pairing identifiers and types. This can be cast as asking which question developers consider more important when reading code:

1. What is the type of x?
2. What is the identifier of the Int variable?

We assert the first question is the more important one: developers will see an identifier in later code, and want to know its type. However, how do we determine which order is better for this purpose?
Unfortunately, little research has been done on this. All we're aware of right
now is a study from an unpublished undergraduate project from Germany. The study
was done in Java with 50 students. Its data indicates that it's faster to answer
question 1 if the type comes first, and faster to answer question 2 if the
identifier comes first. We do not want to make decisions based on the study
because it isn't published, studied a small group, and doesn't directly compare
possible var syntaxes; however, it still influences our thoughts.
When considering what to use for now, we can consider the popularity of various languages. The top 10 on several sources (with percentages noted by sources that have them) are:
| TIOBE | Pct | GitHut | Pct | PYPL | Pct | Octoverse |
|---|---|---|---|---|---|---|
| C | 16% | JavaScript | 19% | Python | 30% | JavaScript |
| Java | 11% | Python | 16% | Java | 17% | Python |
| Python | 11% | Java | 11% | JavaScript | 8% | Java |
| C++ | 7% | Go | 8% | C# | 7% | TypeScript |
| C# | 4% | C++ | 7% | C and C++ | 7% | C# |
| Visual Basic | 4% | Ruby | 7% | PHP | 6% | PHP |
| JavaScript | 2% | TypeScript | 7% | R | 4% | C++ |
| PHP | 2% | PHP | 6% | Objective-C | 4% | C |
| SQL | 2% | C# | 4% | Swift | 2% | Shell |
| Assembly | 2% | C | 3% | TypeScript | 2% | Ruby |
Sources:
For these languages:
- C, C++, C#, and Java use <type> <identifier>, with no keyword.
- Python uses <identifier>: <type>, with no keyword. This was added in Python 3.6, and reflects language evolution.
- Go uses var <identifier> <type>, with no colon.
- Swift and TypeScript use var <identifier>: <type>.
- Visual Basic uses Dim <identifier> as <type>.
- SQL uses DECLARE @<identifier> AS <type>, where AS is optional.

First, with Carbon it's been discussed to require use of auto (or a similar
explicit syntax marker) instead of allowing developers to elide the type
entirely from a var statement. In other words, while var Int x = 0; is
valid, and var auto x = 0; is equivalent, there is no form such as
var x = 0; which removes the type entirely.
Most languages that write var x: Int also allow eliding the type when
assigning a value. For example, Go allows x := 0 and Swift allows
var x = 0;. As a result, there is no need for an auto keyword.
This would be more surprising with var Int x syntax because removing the Int
now places the identifier immediately after var, where the type normally is.
This may be subtly confusing to developers. However, if auto is required for
explicitness, the issue is moot.
Retaining auto does not eliminate the need to consider type elision as part of
advantages and disadvantages: if the type is put after the identifier and
var x: auto syntax is used, it now becomes an inconsistency with other
languages. This inconsistency would be a Carbon innovation that may confuse
developers, leading to long-term pressure to remove auto for consistency with
similar languages, and thus a disadvantage.
Advantages:
- : in the type declaration.
- Languages using x: Int syntax include Swift, Rust, Kotlin and TypeScript. Go is similar but does not include :.
- fn, import, and package will all have the identifier immediately after the keyword, with non-identifier content following.
  - Function parameters are expected to look like fn Foo(Int bar), with var only added when storage for a copy is required. Thus, this advantage is primarily about the function identifier Foo, not parameter identifiers.
  - Other constructs, such as alias, are likely flexible and could follow var in concept by putting the resulting name at the end. That is, alias To = From; for consistency with var x: Int; versus alias From as To; similar to var Int x; ordering.

Disadvantages:

- x: Int syntax.
- :.
- x: Int is inconsistent with C++ variable syntax.
- using To = From;, although not typedef From To;.
- x?"
- Many popular languages use int x syntax, including C, Java, C++, and C#. Other notable languages include Groovy and Dart.

We should conduct a larger study on the topic of type and identifier ordering and syntax. Until then, we should adopt C++-like syntax. This meets the migration sub-goal "Familiarity for experienced C++ developers with a gentle learning curve", and allows applying the higher-priority goal "Code that is easy to read, understand, and write" if supporting evidence is found.
Experiment: The ordering of type and identifier will be researched.