
var ordering (#618)

Propose the decision from #542, noting implementation from #563

Also integrates some of #339 into `variables.md`, because that is how this started: looking for a proposal reference for #542.

Co-authored-by: Richard Smith <richard@metafoo.co.uk>
Co-authored-by: josh11b <josh11b@users.noreply.github.com>
Jon Meow 4 years ago
parent
commit
199c7e365c

+ 2 - 6
docs/design/README.md

@@ -278,8 +278,7 @@ Some common expressions in Carbon include:
 
 ### Functions
 
-> References: [Functions](functions.md) and
-> [syntactic conventions](syntactic_conventions.md)
+> References: [Functions](functions.md)
 >
 > **TODO:** References need to be evolved.
 
@@ -325,10 +324,7 @@ fn Foo() {
 
 ### Variables
 
-> References: [Variables](variables.md) and
-> [syntactic conventions](syntactic_conventions.md)
->
-> **TODO:** References need to be evolved.
+> References: [Variables](variables.md)
 
 Blocks introduce nested scopes and can contain local variable declarations that
 work similarly to function parameters.

+ 1 - 1
docs/design/control_flow/return.md

@@ -22,7 +22,7 @@ execution to the caller. If the function returns a value to the caller, that
 value is provided by an expression in the return statement. For example:
 
 ```carbon
-fn Sum(Int a, Int b) -> Int {
+fn Sum(a: Int, b: Int) -> Int {
   return a + b;
 }
 ```

+ 1 - 1
docs/design/functions.md

@@ -30,7 +30,7 @@ primarily divided up into "functions" (or "procedures", "subroutines", or
 language. Let's look at a simple example to understand how these work:
 
 ```
-fn Sum(Int a, Int b) -> Int;
+fn Sum(a: Int, b: Int) -> Int;
 ```
 
 This declares a function called `Sum` which accepts two `Int` parameters, the

+ 1 - 1
docs/design/name_lookup.md

@@ -41,7 +41,7 @@ namespace Foo {
   }
 }
 
-fn F(Foo.Bar.MyInt x);
+fn F(x: Foo.Bar.MyInt);
 ```
 
 Carbon packages are also namespaces so to get to an imported name from the

+ 0 - 75
docs/design/syntactic_conventions.md

@@ -1,75 +0,0 @@
-# Syntactic conventions
-
-<!--
-Part of the Carbon Language project, under the Apache License v2.0 with LLVM
-Exceptions. See /LICENSE for license information.
-SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
--->
-
-<!-- toc -->
-
-## Table of contents
-
--   [TODO](#todo)
--   [Overview](#overview)
--   [Alternatives](#alternatives)
-    -   [Types before or after name](#types-before-or-after-name)
-
-<!-- tocstop -->
-
-## TODO
-
-This is a skeletal design, added to support [the overview](README.md). It should
-not be treated as accepted by the core team; rather, it is a placeholder until
-we have more time to examine this detail. Please feel welcome to rewrite and
-update as appropriate.
-
-## Overview
-
-Right now we expect variable syntax like: `Int x`.
-
-There are probably other syntactic conventions that can be added here, too.
-
-## Alternatives
-
-### Types before or after name
-
-While we are currently keeping types first, matching C++, there is significant
-uncertainty around the right approach here. While adding the colon improves the
-grammar by unambiguously marking the transition from type to a declared
-identifier, in essentially every other language with a colon in a similar
-position, the identifier is first and the type follows. However, that ordering
-would be very _inconsistent_ with C++.
-
-One very important consideration here is the fundamental approach to type
-inference. Languages which use the syntax `<identifier>: <type>` typically allow
-completely omitting the colon and the type to signify inference. With C++,
-inference is achieved with a placeholder keyword `auto`, and Carbon is currently
-being consistent there as well with `auto <identifier>`. For languages which
-simply allow omission, this seems an intentional incentive to encourage
-inference. On the other hand, there has been strong advocacy in the C++
-community to not overly rely on inference and to write the explicit type
-whenever convenient. Being consistent with the _ordering_ of identifier and type
-may ultimately be less important than being consistent with the incentives and
-approach to type inference. What should be the default that we teach? Teaching
-to avoid inference unless it specifically helps readability by avoiding a
-confusing or unhelpfully complex type name, and incentivizing that by requiring
-`auto` or another placeholder, may cause as much or more inconsistency with
-languages that use `<identifier>: <type>` as retaining the C++ ordering.
-
-That said, all of this is largely unknown. It will require a significant
-exploration of the trade-offs and consistency differences. It should also factor
-in further development of pattern matching generally and whether that has an
-influence on one or another approach. Last but not least, while this may seem
-like something that people will get used to with time, it may be worthwhile to
-do some user research to understand the likely reaction distribution, strength
-of reaction, and any quantifiable impact these options have on measured
-readability. We have only found one _very_ weak source of research that focused
-on the _order_ question (rather than type inference versus explicit types or
-other questions in this space). That was a very limited PhD student's study of
-Java programmers that seemed to indicate improved latency for recalling the type
-of a given variable name with types on the left (as in C++). However, those
-results are _far_ from conclusive.
-
-**TODO**: Get a useful link to this PhD research (a few of us got a copy from
-the professor directly).

+ 16 - 27
docs/design/variables.md

@@ -10,23 +10,19 @@ SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
 
 ## Table of contents
 
--   [TODO](#todo)
 -   [Overview](#overview)
-    -   [Declaring constants](#declaring-constants)
+-   [Notes](#notes)
 -   [Alternatives](#alternatives)
-    -   [Declaring constants](#declaring-constants-1)
     -   [Global variables](#global-variables)
+-   [Relevant proposals](#relevant-proposals)
 
 <!-- tocstop -->
 
-## TODO
+## Overview
 
-This is a skeletal design, added to support [the overview](README.md). It should
-not be treated as accepted by the core team; rather, it is a placeholder until
-we have more time to examine this detail. Please feel welcome to rewrite and
-update as appropriate.
+Carbon's local variable syntax is:
 
-## Overview
+> `var` _identifier_`:` _type_ _[_ `=` _value_ _]_`;`
 
 Blocks introduce nested scopes and can contain local variable declarations that
 work similarly to function parameters.
@@ -35,7 +31,7 @@ For example:
 
 ```
 fn Foo() {
-  var Int x = 42;
+  var x: Int = 42;
 }
 ```
 
@@ -46,27 +42,12 @@ yet, but this gives you the basic idea.
 
 While there can be global constants, there are no global variables.
 
-### Declaring constants
+## Notes
 
-Constants will use template-like syntax for declarations. For example, a simple
-integer constant looks like:
-
-```carbon
-var Int$$ MyVal = 42;
-```
+> TODO: Constant syntax is an ongoing discussion.
 
 ## Alternatives
 
-### Declaring constants
-
-There is other syntax that could be used for declaring constants. There are
-serious problems with the use of `const` in C++ as part of the type system.
-Another alternative is `let` from Swift, although there are some questions
-around how intuitive it is for this to introduce a constant. Another candidate
-is `val` from Kotlin. Another thing we need to contend with is the surprise of
-const and reference (semantic) types. At present we are leaning towards the
-tempalte-like syntax for consistency within Carbon.
-
 ### Global variables
 
 We are exploring several different ideas for how to design less bug-prone
@@ -74,3 +55,11 @@ patterns to replace the important use cases programmers still have for global
 variables. We may be unable to fully address them, at least for migrated code,
 and be forced to add some limited form of global variables back. We may also
 discover that their convenience outweighs any improvements afforded.
+
+## Relevant proposals
+
+Most discussion of design choices and alternatives may be found in relevant
+proposals.
+
+-   [`var` statement](/proposals/p0339.md)
+-   [`var` ordering](/proposals/p0618.md)

+ 1 - 0
proposals/README.md

@@ -60,5 +60,6 @@ request:
 -   [0540 - Remove `Void`](p0540.md)
 -   [0555 - Operator precedence](p0555.md)
 -   [0601 - Operator tokens](p0601.md)
+-   [0618 - var ordering](p0618.md)
 
 <!-- endproposals -->

+ 166 - 0
proposals/p0618.md

@@ -0,0 +1,166 @@
+# var ordering
+
+<!--
+Part of the Carbon Language project, under the Apache License v2.0 with LLVM
+Exceptions. See /LICENSE for license information.
+SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+-->
+
+[Pull request](https://github.com/carbon-language/carbon-lang/pull/618)
+
+<!-- toc -->
+
+## Table of contents
+
+-   [Problem](#problem)
+-   [Background](#background)
+-   [Proposal](#proposal)
+-   [Rationale based on Carbon's goals](#rationale-based-on-carbons-goals)
+-   [Alternatives considered](#alternatives-considered)
+    -   [Type ordering](#type-ordering)
+        -   [`<type>: <name>`](#type-name)
+        -   [`<type> <name>`](#type-name-1)
+        -   [`<name>: <type>`](#name-type)
+    -   [`:` versus `in`](#-versus-in)
+
+<!-- tocstop -->
+
+## Problem
+
+As stated on
+[Re-evaluate core variable & parameter identifier/type order (including a default for parameters) #542](https://github.com/carbon-language/carbon-lang/issues/542):
+
+> Somewhat condensed, bullet-point-y background for this question:
+>
+> -   We've been using first `Type: variable` and then `Type variable` syntax in
+>     variables, parameters, and other declarations.
+> -   This was primarily based on a _lack of compelling data_ to select a better
+>     syntax, with trying to stay similar to C++ as a fallback.
+> -   It was _specifically_ intended to be revisited. The expected trigger for
+>     this was some form of broader user data (surveys at least of decent #s of
+>     developers, or potentially real user studies).
+> -   However, we have gained specific new information as we've filled out a
+>     great deal of the surrounding syntax. We have also gotten some data on
+>     parsing challenges (although perhaps surmountable challenges) of
+>     `Type variable`.
+> -   We also don't have a short/definite timeline to start getting useful data.
+> -   The leads should re-evaluate the core variable syntax based on the new
+>     information we have, but _without_ trying to wait for data.
+>     -   We can always re-evaluate again if and when data arrives and indicates
+>         any need for it.
+> -   The leads should do this ASAP and make a decision so that we can focus our
+>     energy, reduce frustrating discussions, and have consistent syntax in
+>     examples and proposals.
+
+## Background
+
+Background may be found in the related
+[#542](https://github.com/carbon-language/carbon-lang/issues/542) and
+[Docs](https://docs.google.com/document/d/1iuytei37LPg_tEd6xe-O6P_bpN7TIbEjNtFMLYW2Nno).
+
+## Proposal
+
+Two changes:
+
+-   Switch to `<name>: <type>`, replacing `<type>: <name>`.
+-   Use `in` instead of `:` in range-based for loops.
+
+Note these changes were largely implemented by
+[#563](https://github.com/carbon-language/carbon-lang/pull/563).
+
+## Rationale based on Carbon's goals
+
+Both of these changes are done for consistency with other modern languages,
+particularly Swift and Rust. The switch from `:` to `in` is for ease of
+understanding and parsing.
+
+## Alternatives considered
+
+### Type ordering
+
+Alternatives are pulled from
+[Docs](https://docs.google.com/document/d/1iuytei37LPg_tEd6xe-O6P_bpN7TIbEjNtFMLYW2Nno).
+
+#### `<type>: <name>`
+
+`var String: message = "Hello world";`
+
+Advantages:
+
+-   Roughly matches the order of C, C++, C#, D and Java, except with extra `var`
+    and `:`.
+-   Type at the beginning puts most important information up front.
+-   Name followed by default matches assignment statements.
+
+Disadvantages:
+
+-   Existing languages that use a `:` put the name before and the type after
+    ([universally](http://rosettacode.org/wiki/Variables)).
+-   Beyond simple inconsistency, the overlap of `:` in this syntax with
+    different order will add confusion for people working/familiar with multiple
+    languages.
+-   Does not end up having a syntax that is consistent with using colons for
+    marking labelled parameters and arguments, such as how Swift does.
+    -   We currently do not plan to use a colon syntax for labelled parameters
+        and arguments, regardless of the decision here.
+
+Opinions vary:
+
+-   Not friendly to optionally dropping the `type:` to represent auto type
+    deduction.
+
+#### `<type> <name>`
+
+`var String message = "Hello world";`
+
+Advantages:
+
+-   Matches C, C++, C#, D and Java the closest.
+
+Disadvantages:
+
+-   Creates parse ambiguity, particularly when we start adding syntax to the
+    name to indicate that a parameter is labeled, etc.
+
+It is currently hard to see how we can make this work, since it isn't
+compatible with other choices, detailed in
+[Docs](https://docs.google.com/document/d/1iuytei37LPg_tEd6xe-O6P_bpN7TIbEjNtFMLYW2Nno).
+
+#### `<name>: <type>`
+
+`var message: String = "Hello world";`
+
+Advantages:
+
+-   Matches [Swift](http://rosettacode.org/wiki/Variables#Swift),
+    [Rust](https://doc.rust-lang.org/stable/rust-by-example/primitives.html),
+    [Kotlin](http://rosettacode.org/wiki/Variables#Kotlin),
+    [Python3](https://docs.python.org/3/library/typing.html), and many smaller
+    languages ([Ada](http://rosettacode.org/wiki/Variables#Ada),
+    [Pascal languages like Delphi](http://rosettacode.org/wiki/Variables#Delphi)
+    and [Modula-3](http://rosettacode.org/wiki/Variables#Modula-3),
+    [Eiffel](http://rosettacode.org/wiki/Variables#Eiffel),
+    [Nim](https://nim-lang.org/docs/tut1.html#the-var-statement),
+    [Pony](http://rosettacode.org/wiki/Variables#Pony),
+    [Zig](https://ziglang.org/documentation/0.7.1/#Variables)).
+-   Names will line up better with method names in a `struct` definition.
+
+Disadvantages:
+
+-   Name separated from initializer; default doesn't match assignment
+    statements.
+-   Further from the simplistic appearance of common C and C++ variable
+    declarations.
+
+Opinions vary:
+
+-   Existing languages typically make the "`: type`" part optional when an
+    "`= value`" clause is present.
+
+### `:` versus `in`
+
+The `:` operator for range-based for loops becomes harder to read, and more
+likely to cause ambiguity, when `:` is also used for `var`. That is,
+`for (var i: Int : list)` is just harder to understand than
+`for (var i: Int in list)`. `in` is a favorable choice for its use in other
+languages.
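
As a rough illustration of the declaration shape this commit documents in `variables.md` (`var` _identifier_`:` _type_ _[_ `=` _value_ _]_`;`), the following Python sketch recognizes name-first declarations and rejects the older `var Int x = 42;` form. It is not part of the commit and is not Carbon's grammar: the `parse_var` helper, the `VAR_DECL` pattern, and the assumptions that a type is a dotted identifier and that an initializer is any text before the final semicolon are all hypothetical simplifications.

```python
import re

# Hypothetical, simplified recognizer for:  var identifier: type [= value];
# Assumptions (not from the Carbon design): a type is a dotted identifier
# such as Foo.Bar.MyInt, and the initializer is any text before the final ';'.
VAR_DECL = re.compile(
    r"""^\s*var\s+
        (?P<name>[A-Za-z_]\w*)\s*:\s*
        (?P<type>[A-Za-z_]\w*(?:\.[A-Za-z_]\w*)*)
        (?:\s*=\s*(?P<value>.+?))?
        \s*;\s*$""",
    re.VERBOSE,
)


def parse_var(decl):
    """Return (name, type, value) for a name-first declaration, else None.

    value is None when the declaration has no initializer.
    """
    match = VAR_DECL.match(decl)
    if match is None:
        return None
    return match.group("name"), match.group("type"), match.group("value")


if __name__ == "__main__":
    for text in ("var x: Int = 42;", "var x: Int;", "var Int x = 42;"):
        print(text, "->", parse_var(text))
```

Because the colon marks the boundary between name and type, the pattern needs no lookahead to tell them apart. A purely lexical check like this cannot distinguish the rejected `<type>: <name>` spelling from the adopted one, since both have the same token shape.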