|
|
@@ -0,0 +1,357 @@
|
|
|
+# `for` loops
|
|
|
+
|
|
|
+<!--
|
|
|
+Part of the Carbon Language project, under the Apache License v2.0 with LLVM
|
|
|
+Exceptions. See /LICENSE for license information.
|
|
|
+SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
|
|
|
+-->
|
|
|
+
|
|
|
+[Pull request](https://github.com/carbon-language/carbon-lang/pull/353)
|
|
|
+
|
|
|
+<!-- toc -->
|
|
|
+
|
|
|
+## Table of contents
|
|
|
+
|
|
|
+- [Problem](#problem)
|
|
|
+- [Background](#background)
|
|
|
+ - [C++](#c)
|
|
|
+ - [Java](#java)
|
|
|
+ - [TypeScript and JavaScript](#typescript-and-javascript)
|
|
|
+ - [Python, Swift, and Rust](#python-swift-and-rust)
|
|
|
+ - [Go](#go)
|
|
|
+- [Proposal](#proposal)
|
|
|
+- [Details](#details)
|
|
|
+ - [Range inputs](#range-inputs)
|
|
|
+ - [Executable semantics form](#executable-semantics-form)
|
|
|
+- [Caveats](#caveats)
|
|
|
+ - [C++ as baseline](#c-as-baseline)
|
|
|
+ - [Semisemi support](#semisemi-support)
|
|
|
+ - [Range literals](#range-literals)
|
|
|
+ - [Enumerating containers](#enumerating-containers)
|
|
|
+- [Rationale based on Carbon's goals](#rationale-based-on-carbons-goals)
|
|
|
+- [Alternatives considered](#alternatives-considered)
|
|
|
+ - [Include semisemi `for` loops](#include-semisemi-for-loops)
|
|
|
+ - [Writing `in` instead of `:`](#writing-in-instead-of-)
|
|
|
+ - [Multi-variable bindings](#multi-variable-bindings)
|
|
|
+
|
|
|
+<!-- tocstop -->
|
|
|
+
|
|
|
+## Problem
|
|
|
+
|
|
|
+Control flow is documented at
|
|
|
+[language overview](/docs/design/README.md#control-flow). `for` loops are common
|
|
|
+in C++, and Carbon should consider providing some form of it.
|
|
|
+
|
|
|
+## Background
|
|
|
+
|
|
|
+### C++
|
|
|
+
|
|
|
+There are two forms of `for` loops in C++:
|
|
|
+
|
|
|
+- **Semisemi** (semicolon, semicolon): `for (int i = 0; i < list.size(); ++i)`
|
|
|
+- **Range-based**: `for (auto x : list)`
|
|
|
+
|
|
|
+Semisemi `for` loops have been around for a long time, and are in C. Range-based
|
|
|
+`for` loops were added in C++11.
|
|
|
+
|
|
|
+For example, here is a basic semisemi:
|
|
|
+
|
|
|
+```cc
|
|
|
+for (int i = 0; i < list.size(); ++i) {
|
|
|
+ printf("List at %d: %s\n", i, list[i].name);
|
|
|
+}
|
|
|
+```
|
|
|
+
|
|
|
+An equivalent semisemi using iterators and the comma operator may look like:
|
|
|
+
|
|
|
+```cc
|
|
|
+int i = 0;
|
|
|
+for (auto it = list.begin(); it != list.end(); ++it, ++i) {
|
|
|
+ printf("List at %d: %s\n", i, it->name);
|
|
|
+}
|
|
|
+```
|
|
|
+
|
|
|
+Range-based syntax can be simpler, but can also make it more difficult if there
|
|
|
+are multiple pieces of interesting information:
|
|
|
+
|
|
|
+```cc
|
|
|
+int i = 0;
|
|
|
+for (const auto& x : list) {
|
|
|
+ printf("List at %d: %s\n", i, x.name);
|
|
|
+ ++i;
|
|
|
+}
|
|
|
+```
|
|
|
+
|
|
|
+### Java
|
|
|
+
|
|
|
+Java provides equivalent syntax to C++. Although Java doesn't have a comma
|
|
|
+operator, it does provide for comma-separated statements in the first and third
|
|
|
+sections of semisemi for loops.
|
|
|
+
|
|
|
+### TypeScript and JavaScript
|
|
|
+
|
|
|
+Both TypeScript and JavaScript offer three kinds of for loops:
|
|
|
+
|
|
|
+- Semisemi, mirroring C++.
|
|
|
+- `for (x of list)`, mirroring range-based for loops.
|
|
|
+- `for (x in list)`, returning indices.
|
|
|
+
|
|
|
+For example, here is an `in` loop:
|
|
|
+
|
|
|
+```javascript
|
|
|
+for (i in list) {
|
|
|
+ console.log('List at ' + i + ': ' + list[i].name);
|
|
|
+}
|
|
|
+```
|
|
|
+
|
|
|
+### Python, Swift, and Rust
|
|
|
+
|
|
|
+Python, Swift, and Rust all only support range-based for loops, using
|
|
|
+`for x in list` syntax.
|
|
|
+
|
|
|
+### Go
|
|
|
+
|
|
|
+Go uses `for` as its primary looping construct. It has:
|
|
|
+
|
|
|
+- Semisemi, mirroring C++.
|
|
|
+- `for i < list.size()` condition-only loops, mirroring C++ `while` loops.
|
|
|
+- `for {` infinite loops.
|
|
|
+
|
|
|
+## Proposal
|
|
|
+
|
|
|
+Carbon should adopt C++-style range-based `for` loops syntax. Semisemi `for`
|
|
|
+loops should be addressed through a different mechanism.
|
|
|
+
|
|
|
+Related keywords are:
|
|
|
+
|
|
|
+- `for`
|
|
|
+- `continue`: continues with the next loop iteration.
|
|
|
+- `break`: breaks out of the loop.
|
|
|
+
|
|
|
+## Details
|
|
|
+
|
|
|
+For loop syntax looks like: `for (` `var` _type_ _variable_ `:` _expression_
|
|
|
+`) {` _statements_ `}`
|
|
|
+
|
|
|
+Similar to the
|
|
|
+[if/else proposal](https://github.com/carbon-language/carbon-lang/pull/285), the
|
|
|
+braces are optional and must be paired (`{ ... }`) if present. When there are no
|
|
|
+braces, only one statement is allowed.
|
|
|
+
|
|
|
+`continue` will continue with the next loop iteration directly, skipping any
|
|
|
+other statements in the loop body.
|
|
|
+
|
|
|
+`break` exits the loop immediately.
|
|
|
+
|
|
|
+All of this is consistent with C/C++ behavior.
|
|
|
+
|
|
|
+### Range inputs
|
|
|
+
|
|
|
+The syntax for inputs is not being defined in this proposal. However, we can
|
|
|
+still establish critical things to support:
|
|
|
+
|
|
|
+- Interoperable C++ objects that work with C++'s range-based `for` loops, such
|
|
|
+ as containers with iterators.
|
|
|
+- Carbon arrays and other containers.
|
|
|
+- Range literals. These are not proposed, but for an example seen in other
|
|
|
+ languages, `0..2` may indicate the set of integers [0, 2).
|
|
|
+
|
|
|
+### Executable semantics form
|
|
|
+
|
|
|
+```bison
|
|
|
+%token FOR
|
|
|
+
|
|
|
+statement:
|
|
|
+ FOR "(" pattern ":" expression ")" statement
|
|
|
+| /* pre-existing statements elided */
|
|
|
+;
|
|
|
+```
|
|
|
+
|
|
|
+The `continue` and `break` statements are intended to be added as part of the
|
|
|
+[while proposal](https://github.com/carbon-language/carbon-lang/pull/340).
|
|
|
+
|
|
|
+## Caveats
|
|
|
+
|
|
|
+### C++ as baseline
|
|
|
+
|
|
|
+This baseline syntax is based on C++, following the migration sub-goal
|
|
|
+[Familiarity for experienced C++ developers with a gentle learning curve](/docs/project/goals.md#interoperability-with-and-migration-from-existing-c-code).
|
|
|
+To the extent that this proposal anchors on a particular approach, it aims to
|
|
|
+anchor on C++'s existing syntax, consistent with that sub-goal.
|
|
|
+
|
|
|
+Alternatives will generally reflect breaking consistency with C++ syntax. While
|
|
|
+most proposals may consider alternatives more, this proposal suggests a
|
|
|
+threshold of only accepting alternatives that skew from C++ syntax if they are
|
|
|
+clearly better; the priority in this proposal is to _avoid debate_ and produce a
|
|
|
+trivial proposal. Where an alternative would trigger debate, it should be
|
|
|
+examined by an advocate in a separate proposal.
|
|
|
+
|
|
|
+### Semisemi support
|
|
|
+
|
|
|
+Carbon will not provide semisemi support. This decision will be contingent upon
|
|
|
+a better alternative loop structure which is not currently provided by `while`
|
|
|
+or `for` syntax. If Carbon doesn't evolve a better solution, semisemi support
|
|
|
+will be added later.
|
|
|
+
|
|
|
+For details, see [the alternative](#include-semisemi-for-loops).
|
|
|
+
|
|
|
+### Range literals
|
|
|
+
|
|
|
+Range literals are important to the ergonomics of range-based `for` loops, and
|
|
|
+should be added. However, they should be examined separately as part of limiting
|
|
|
+the scope of this proposal.
|
|
|
+
|
|
|
+### Enumerating containers
|
|
|
+
|
|
|
+Several languages have the concept of providing an index with the object in a
|
|
|
+range-based for loop:
|
|
|
+
|
|
|
+- Python does `for i, item in enumerate(items)`, with a global function.
|
|
|
+- Go does `for i, item := range items`, with a keyword.
|
|
|
+- Swift does `for (i, item) in items.enumerated()`, having removed a
|
|
|
+ `enumerate()` global function.
|
|
|
+- Rust does `for (i, item) in items.enumerate()`.
|
|
|
+
|
|
|
+An equivalent pattern for Carbon should be examined separately as part of
|
|
|
+limiting the scope of this proposal.
|
|
|
+
|
|
|
+## Rationale based on Carbon's goals
|
|
|
+
|
|
|
+Relevant goals are:
|
|
|
+
|
|
|
+- [3. Code that is easy to read, understand, and write](/docs/project/goals.md#code-that-is-easy-to-read-understand-and-write):
|
|
|
+
|
|
|
+ - Range-based `for` loops are easy to read and very helpful.
|
|
|
+ - Semisemi `for` syntax is complex and can be error prone for cases where
|
|
|
+ range-based loops work. Avoiding it, even by providing equivalent syntax
|
|
|
+ with a different loop structure, should discourage its use and direct
|
|
|
+ engineers towards better options. The alternative syntax should also be
|
|
|
+ easier to understand than semisemi syntax, otherwise we should just keep
|
|
|
+ semisemi syntax.
|
|
|
+
|
|
|
+- [7. Interoperability with and migration from existing C++ code](/docs/project/goals.md#interoperability-with-and-migration-from-existing-c-code):
|
|
|
+
|
|
|
+ - Keeping syntax close to C++ will make it easier for developers to
|
|
|
+ transition.
|
|
|
+
|
|
|
+## Alternatives considered
|
|
|
+
|
|
|
+Both alternatives from the
|
|
|
+[`if`/`else` proposal](https://github.com/carbon-language/carbon-lang/pull/285)
|
|
|
+apply to `while` as well: we could remove parentheses, require braces, or both.
|
|
|
+The conclusions mirror here in order to avoid a divergence in syntax.
|
|
|
+
|
|
|
+Additional alternatives follow.
|
|
|
+
|
|
|
+### Include semisemi `for` loops
|
|
|
+
|
|
|
+We could include semisemi for loops for greater consistency with C++.
|
|
|
+
|
|
|
+This is in part important because switching from a semisemi `for` loop to a
|
|
|
+`while` loop is not always straightforward due to how `for` evaluates the third
|
|
|
+section of the semisemi. The inter-loop evaluation of the third section is
|
|
|
+important given how it interacts with `continue`. In particular, consider the
|
|
|
+loops:
|
|
|
+
|
|
|
+```cc
|
|
|
+for (int i = 0; i < 3; ++i) {
|
|
|
+ if (i == 1) continue;
|
|
|
+ printf("%d\n", i);
|
|
|
+}
|
|
|
+
|
|
|
+int j = 0;
|
|
|
+while (j < 3) {
|
|
|
+ if (j == 1) continue;
|
|
|
+ printf("%d\n", j);
|
|
|
+ ++j;
|
|
|
+}
|
|
|
+
|
|
|
+int k = 0;
|
|
|
+while (k < 3) {
|
|
|
+ ++k;
|
|
|
+ if (k == 1) continue;
|
|
|
+ printf("%d\n", k);
|
|
|
+}
|
|
|
+
|
|
|
+int l = 0;
|
|
|
+while (l < 3) {
|
|
|
+ if (l == 1) {
|
|
|
+ ++l;
|
|
|
+ continue;
|
|
|
+ }
|
|
|
+ printf("%d\n", l);
|
|
|
+ ++l;
|
|
|
+}
|
|
|
+```
|
|
|
+
|
|
|
+To explain the differences between these loops:
|
|
|
+
|
|
|
+- The first loop will print 0 and 2.
|
|
|
+- The second loop will print 0, then loop infinitely because the increment is
|
|
|
+ never reached.
|
|
|
+- The third loop will only print 2 because the increment happens too early.
|
|
|
+- Only the fourth loop is equivalent to the first loop, and it duplicates the
|
|
|
+ increment.
|
|
|
+
|
|
|
+There is no easy place to put the increment in a `while` loop.
|
|
|
+
|
|
|
+Advantages:
|
|
|
+
|
|
|
+- We need a plan for
|
|
|
+ [migrating both developers and code from C++](https://github.com/carbon-language/carbon-lang/blob/trunk/docs/project/goals.md#interoperability-with-and-migration-from-existing-c-code)
|
|
|
+ semisemis `for` loops, and providing them in Carbon is the easiest solution.
|
|
|
+ - Semisemis remain common in C++ code.
|
|
|
+- Semisemis are much more flexible than range-based `for` loops.
|
|
|
+ - `while` loops do not offer a sufficient alternative.
|
|
|
+
|
|
|
+Disadvantages:
|
|
|
+
|
|
|
+- Semisemi loops can be error prone, such as `for (int i = 0; i < 3; --i)`.
|
|
|
+ - Syntax such as `for (int x : range(0, 3))` leaves less room for
|
|
|
+ developer mistakes.
|
|
|
+ - Removing semisemi syntax will likely improve
|
|
|
+ [understandability of Carbon code](https://github.com/carbon-language/carbon-lang/blob/trunk/docs/project/goals.md#code-that-is-easy-to-read-understand-and-write),
|
|
|
+ a language goal.
|
|
|
+- If we add semisemi loops, it would be very difficult to get rid of them.
|
|
|
+ - Code using them should be expected to accumulate quickly, from both
|
|
|
+ migrated code and developers familiar with C++ idioms.
|
|
|
+
|
|
|
+If we want to remove `for` loops, we should avoid adding them. We do need to
|
|
|
+ensure that developers are _happy_ with the replacement, although that should be
|
|
|
+achievable through providing strong range support, including range literals.
|
|
|
+
|
|
|
+A story for migrating developers and code is still required. For developers, it
|
|
|
+would be ideal if we could have a compiler error that detects semisemi loops and
|
|
|
+advises the preferred Carbon constructs. For both developers and code, we need a
|
|
|
+suitable loop syntax that is easy to use in cases that remain hard to write in
|
|
|
+`while` or range-based `for` loops. This will depend on a separate proposal, but
|
|
|
+there's at least presently interest in this direction.
|
|
|
+
|
|
|
+### Writing `in` instead of `:`
|
|
|
+
|
|
|
+Range-based for loops could write `in` instead of `:`, such as:
|
|
|
+
|
|
|
+```carbon
|
|
|
+for (x in list) {
|
|
|
+ ...
|
|
|
+}
|
|
|
+```
|
|
|
+
|
|
|
+An argument for switching _now_, instead of using
|
|
|
+[C++ as a baseline](#c-as-baseline), would be that `var` syntax has been
|
|
|
+discussed as using a `:`, and avoiding `:` in range-based for loops may reduce
|
|
|
+syntax ambiguity risks. However, the
|
|
|
+[current `var` proposal](https://github.com/carbon-language/carbon-lang/pull/339)
|
|
|
+does not use a `:`, and so this risk is only a potential future concern: it's
|
|
|
+too early to require further evaluation.
|
|
|
+
|
|
|
+Because the benefits of this alternative are debatable and would diverge from
|
|
|
+C++, adopting `in` would run contrary to
|
|
|
+[using C++ as a baseline](#c-as-a-baseline). Any divergence should be justified
|
|
|
+and reviewed as a separate proposal.
|
|
|
+
|
|
|
+### Multi-variable bindings
|
|
|
+
|
|
|
+C++ allows `for (auto [x, y] : range_of_pairs)` which is not explicitly part of
|
|
|
+the syntax here. Carbon is likely to support this through tuples, so adding
|
|
|
+special `for` syntax for this would likely be redundant.
|