for statement and user typesThis proposes an interface that can be implemented by user types to support
range-based iteration with for.
The current for proposal does not define a way for types to support
range-based for loops.
Examples of use cases for for loops include iterating over:
The goals for the solution includes:
The original proposal of for loops were discussed in
p0353, and the basic design documented in
control_flow/loops#for.
The resulting syntax was:
for (var declarationinexpression) {statements}
We now need a way to allow supporting this syntax for user and library container types.
This section highlights some examples of how iterators and range-based for user-defined types is addressed in different languages.
C++ requires either .begin()/.end() methods, or
begin(<container>)/end(<container>) to be supported for
range-based for.
Python requires 2 dunder methods, __iter__ on the container and __next__ on
the iterator, to be usable with a for <element> in <container>: construct.
Generator functions act like iterators.
They are executed until the next yield statement, whose argument is returned
as the next value to the for loop. Iteration ends when the function returns.
def generate_fibonacci():
yield 0;
n1, n2 = 0, 1;
while True:
n1, n2 = n2, n1+n2;
yield n1;
for n in generate():
print(n);
Rust supports iterating over a container with for using
for n in 1..100for element in container.iter() or .iter_mut()The Iterator type can be used with custom types using the
following syntax
(though specific):
struct Fibonacci {
curr: u32,
next: u32,
}
impl Iterator for Fibonacci {
type Item = u32;
fn next(&mut self) -> Option<Self::Item> {
// calculate and yield value
[...]
}
}
for i in fibonacci().take(4) {
...
}
Typescript uses the
.iterator property,
along with
two forms
of range-based for loops: for..in and for..of.
let pets = new Set(['Cat', 'Dog', 'Hamster']);
pets['species'] = 'mammals';
for (let pet in pets) {
console.log(pet); // "species"
}
for (let pet of pets) {
console.log(pet); // "Cat", "Dog", "Hamster"
}
Using the .iterator property, Typescript allow to lazily generate data easily
using functions, classes, or objects:
const reverse = (arr) => ({
[Symbol.iterator]() {
let i = arr.length;
return {
next: () => ({
value: arr[--i],
done: i < 0,
}),
};
},
});
Go does not officially support customizing for for user types.
Provide interfaces that can be implemented by user types to enable support for ranged-for loops.
We propose:
Iterate interface to support for loopsfor loop value let by default, that is, non-reassignableBelow is a high-level example of this concept:
// Add support for `for`
class MyRange {
impl as Iterate where .ElementType = i32 and .CursorType = i32 {
// Implement `Iterate`
}
}
// Usage
let my_range: MyRange = {...};
for (item: auto in my_range) {
// Do something with `item`
}
for with immutable valuesIterate interfaceThis proposal exposes a new Iterate. This interface:
ElementType and CursorType.Next(cursor)
method that returns an optional value.The Iterate interface is:
interface Iterate {
let ElementType:! type;
let CursorType:! type;
fn NewCursor[self: Self]() -> CursorType;
fn Next[self: Self](cursor: CursorType*) -> Optional(ElementType);
}
A naive Carbon implementation of the for loop could be:
var cursor: range.(Iterate.CursorType) = range.(Iterate.NewCursor)();
var iter: Optional(range.(Iterate.ElementType)) = range.(Iterate.Next)(&cursor);
// A. Possible implementation
while (iter.HasValue()) {
ExecuteForBlock(iter.Get());
iter = container.(Iterate.Next)(&cursor);
}
// B. Possible alternative depending on the API for Optional(T):
while (true) {
match (iter) {
case .None => { break; }
case .Some(let value: range.(Iterate.ElementType)) => {
ExecuteForBlock(value);
iter = container.(Iterate.Next)(&cursor);
}
}
}
The cursor tracks progression, and the Next method advances the cursor and
returns an Optional value. An empty Optional indicates that we have reached
the end.
Below is a sample implementation of that interface for a user type.
class MyIntContainer {
var data: ...
fn Size[self: Self]() -> i32 {
return 4;
}
impl as Iterate where .ElementType = i32 and .CursorType = i32 {
fn NewCursor[self: Self]() -> i32 { return 0; }
fn Next[self: Self](cursor: i32*) -> Optional(i32) {
// Advance and return value, or return empty
if (*cursor < self.Size()) {
let index: i32 = *cursor;
*cursor = index + 1;
return Optional(i32).Create(self.data[index]);
} else {
return Optional(i32).CreateEmpty();
}
}
}
}
Which can be used as follow:
fn Main() -> i32 {
var container: MyIntContainer = {.data = (1, 2, 3, 4)};
// Implicit `let` value declaration
for (value: i32 in container) {
Print("{0}", value);
}
return 0;
}
let by defaultTo keep usage simple and non-surprising, we propose defaulting variable
declarations to let declarations. This is consistent with function parameters
and scrutinees for
pattern matching,
which are immutable by default.
// Implicitly `let item: T`
for (item: T in my_range) {
// `item` cannot be reassigned
}
This default can be overridden by adding the var keyword:
for (var item: T in my_range) {
// `item` can be modified, but this will not modify `my_range`
// because it is a copy of the element with its own storage
}
:It will be important to ensure that temporary entities on the right-hand side of
: are alive during the execution of the for loop to prevent invalid memory
access (Note: this was only recently addressed for C++ by
P2644).
This applies to Carbon and C++ interoperability.
// Example using a temporary range, with `GetElements()` returning an object that implements `Iterate`
for (item: T in bucket.GetElements()) {
// We need to make sure the object is alive for the duration of the `for` loop
}
This section covers the use cases highlighted in the introduction, and usage with the proposed design.
This proposal covers strictly immutable elements, and proposes defaulting
element declaration to let, to minimize surprises and clarify the intent.
Mutable elements are mentioned in the Future work section.
For random access containers, the cursor would represent a numerical index,
incremented within Next().
Because the cursor type is defined by the developer, and opaque to the user, it could be a hashmap index, a string, or pointer to a node, without changing the usage for users.
The ElementType can be a tuple, such as a (key, value) for maps, or a single
value. See [Future work][#future-work] for other examples.
R-values are likely to be limited in terms of addressing and mutation. The impact is undefined until the corresponding design is finalized.
The cursor approach used for Iterate has the slight advantage of not requiring
pointers (and addressing) for some container types. This may be beneficial as a
first iteration, until the dependent design is updated, at which point we will
have to define how to handle this special case.
Next() can be trivially implemented to return a single computed value
on-demand based on the cursor, without the need for storage. If the sequence is
infinite, Next() will never return an empty Optional, and the for loop
will continue until interrupted by the user.
class SquaredSequence {
impl as Iterate where .ElementType = i32 and .CursorType = i32 {
fn NewCursor[self: Self]() -> i32 { return -1; }
fn Next[self: Self](cursor: i32*) -> Optional(i32) {
// Advance and return computed value
*cursor = *cursor + 1;
return Optional(i32).Create((*cursor) * (*cursor));
}
}
}
The proposal does not present any limitation regarding large collections. The cursor can be adapted to the collection size, and no copy of the container should occur both for immutable collections, and future mutable elements.
Provided the traversal order is the same, generic filters and transformers are
supported by implementing a proxy Iterate interface, internally or externally.
This example illustrates a simple proxy interface.
class BasicFilter(T:! Iterate) {
var seq: T*;
var value: T.ElementType;
impl as Iterate where .ElementType = T.ElementType and .CursorType = T.CursorType {
fn NewCursor[self: Self]() -> T.ElementType { return self.seq->NewCursor(); }
fn Next[self: Self](cursor: T.CursorType*) -> Optional(T.ElementType) {
var res: auto = self.seq->Next(cursor);
while (res.has_value() && res.get() == value) {
res = self.seq->Next(cursor);
}
return res;
}
}
}
fn Main() -> i32 {
let container: MyIntContainer = MyIntContainer.Create(0,1,0,2,3,0);
let no_zeros: BasicFilter(MyIntContainer) =
{ .seq = &container, .value = 0 };
// Prints "123"
for (v: i32 in no_zeros) {
Print("{0}", v);
}
return 0;
}
On the other hand, changing how the collection is traversed (for example
reverse), requires internal knowledge about the data layout, and a custom
implementation. One approach for containers that support reverse iteration would
be to implement a different interface with a Prev() method. A proxy could be
exposed to make it usable with for directly.
Below is an example of what such an interface could look like:
interface IterateBidirectional;
class ReverseProxy(T:! IterateBidirectional);
interface IterateBidirectional {
extends Iterate;
fn NewEndCursor[self: Self]() -> CursorType;
fn Prev[self: Self](cursor: CursorType*) -> Optional(ElementType);
final fn Reversed[self: Self]() -> ReverseProxy(Self);
}
class ReverseProxy(T:! IterateBidirectional) {
var container: T;
impl as Iterate where .ElementType = T.ElementType and .CursorType = T.CursorType {
fn NewCursor[self: Self]() -> CursorType {
return self.container.NewEndCursor();
}
fn Next[self: Self](cursor: CursorType*) -> Optional(ElementType) {
return self.container.Prev(cursor);
}
}
}
fn IterateBidirectional.Reversed[self: Self]() -> ReverseProxy(Self) {
return {.container = self};
}
/// Usage
class MyContainer {
fn Create(...) { ... }
impl as Iterate ... { ... }
impl as IterateBidirectional ... { ... }
}
var container: MyContainer = MyContainer.Create(...);
for (v: container.(Iterate.ElementType) in container.Reversed()) {...}
The cursor approach deviates from the C++ iterator approach, requiring some adaptation.
Given the current state of the design, C++ interoperability is yet to be defined, and suggestions have been highlighted in the Future work section.
for with mutable valuesSupporting mutable values could be achieved using a second interface that
provides a proxy or view for the data, exposing an Iterate interface with
ElementType*
// Object implementing `MutableIterate` returned by a function
for (p: T* in my_range.AddrRange()) {}
The example MutableIterate interface below acts as a proxy for the collection,
and implements our Iterate interface.
interface MutableIterate {
impl as Iterate;
alias ElementType = Iterate.ElementType;
let ProxyType:! Iterate where .ElementType = ElementType*
and .CursorType is ImplicitAs((Self as Iterate).CursorType);
fn AddrRange[addr self: Self*]() -> ProxyType;
}
class MyIntContainer {
//...
impl as Iterate ...
external impl as MutateIterate where .ProxyType = Self {
...
}
}
Mixins are currently in early design stages. This section highlights possible uses speculating on the final design. Some may include:
self, limiting needs for pointers and address resolutionNamed mixins, used as programmable members, could expose a view with mutable values:
// Using a named mixin to expose mutable values
for (p: my_range.(Iterate.ElementType)* in my_range.mutable) {}
Used in this way, this clarifies that this a view of the data, like an attribute, with no direct side effects, when compared to a method.
Similarly, mixins could allow supporting other views into the container.
For example, maps could expose a view with "key" and "value" only, in addition to the default (key, value) tuple.
// Using a named mixin to expose mutable values
for (k: my_dict.(Iterate.CursorType) in my_dict.keys) {}
And "reversed" view could also be used for reverse iteration:
// Reversing
for (v: my_dict.(Iterate.ElementType) in my_dict.reversed) {}
Large elements would have a negative impact if the Optional cannot leverage
unused bit patterns within the type, nor present a view instead of a copy.
If this proves difficult, we can let implementers of the container choose
whether to expose values or pointers to values. When electing to "pointers to
values", an adapter can be provided to support transforming the range of
pointers into a range a values. This choice allows supporting either iteration
over container r-values (ElementType = T), or efficient iteration
(ElementType = T*), but not both. While acceptable as an intermediate
solution, a better alternative will be needed in the future.
adapter IterateByValue(T:! Iterate where .CursorType impls Deref) for T {
impl as Iterate
where .CursorType = T.CursorType
and .ElementType = T.CursorType.(Deref.Result) {
...
}
}
// Used like:
var frames: NetworkFramesContainerByPtr = ...
for (i: NetworkFrame in frames as IterateByValue(NetworkFramesContainerByPtr))
{
...
}
Alternatively, we can allow the binding in the for loop to be an addr
pattern, implemented by calling into a separate interface on the container. This
puts the choice of using values or pointers to values in the hands of the for
loop author. While this deviates from the ergonomics of let (where the
compiler can choose to copy or alias the value depending on what is more
efficient), this option would provide parity with C++ for loops (where the
author chooses between value and reference).
For reference, range-based for loops in C++ requires:
begin() and end() methods or free functions, and++, indirection *, and
inequality != operationsSee range-based for statement for more details.
A limitation of the approach proposed in this document is the need for adaptation to interoperate with the C++ iterator model. The adapter would be a Carbon type that satisfies the Cursor interface, implemented in terms of a C++ iterator.
NewCursor providing the "start iterator" returned by begin()Next doing the "increment, bounds check, and returning value"Two different approaches are identified:
impl of Iterate, for anything with the right
method signatures (begin, end, and so on)Another is a templated adapter that implements Iterate when requested
// Template impl approach
template constraint CppIterator { ... }
template constraint CppContainer {
// Speculative
for_unique IteratorType: CppIterator;
fn begin() -> IteratorType;
fn end() -> IteratorType;
}
external impl forall [template T:! CppContainer] T as Iterate
where .CursorType = T.(CppContainer.IteratorType)
and .ElementType = .CursorType.ElementType {
fn NewCursor[self: Self]() -> CursorType {
return self.(CppContainer.begin)();
}
fn Next[self: Self](cursor: CursorType*) -> Optional(ElementType) {
if (*cursor == self.(CppContainer.end)()) {
return Optional(ElementType).MakeEmpty();
}
let value: ElementType = **cursor;
++*cursor;
return Optional(ElementType).MakeValue(value);
}
}
// Used like:
var v: Cpp.std.vector(i32) = ...
for (i: i32 in v) { ... }
// Adapter approach
adapter CppIterate(template T:! type) for T {
impl as Iterate
// Speculative syntax. Alternatively, a cursor type parameter
// would avoid the need for deduction.
where .CursorType = typeof(T.begin())
and .ElementType = .CursorType.(Deref.Result) {
fn NewCursor[self: Self]() -> CursorType {
return self.begin();
}
fn Next[self: Self](cursor: CursorType*) -> Optional(ElementType) {
if (*cursor == self.end()) {
return Optional(ElementType).MakeEmpty();
}
let value: ElementType = **cursor;
++*cursor;
return Optional(ElementType).MakeValue(value);
}
}
}
// Used like:
var v: Cpp.std.vector(i32) = ...
for (i: i32 in v as CppIterate(Cpp.std.vector(i32))) { ... }
These two options could allow interoperability with a compatible C++ type, either implicitly (template approach) or explicitly (adapter approach).
There need to be begin() and end() methods or free functions accessible to
C++ for any Carbon type that implements the Iterate interface, in addition to
overloads of the *, ++, and != operators for the types returned.
The Iterate interface does not expose a way to have a cursor nor an iterator
to the past-last element. An option would be to have end() return a predefined
invalid value, to satisfy iterator comparison when Next() returns an empty
Optional.
Below is sample implementation to iterate on a Carbon Iterate. Note that we
need to copy and cache the last value, to allow for *it to be called. This
puts a requirement on ElementType to be copyable for this interoperability to
be usable.
namespace Carbon {
// A C++ input iterator for any type that implements the Carbon `Iterate` interface.
template <typename Iterate>
class IterateIterator {
public:
// Returns an "Invalid" iterator representing `end()`
static auto Invalid(const Iterate& container) -> IterateIterator<Iterate> {
return IterateIterator<Iterate>(container, std::nullopt);
}
IterateIterator(const Iterate& container,
std::optional<typename Iterate::CursorType> cursor)
: container_(&container), cursor_(cursor) {
if (cursor_) {
next();
}
}
IterateIterator(const IterateIterator& other)
: container_(other.container_), cursor_(other.cursor_),
value_(other) {}
auto operator*() const -> const typename Iterate::ElementType& {
assert(cursor_);
return value_;
}
auto operator++() -> IterateIterator<Iterate>& {
next();
return *this;
}
auto operator==(const IterateIterator<Iterate>& rhs) -> bool {
return container_ == rhs.container_ && cursor_ == rhs.cursor_;
}
auto operator!=(const IterateIterator<Iterate>& rhs) -> bool {
return !(*this == rhs);
}
private:
void next() {
assert(cursor_);
if (const auto v = container_->Next(*cursor_); v) {
value_ = *std::move(v);
} else {
cursor_ = std::nullopt;
}
}
const Iterate* container_;
std::optional<typename Iterate::CursorType> cursor_;
typename Iterate::ElementType value_;
};
// The C++ side of a custom Carbon type implementing
// `Iterate` with CursorType `C`, and ElementType `T`
// could have `begin()` and `end()` methods defined as:
class MyIterateType {
// [...]
auto begin() -> IterateIterator<Iterate<C, T>> {
return IterateIterator<Iterate<C, T>>(*this, NewCursor());
}
auto end() -> IterateIterator<Iterate<C, T>> {
return IterateIterator<Iterate<C, T>>::Invalid(*this);
}
};
} // namespace Carbon
// Usage:
Carbon::MyIterateType container;
for (auto s : container) { /*...*/ }
Using an Optional has downsides, discussed in the
"alternatives considered" section. A possible
approach would be to invert the control of the for loop, by having the
container call in the for block for each value, passing the value as argument.
This would work similarly to Python generator functions.
This has the following advantages:
Optional, and the associated overheads, copies,
and unwrapping.for levelLimitations:
for body will needs to be available or providedThis proposal offers a first step to support a needed way to iterate using a
for loop. More specifically:
Next method,
uses cursors instead of their more complex iterator counterpart, and defines
loop values as let by default to be less error prone
(Code that is easy to read, understand, and write).IterateAn option was to define methods such as Get, Advance, IsAtEnd on
Iterate, instead of a unique Next methods. While being more atomic, the
concern is that these individual methods will all need to be called separately
while iterating, while a single call to Next performs these three operations
together, providing more opportunities for optimization.
Using a single operation to return an optional is also consistent with Rust’s approach.
The combined Next method forces us to return an Optional, which has
downsides:
Optional itself, andThose may be mitigated by leveraging unused bit patterns, passing a view of the value, or inverting control.
The iterator approach was considered but proved to have key limitations that a cursor approach does not have:
Next() approach.On the other hand, the cursor approach deviates more from C++ than iterator, and may require some more effort for C++ interoperability.
T and T* with IterateBecause some collections will be read-only, we want to separate mutable and immutable interfaces. A single interface would force developers to implement or stub the mutable portion of the interface, even if it is not applicable.