Update sum types design (#2187)

This proposal updates the design of [p0157](https://github.com/carbon-language/carbon-lang/blob/trunk/proposals/p0157.md) to reflect subsequent evolution of the language.

Resolves #1805 

Co-authored-by: Richard Smith <richard@metafoo.co.uk>
Geoff Romer, 3 years ago
Parent
Commit
44ba541f1f
3 files changed, 318 additions and 0 deletions
  1. docs/design/README.md (+1 -0)
  2. docs/design/sum_types.md (+244 -0)
  3. proposals/p2187.md (+73 -0)

+ 1 - 0
docs/design/README.md

@@ -2049,6 +2049,7 @@ choice LikeABoolean { False, True }
 
 > References:
 >
+> -   [Sum types](sum_types.md)
 > -   Proposal
 >     [#157: Design direction for sum types](https://github.com/carbon-language/carbon-lang/pull/157)
 > -   Proposal

+ 244 - 0
docs/design/sum_types.md

@@ -0,0 +1,244 @@
+# Sum types
+
+<!--
+Part of the Carbon Language project, under the Apache License v2.0 with LLVM
+Exceptions. See /LICENSE for license information.
+SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+-->
+
+<!-- toc -->
+
+## Table of contents
+
+-   [Overview](#overview)
+-   [`choice` declarations](#choice-declarations)
+-   [User-defined sum types](#user-defined-sum-types)
+-   [Alternatives considered](#alternatives-considered)
+-   [References](#references)
+
+<!-- tocstop -->
+
+## Overview
+
+In Carbon, a _sum type_ is a type whose values are grouped into several distinct
+named cases, called _alternatives_. A value of a sum type notionally consists of
+a _discriminator_ tag that identifies which alternative is present, together
+with that alternative's value if it has one. Sum types are typically handled
+with pattern matching.
+
+## `choice` declarations
+
+The `choice` keyword is used to declare a sum type by specifying its interface,
+leaving the implementation to the compiler. A `choice` declaration consists of
+the `choice` keyword followed by the name of the type, and then a list of
+alternatives inside curly braces. An alternative declaration consists of the
+alternative name, optionally followed by a parameter list in parentheses. If
+present, the parameter list has the same syntax as in a
+[function declaration](README.md#functions). For example:
+
+```carbon
+choice OptionalI32 {
+  Some(value: i32),
+  None
+}
+```
+
+This declares a sum type named `OptionalI32` with two alternatives: `Some`,
+which holds a single `i32` value (the parameter name `value` has no effect other
+than documentation), and `None`, which is empty. Choice types can also be
+parameterized, [like class types](generics/details.md#parameterized-types):
+
+```carbon
+choice Optional(T:! Type) {
+  Some(value: T),
+  None
+}
+```
+
+A value of a function-like alternative is specified by "calling" it like a
+function, and a value of an empty alternative like `None` is specified by naming
+it:
+
+```carbon
+var my_opt: Optional(i32) = Optional(i32).None;
+my_opt = Optional(i32).Some(42);
+```
+
+The value of a choice type can be inspected using a `match` statement:
+
+```carbon
+match (my_opt) {
+  case .Some(the_value: i32) => {
+    Print(the_value);
+  }
+  case .None => {
+    Print("None");
+  }
+}
+```
+
+## User-defined sum types
+
+`choice` declarations are a convenience shorthand for common use cases, but they
+have limited flexibility. There is no way to control the representation of a
+`choice` type, or define methods or other members for it (although you can
+extend it to implement interfaces, using an
+[`external impl`](generics/overview.md#implementing-interfaces) or
+[`adapter`](generics/overview.md#adapting-types)). However, a `class` type can
+be extended to behave like a sum type. This is much more verbose than a `choice`
+declaration, but gives the author full control over the representation and class
+members.
+
+Creating instances of the sum type can be straightforwardly emulated with
+factory functions and static constants. The internal storage layout will
+presumably involve untagged unions or some other low-level storage primitive
+that hasn't been designed yet. The key to defining a sum type's interface,
+however, is enabling it to support pattern matching. To do that, the sum type
+has to specify two things:
+
+-   The set of all possible alternatives, including their names and parameter
+    types, so that the compiler can typecheck the `match` body, identify any
+    unreachable `case`s, and determine whether any `case`s are missing.
+-   The algorithm that, given a value of the sum type, determines which
+    alternative is present, and specifies the values of its parameters.
+
+It does so by implementing the `Match` interface, which is defined as follows:
+
+```carbon
+interface Match {
+  interface BaseContinuation {
+    let ReturnType:! Type;
+  }
+
+  let template Continuation:! Type;
+  fn Op[me: Self, C:! Continuation](continuation: C*)
+    -> C.(BaseContinuation.ReturnType);
+}
+```
+
+`Continuation` must itself be an interface that extends
+`Match.BaseContinuation`, and its definition specifies the set of possible
+alternatives: each alternative is represented as a method of that interface.
+When compiling a proper pattern (or set of patterns that includes a proper
+pattern, as with the cases of a `match`) whose type is a sum type, the compiler
+generates an implementation of `Continuation` and passes it to `Match.Op`. The
+sum type's implementation of `Match.Op` is responsible for determining which
+alternative is present and what its parameters are, and calling the
+corresponding method of `continuation` with those parameters. The `Match.Op`
+implementation is required to call exactly one such method exactly once before
+returning. The compiler populates the `Continuation` method bodies with whatever
+code should be executed when the corresponding alternatives match.
+
+**TODO:** if Carbon has explicit support for tail calls, we should probably
+require that `Match.Op` invoke the continuation as a tail call.
+
+For example, here's how `Optional` can be defined as a class:
+
+```carbon
+class Optional(T:! Type) {
+  // Factory functions
+  fn Some(value: T) -> Self;
+  let None:! Self;
+
+  private var has_value: bool;
+  private var value: T;
+
+  external impl as Match {
+    interface Continuation {
+      extends Match.BaseContinuation;
+      fn Some[addr me: Self*](value: T) -> ReturnType;
+      fn None[addr me: Self*]() -> ReturnType;
+    }
+
+    fn Op[me: Self, C:! Continuation](continuation: C*) -> C.ReturnType {
+      if (me.has_value) {
+        return continuation->Some(me.value);
+      } else {
+        return continuation->None();
+      }
+    }
+  }
+
+  // Operations like destruction, copying, assignment, and comparison are
+  // omitted for brevity.
+}
+```
+
+And here's how the compiler-generated implementation of
+`Optional.(Match.Continuation)` for the `match` statement shown earlier might
+look, if it were written in Carbon:
+
+```carbon
+class __MatchStatementImpl {
+  impl as Optional(i32).(Match.Continuation) where .ReturnType = () {
+    fn Some(the_value: i32) {
+      Print(the_value);
+    }
+    fn None() {
+      Print("None");
+    }
+  }
+}
+
+my_opt.(Match.Op)({} as __MatchStatementImpl);
+```
+
+(The name `__MatchStatementImpl` is a placeholder for illustration purposes; the
+actual generated class will be anonymous.)
+
+The mechanism described above for proper patterns may also be used for
+expression patterns if they have the form of an alternative pattern. An
+expression pattern of type `T` has the form of an alternative pattern if `T`
+implements `Match`, and the expression consists of an optional expression that
+names `T`, followed by a designator that names a method of
+`T.(Match.Continuation)`, optionally followed by a tuple expression. If an
+expression pattern has that form, it may be matched using the mechanism above,
+as if it were a proper pattern, rather than by evaluating the expression and
+comparing it to the scrutinee using `==`. Both possible implementations must be
+well-formed (and this is enforced by the compiler), but it is unspecified which
+implementation is used to generate code.
+
+As a result, it is **strongly** recommended that user-defined sum types ensure
+that for every alternative there is a factory function or constant member with
+the same name and parameter list, such that pattern-matching on the result will
+correctly reproduce the arguments to the factory function. For example, the
+definition of `Optional` above satisfies this requirement, because for any
+regular type `T`, the expression `Optional(T).None` evaluates to a value that
+matches the pattern `Optional(T).None` (under both possible matching
+mechanisms), and for any `x` of type `T`, the expression `Optional(T).Some(x)`
+evaluates to a value that matches the pattern `Optional(T).Some(y: T)` and binds
+`y` to a value that's equal to `x`. Expression patterns involving a sum type
+that doesn't meet this requirement will fail to compile, or have behavior that
+observably changes depending on the compiler's implementation choices.
+
+Another corollary of this rule is that if an alternative takes no arguments, its
+pattern syntax is the same as its expression syntax. For example,
+`case Optional(i32).None() => ...` is not well-formed, because
+`Optional(i32).None()` has the form of an alternative pattern, but the
+implementation in terms of `==` is not well-formed because
+`Optional(i32).None()` is not a well-formed expression. If we had defined
+`Optional.None` as a factory function instead of a constant,
+`case Optional(i32).None() => ...` would be well-formed but
+`case Optional(i32).None => ...` would not be.
+
+Note that the compiler-generated continuation method bodies are not required to
+contain the code in the `case` body (or whatever code is in the scope of the
+pattern). For example, they might only store the parameter values and then
+return an index that identifies the `case` body to be executed.
+
+## Alternatives considered
+
+-   [Providing `choice` types only](/proposals/p0157.md#choice-types-only), with
+    no support for user-defined sum types.
+-   [Indexing alternatives by type](/proposals/p0157.md#indexing-by-type)
+    instead of by name.
+-   Implementing user-defined sum types in terms of
+    [`choice` type proxies](/proposals/p0157.md#pattern-matching-proxies) rather
+    than callbacks.
+-   Implementing user-defined sum types in terms of invertible
+    [pattern functions](/proposals/p0157.md#pattern-functions).
+
+## References
+
+-   Proposal
+    [#157: Design direction for sum types](https://github.com/carbon-language/carbon-lang/pull/157)
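
The continuation protocol described in `docs/design/sum_types.md` can be approximated in C++ for readers more familiar with that language. The following is only a rough analogue, not part of the commit and not Carbon semantics: `MatchOp`, `DescribeContinuation`, and `Describe` are hypothetical names, `None` is a function rather than a constant, and `T` is assumed to be default-constructible. It shows `MatchOp` calling exactly one continuation method exactly once, with the continuation standing in for the compiler-generated `match` implementation.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical C++ stand-in for the Carbon `Optional(T)` class in the diff.
// Assumes T is default-constructible so that None() can fill the unused slot.
template <typename T>
class Optional {
 public:
  // Factory functions mirroring the `Some` and `None` alternatives.
  static Optional Some(T value) { return Optional(true, std::move(value)); }
  static Optional None() { return Optional(false, T{}); }

  // Analogue of `Match.Op`: determines which alternative is present and
  // calls the corresponding continuation method, exactly once.
  template <typename C>
  auto MatchOp(C& continuation) const {
    if (has_value_) {
      return continuation.Some(value_);
    }
    return continuation.None();
  }

 private:
  Optional(bool has_value, T value)
      : has_value_(has_value), value_(std::move(value)) {}

  bool has_value_;
  T value_;
};

// Stand-in for the continuation a compiler might generate for a `match`
// statement; each method holds the body of one `case`.
struct DescribeContinuation {
  int calls = 0;  // Counts how many alternative methods were invoked.

  std::string Some(int the_value) {
    ++calls;
    return "Some(" + std::to_string(the_value) + ")";
  }
  std::string None() {
    ++calls;
    return "None";
  }
};

// Hypothetical helper that plays the role of the `match` statement.
inline std::string Describe(const Optional<int>& opt) {
  DescribeContinuation continuation;
  std::string result = opt.MatchOp(continuation);
  // The protocol requires exactly one method call before returning.
  assert(continuation.calls == 1);
  return result;
}
```

Under these assumptions, `Describe(Optional<int>::Some(42))` takes the `Some` path and `Describe(Optional<int>::None())` takes the `None` path. The set of methods on the continuation plays the role that the `Continuation` interface plays in the Carbon design.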

+ 73 - 0
proposals/p2187.md

@@ -0,0 +1,73 @@
+# Update sum types design
+
+<!--
+Part of the Carbon Language project, under the Apache License v2.0 with LLVM
+Exceptions. See /LICENSE for license information.
+SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+-->
+
+[Pull request](https://github.com/carbon-language/carbon-lang/pull/2187)
+
+<!-- toc -->
+
+## Table of contents
+
+-   [Abstract](#abstract)
+-   [Problem](#problem)
+-   [Proposal](#proposal)
+-   [Alternatives considered](#alternatives-considered)
+    -   [Keep the continuation interface as a parameter](#keep-the-continuation-interface-as-a-parameter)
+    -   [Require the continuation to execute the whole `case` body](#require-the-continuation-to-execute-the-whole-case-body)
+
+<!-- tocstop -->
+
+## Abstract
+
+This proposal updates the design of [#157](p0157.md) to reflect subsequent
+evolution of the language.
+
+## Problem
+
+Some aspects of the design of #157 have become inconsistent with the rest of the
+language, or can be made more precise in light of subsequent language
+development.
+
+## Proposal
+
+-   Make the continuation interface an associated type rather than an interface
+    parameter.
+-   For patterns that can be matched using either `Match` or `==`, require both
+    implementations to be well-formed.
+-   Use `name: Type` order instead of `Type: name` order in alternative
+    declarations.
+-   Use current syntax and semantics for generics and class types.
+-   Rename `Matchable.Match` to `Match.Op`, following the resolution of
+    [#1058](https://github.com/carbon-language/carbon-lang/issues/1058).
+
+## Alternatives considered
+
+### Keep the continuation interface as a parameter
+
+Making the continuation interface a parameter of `Match` could in principle
+allow a single type to support pattern matching in multiple ways, by
+implementing `Match` for multiple continuation interfaces. However, that would
+require something like overload resolution on interfaces, to choose the
+implementation of `Match(C)` on the sum type for which `C` best matches the
+continuation constructed by the compiler. No such overloading mechanism is
+planned, and we don't have sufficiently compelling use cases to motivate it.
+
+### Require the continuation to execute the whole `case` body
+
+We will probably want to support in-place mutation of alternative parameters
+(for example so you can call a mutable method on the value stored in an
+`Optional(Foo)`), and we might even want to extend that to cases where the
+underlying parameter isn't represented as an lvalue of that type, but has to be
+unpacked by `Match.Op`. The only way I see to make that work is to have
+`Match.Op` unpack the parameter to a local lvalue, pass it to the continuation,
+and then pack the possibly-mutated value back into the sum object after the
+continuation returns. That would mean the compiler has to execute the case body
+inside the continuation, not after `Match.Op` returns.
+
+However, that's fairly speculative, and wouldn't apply to the read-only cases
+that we currently support, so we need not preemptively constrain the compiler
+here.
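
The freedom left to the compiler here can be illustrated with a small C++ analogue, again outside the commit and under stated assumptions. In this sketch the continuation does not run any `case` body. It only records the bound parameter and returns an index, and the body runs after the `Match.Op`-style call returns. `OptionalInt`, `MatchOp`, `IndexContinuation`, and `Describe` are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical minimal optional-of-int with a `Match.Op`-style entry point.
struct OptionalInt {
  bool has_value;
  int value;

  // Calls exactly one continuation method and returns its result.
  template <typename C>
  auto MatchOp(C& continuation) const {
    return has_value ? continuation.Some(value) : continuation.None();
  }
};

// Continuation that defers the case bodies: it stores the parameter value
// and returns the index of the `case` that should be executed.
struct IndexContinuation {
  int the_value = 0;

  int Some(int value) {
    the_value = value;
    return 0;
  }
  int None() { return 1; }
};

// Hypothetical helper standing in for the generated `match` code.
inline std::string Describe(const OptionalInt& opt) {
  IndexContinuation continuation;
  const int index = opt.MatchOp(continuation);
  assert(index == 0 || index == 1);
  // The case bodies execute here, after MatchOp has already returned.
  switch (index) {
    case 0:
      return "Some(" + std::to_string(continuation.the_value) + ")";
    default:
      return "None";
  }
}
```

This shape works for the read-only cases because the continuation only needs a copy of the parameter. The in-place mutation scenario described above would not fit it, since the write-back would have to happen before `MatchOp` returns.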