This proposal re-affirms (with additional rationale) that a var pattern
declares a durable complete object, and refines the terminology for binding
patterns in a var pattern to be more explicit about the intended semantics. It
also makes several other changes and clarifications to the semantics of pattern
matching on objects:
match statement are
materialized before matching the cases.Finally, as a drive-by fix, it clarifies what parts of the match design are
still placeholders.
Discussions arising from the implementation of var patterns have surfaced some
problems with how pattern matching deals with objects:
var object,
destructively moving them will lead to double-destruction.if-guards much less useful.match statement's scrutinee is an
initializing expression, since the result of an initializing expression
can't be reused.The first point deserves some elaboration. We don't yet have a concrete proposal
for move semantics, but it's possible to discern the overall direction well
enough to see a problem with how it interacts with var. The following sketch
of move semantics should be considered a placeholder, not an approved
design.
~x is a move from the reference expression x. It is an initializing
expression that initializes an object with the value that x held prior to the
move, while arbitrarily mutating x to make the move more efficient. By default
~x is destructive, meaning that it ends the lifetime of x; under some
conditions it may instead leave x in an
unformed state, or make a copy of x,
but only if the type supports doing so.
Consider the following code, where X and Y are types that are movable but
not copyable, have nontrivial destructors, and do not have unformed states:
fn A() -> (X, Y);
fn B(var x: X);
fn F() {
var (x: X, y: Y) = A();
B(~x);
}
Under the current design of pattern matching, the first line of F declares a
complete object of type (X, Y), and binds x and y to its elements. At the
end of the body of F, that tuple object is still live, so its destructor will
run, which will recursively run the destructors for its elements. However, its
first element was already destroyed by ~x, so this would result in
double-destruction of that sub-object.
In order to avoid that problem, the expression ~x must be ill-formed. More
generally, if a var pattern contains a tuple or struct subpattern, the
bindings it declares cannot be moved from (unless, possibly, their types permit
them to be non-destructively moved, and/or safely destroyed twice). In order to
be well-formed, the first line of F must be rewritten as:
let (var x: X, var y: Y) = A();
This makes the code more verbose, and may be surprising to users.
The decision that var declares a single complete object is reaffirmed,
notwithstanding the problem described above. To be more explicit about the
intended meaning, we will adjust the terminology:
var pattern.var pattern. Thus, every variable binding
pattern is a reference binding pattern, but not vice-versa.To address the other problems:
match statement are
materialized before matching the cases.See leads issue #5250 for further discussion of this problem.
var as variable bindingsWe could treat all bindings under a var pattern as variable bindings, instead
of limiting that to the case of a single binding pattern immediately under the
var pattern. This would mean that all such bindings declare complete objects,
and hence are movable. By the same token, it would mean that a var pattern
does not declare a complete object.
However, we are likely to eventually need a way for a pattern to declare a
complete object and bind names to its parts, for example to support Rust-style
@-bindings
or to support in-place destructuring of user-defined types (where destroying the
object in order to make its subojects movable could have unwanted side effects).
This approach would make it difficult to do that with good ergonomics, because
both var and the hypothetical complete-object pattern syntax would change the
meanings of nested bindings, and the behavior of destructuring, in conflicting
ways.
This approach also expresses the programmer's intent less clearly, which could
harm readability, because a var with several nested bindings could be intended
to make those bindings movable, or just to make them mutable.
That also makes this option less future-proof. If we start with this approach and later migrate to the status quo because we need the extra expressive power, we would need to somehow infer the missing intent information. Conversely, if we start with the status quo and later conclude we aren't going to make its extra expressive power observable, existing valid code will remain valid, with the same behavior, after switching to this approach.
The only major advantage of this approach over the status quo is that it's less verbose when declaring multiple movable bindings, and may be less surprising to users because it's less restrictive by default. However, those advantages aren't significant enough to offset those costs.
var a binding pattern modifierThe current design makes binding patterns fairly context-sensitive, which we
generally try to avoid: a
binding pattern declares a variable binding pattern if it's the immediate child
of a var pattern, a non-variable reference binding pattern if it's an indirect
descendant of a var pattern, and a value binding pattern otherwise.
We could avoid context-sensitivity by making var a modifier on binding
patterns, like template. That would mean there are no reference binding
patterns, and the distinction between variable and value binding patterns is
always purely local.
However, allowing var to apply to many bindings at once improves the
ergonomics of destructuring into variables, which is likely to be a common use
case. Furthermore, the offsetting costs of allowing that are fairly minimal:
Prior to this proposal, the status quo was that the storage associated with a
pattern is not initialized until we know that the complete pattern matches. This
helps avoid situations where a case that does not match nevertheless has
visible side effects. However, this approach has several major drawbacks:
var x: X = F(); do not require any temporary storage,
because it would imply that F() is not even evaluated until we know the
pattern matches. That's feasible for irrefutable patterns like this one, but
not for refutable patterns. Even if we were willing to limit that guarantee
to contexts that require irrefutable patterns, it would complicate the
implementation, because the underlying logic would have major structural
differences in the two cases.if-guard cannot use variable
bindings from the pattern it guards. Practically all the motivating use
cases for if-guards involve using bindings from the guarded pattern, so
this is tantamount to making var and if mutually exclusive.casesConsider the following code:
fn F() -> X;
fn G() -> i32;
match ((F(), G())) {
case (var x: X, 0) => { ... }
case (var x: X, 1) => { ... }
case (var x: X, 2) => { ... }
case (var x: X, 3) => { ... }
...
}
Under this proposal, the result of F() is materialized in temporary storage,
and then copied into the storage for x as part of matching each case (it
can't be moved, because it must remain available for the next case). As a
result, this code may make as many copies of X as there are cases, and
doesn't even compile if X isn't copyable.
We could instead treat those var x: X declarations as aliases for the
materialized temporary. However, in order to generalize that approach we would
need to answer questions like:
case (var x: X, 0) and case var (x: X, 1)?cases are separated by another case
that doesn't have a binding in that position?Furthermore, the behavior of these rules would need to be intuitive and unsurprising, because any change that prevents aliasing (in even one case) could break the build or cause surprising performance changes.
Even in the best case, this approach is inherently limited. If the bindings have different types, at most one of them can be an alias for the scrutinee; the other must be initialized by a copying conversion.
This approach would open up the possibility of a non-taken case mutating the
state of the scrutinee, by invoking a mutating operation on a variable binding
in an expression pattern or an if-guard. We may be able to statically forbid
such mutations, but that will be easier to assess once we understand the role of
mutability in our broader safety story.
We conjecture that in most if not all cases where aliasing would be desirable, it will be straightforward to rewrite the code to avoid the problem. For example, the example above can be rewritten as:
fn F() -> X;
fn G() -> i32;
var x: X = F();
match (G()) {
case 0 => { ... }
case 1 => { ... }
case 2 => { ... }
case 3 => { ... }
...
}
We can revisit this alternative if that conjecture turns out to be incorrect in practice.