We would like to address the use cases for inheritance described in proposal #561: Basic classes: use cases, struct literals, struct types, and future work, including providing a migration path for C++ types and programmers currently using inheritance.
This is a follow-up to these previous proposals defining classes:
The proposal is to update docs/design/classes.md as described in this PR.
This particular proposal is focusing on these Carbon goals:
This is a divergence from C++, but has a number of benefits:
- A class X may assume that any value of type X* should be treated as a pointer to exactly that type.
- Writing base class puts information about the class important to readers up front, rather than leaving them wondering whether the class supports inheritance or just accidentally used the default.

In both of these cases, we decided it was better to have only one way to write the code than to allow a keyword to be written in a situation where it only acted as a comment without changing the meaning of the code.
final class

We considered allowing final class as a synonym for class without a base
prefix, but we didn't feel it would provide benefit justifying the additional
complexity.
partial FinalClass

We considered allowing partial to be used for all constructor functions. For a
final class, partial FinalClass would be an alias for FinalClass. We decided
not to allow this.
Instead of virtual we considered base. This would create a parallel
structure between abstract and base classes on one hand, and abstract and
base methods on the other. However, we felt this was an important case in
which to maintain continuity with C++.
Instead of abstract we considered:
- virtual ... = 0
- required
- pure virtual
- virtual ... pure

We didn't like using a suffix like = 0 or = pure, since it is in place of an
implementation but we wouldn't put it out of line like an implementation. We
didn't like = 0 despite it being consistent with C++ because it didn't reflect
the meaning in the way a keyword could, and keywords are easier to look up in
search engines. We might reconsider required if we decide that we want to use
that keyword in other places, such as in a mixin. In the end, we went with
abstract since it is used in other languages, such as Java, and could stand on
its own without having to be paired with virtual.
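
For reference, the C++ spelling that the = 0 alternative would have mirrored can be sketched as follows. This is plain C++, not Carbon, and the names Shape, Square, AreaOf, and Example are hypothetical, chosen only for illustration.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical base class: "= 0" marks Area as pure virtual, which also
// makes Shape abstract in C++.
struct Shape {
  virtual ~Shape() = default;
  virtual int Area() const = 0;
};

// Hypothetical derived class that supplies the implementation.
struct Square : Shape {
  explicit Square(int s) : side(s) {}
  int Area() const override { return side * side; }
  int side;
};

// Calls the pure virtual method through the base type.
int AreaOf(const Shape& shape) { return shape.Area(); }

// The "= 0" suffix is the only thing that makes Shape abstract.
static_assert(std::is_abstract<Shape>::value, "Shape is abstract");
static_assert(!std::is_abstract<Square>::value, "Square is concrete");

// Small self-check exercising the sketch.
void Example() {
  Square square(3);
  assert(AreaOf(square) == 9);
}
```

Note how the = 0 suffix sits where a function body would otherwise go, which is the placement concern described above.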
Instead of impl we considered using override as done in C++, with the
difference that the keyword would be mandatory in Carbon. There were a few
concerns with using override:
The choice of impl is intended to draw a parallel with implementing
interfaces.
If we went with override, we might change the other keywords to match, using
must_override instead of abstract and overridable instead of virtual. We
might consider switching to overridable if we decide that is a keyword we
would use in other contexts that allow overriding without using runtime dispatch
using a virtual table, for example interfaces or mixins.
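
To illustrate why a mandatory keyword matters, here is a minimal C++ sketch of what can happen when override is optional. It shows C++ behavior only, not Carbon, and the class and function names are hypothetical.

```cpp
#include <cassert>

struct Base {
  virtual ~Base() = default;
  virtual int Value(int) const { return 1; }
};

// No `override` keyword: the parameter type differs, so this declares a new
// function that hides Base::Value instead of overriding it. C++ accepts this
// without an error.
struct Accidental : Base {
  int Value(long) const { return 2; }
};

// With `override`, the mismatched signature above would be a compile error.
// Here the signature matches, so this compiles and overrides Base::Value.
struct Intended : Base {
  int Value(int) const override { return 3; }
};

// Dispatches through the base type.
int CallThroughBase(const Base& b) { return b.Value(0); }

// Small self-check exercising the sketch.
void Example() {
  assert(CallThroughBase(Accidental()) == 1);  // base version still runs
  assert(CallThroughBase(Intended()) == 3);    // derived version runs
}
```

Requiring a keyword on every overriding method, whichever spelling is chosen, turns the first case into a diagnosable error.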
We considered putting the virtual override keyword after the function's signature:
base class MyBaseClass {
  fn Overridable[me: Self]() -> i32 virtual { return 7; }
}
Rationale for putting the keyword to the right:
Unless you are extending the class, callers would not notice replacing a virtual function with a non-virtual function calling a private virtual function.
The concern was that while this choice makes the API easier to read for users calling methods on the base class, it makes it significantly harder to read for users extending the base class. Extending the base class is a common and important enough use case that this change was not worth giving up familiarity with C++.
Reference: This was decided in issue #754: Placement of the word virtual.
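
The observation that callers would not notice a virtual function being replaced by a non-virtual function calling a private virtual function can be sketched in C++ as follows. This is only an illustration of that pattern; Counter, ByTwo, and Advance are hypothetical names.

```cpp
#include <cassert>

class Counter {
 public:
  virtual ~Counter() = default;
  // Public, non-virtual entry point: this is all that callers see.
  int Next() { return DoNext(); }

 private:
  // Private virtual hook: only classes extending Counter care about it.
  virtual int DoNext() { return ++count_; }
  int count_ = 0;
};

class ByTwo : public Counter {
 private:
  // A derived class may override a private virtual function.
  int DoNext() override {
    step_ += 2;
    return step_;
  }
  int step_ = 0;
};

// Caller code is the same whether or not Next itself is virtual.
int Advance(Counter& c) { return c.Next(); }

// Small self-check exercising the sketch.
void Example() {
  ByTwo by_two;
  assert(Advance(by_two) == 2);
  assert(Advance(by_two) == 4);
}
```

Only the author of ByTwo needs to know which functions are virtual, which is why the keyword's visibility matters most to readers extending the class.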
We considered allowing final to be used as a virtual override keyword, to mark
non-overridable methods.
This is something we might change in the future, based on demonstrated need, but
for now we didn't see the use cases for it occurring in practice that would
justify its addition to the language. This was based on a few reasons.
Note that if we were to add final methods, they would be allowed to be implemented in the partial facet type of a base class.
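
For readers unfamiliar with the feature being declined, this is roughly what a final method looks like in C++. It is a sketch of the C++ feature only and makes no claim about how Carbon would spell or implement it; the names are hypothetical.

```cpp
#include <cassert>

struct Base {
  virtual ~Base() = default;
  virtual int Id() const { return 0; }
};

struct Middle : Base {
  // `final` both overrides Base::Id and forbids further overrides.
  int Id() const final { return 1; }
};

struct Leaf : Middle {
  // int Id() const override { return 2; }  // would not compile: Id is final
};

// Dispatches through the base type.
int IdOf(const Base& b) { return b.Id(); }

// Small self-check exercising the sketch.
void Example() {
  Leaf leaf;
  assert(IdOf(leaf) == 1);  // Middle's implementation is the last word
}
```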
Perils of Constructors
gives a great overview of the challenges with constructors. It expresses the
advantages of the factory function approach used by Rust, but observes that
there are some difficulties making it work with inheritance and placement.
Proposal
#257: Initialization of memory and variables
addresses the placement component of construction, and this proposal extends
that approach to work with inheritance using the partial facet. This approach
has some benefits:
We considered several alternatives, particularly in issue #741: Constructing an object of a derived type.
partial

In issue #741, we
considered other keywords instead of partial to designate the facet of the
type for construction.
- base: Intended to indicate a base class subobject, but was confusing with other uses of the word "base" to mean "the base class of a type."
- as_base: Intended to address the confusion around using base by adding a preposition that indicates this isn't the "base of the type." However, it introduces confusion with the as operator used to cast.
- bare: Too far from the intended meaning.
- impl: This keyword is already being used for other things that are too different from this use.
- novirt: Describes something about the effect of this keyword, but not why you are using it.
- exact: Intended to suggest this is a use of the exact type, not a possibly derived type. This, like novirt, was too focused on the effect of the keyword and wasn't suggestive enough of why it was being used. It also didn't capture why this keyword would allow you to instantiate an abstract base class.
- ctor, construct, constructor: These were the wrong part of speech. The type is not the constructor, the function returning this type is.
- under_construction: Too long.

For the construction-related options, there were also concerns that we might also use this type during destruction of an object.
In issue #741, we
considered recommending using the partial facet in constructors of extensible
classes more strongly than the current proposal. Ultimately we decided it wasn't
necessary:
- partial are small enough, matching C++ instead of a possible improvement.
- partial was too subtle and hard to explain.
- partial type because you declared a variable with type auto seemed like a bad user experience.
- protected constructor returning a partial type for descendants and a public constructor returning the full type improved the ergonomics for using the class but seemed like painful boilerplate for authors of the class.

In issue #741, we considered making the constructor of a derived class explicitly set the fields of the base class without delegating to the base constructor, avoiding the problem of trying to instantiate a base class that might be abstract.
It had some clear disadvantages including:
In issue #741, we considered allowing instantiating abstract base classes so they could be used to initialize derived classes, but this was a safety regression from C++.
In issue #741, we considered splitting inheritance into separate mechanisms for subtyping and implementation reuse. Interfaces would represent APIs for subtyping purposes and implementations would be defined in mixins. This was a major divergence from C++ and would likely cause problems for both programmers and interoperation.
Swift initialization
requires the user to define a special init method that initializes the fields
of the object before calling any methods on it. For a
derived class,
this is then followed by a call to the base's init method. After that is done,
there is a
second phase of initialization
that can operate on an object that at least has all fields initialized. This
means method calls are allowed, even though it is possible that not all
invariants of the class have been established for the object.
class MyClass extends BaseClass {
  fn init(...) {
    me.derived_field = ...;
    super.init(base_arguments);
    phase_2_after_fields_are_set();
  }
}
This approach has some nice properties, for example it supports calling virtual
methods in the base class' init method and getting the derived implementation.
However, it has some disadvantages for our purposes:
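
For contrast with the virtual-dispatch property noted above, C++ behaves differently: a virtual call made from a base class constructor runs the base class's implementation, because the derived part of the object does not exist yet. A minimal C++ sketch, with hypothetical names:

```cpp
#include <cassert>
#include <string>

struct Base {
  // While Base's constructor runs, the object's dynamic type is Base, so
  // this call resolves to Base::Name even when constructing a Derived.
  Base() { name_seen_in_ctor = Name(); }
  virtual ~Base() = default;
  virtual std::string Name() const { return "Base"; }
  std::string name_seen_in_ctor;
};

struct Derived : Base {
  std::string Name() const override { return "Derived"; }
};

// Small self-check exercising the sketch.
void Example() {
  Derived d;
  assert(d.name_seen_in_ctor == "Base");  // seen during construction
  assert(d.Name() == "Derived");          // seen after construction
}
```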
C# Constructors
have names that match their class, and the constructor of a derived class starts
with a call to the base class' constructor using this : base(...) syntax
between the parameter list and function body:
class MyClass extends BaseClass {
  fn MyClass(...) : base(base_arguments) {
    me.derived_field = ...;
    phase_2_after_fields_are_set();
  }
}
Alternatively, a constructor can delegate to another constructor using
: this(...) syntax instead.
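
C++ has a close analog of both forms in its member initializer list, which may help readers coming from C++ place the C# approach. The sketch below is plain C++ with hypothetical field and parameter names; it is not a proposal for Carbon syntax.

```cpp
#include <cassert>

struct BaseClass {
  explicit BaseClass(int v) : base_value(v) {}
  int base_value;
};

struct MyClass : BaseClass {
  // Analog of `: base(...)`: the base constructor runs first, then the
  // derived field is initialized, then the (empty) body runs.
  MyClass(int base_arg, int derived_arg)
      : BaseClass(base_arg), derived_field(derived_arg) {}

  // Analog of `: this(...)`: delegate to the constructor above.
  MyClass() : MyClass(1, 2) {}

  int derived_field;
};

// Small self-check exercising the sketch.
void Example() {
  MyClass m;
  assert(m.base_value == 1);
  assert(m.derived_field == 2);
}
```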
Disadvantages for our purposes:
We rejected prior Carbon proposal #98, where the user's initialization function called a compiler-provided function to create the object once the base constructor arguments and derived field values were known.
class MyClass extends BaseClass {
  fn operator create(...) -> Self {
    ...
    returned var result: Self =
        construct(base_arguments, {.derived_field = ...});
    phase_2_after_fields_are_set();
    return result;
  }
}
This avoids giving a name to the object being constructed until its fields have been initialized, without relying on static analysis. That makes it clearer which destructors should run in the case of failure, though the current proposal is still clearer.
Disadvantages for our purposes:
Note that this prevents using the values assigned to the fields in the base's
constructor to determine the initial values for the derived fields. We could
address this concern by splitting the special construct function into two
pieces:
class MyClass extends BaseClass {
  fn operator create(...) -> Self* {
    ...
    var parent: BaseClass* = create_base(base_arguments);
    // Can determine the values for derived fields here.
    var result: Self* = construct(parent, {.derived_field = ...});
    phase_2_after_fields_are_set();
    return result;
  }
}
This adds some complexity, but interoperates better with C++.
We considered following C++'s approach of making classes abstract based on having any pure virtual methods. This leads to awkward workarounds where you might mark a class' destructor as pure virtual even though it is still implemented. We decided to use a different introducer for abstract classes since this is very important for readers, helping them determine the role of the class and whether this is the class they are looking for.
We thought that if you were to change a class to being abstract, you would likely also update its description and rename it at the same time, since that was such an important change to the interface of the class.
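
The awkward C++ workaround mentioned above, a destructor marked pure virtual even though it is still implemented, looks roughly like this. It is a sketch of the C++ idiom with hypothetical names, shown only to motivate the separate introducer.

```cpp
#include <cassert>
#include <type_traits>

struct AbstractBase {
  // Pure virtual only to make the class abstract; there is no other
  // method that needs to be pure.
  virtual ~AbstractBase() = 0;
  int Value() const { return 42; }
};

// The destructor still needs a definition, because every derived class's
// destructor calls it.
inline AbstractBase::~AbstractBase() {}

// The implicitly declared destructor overrides the pure virtual one, so
// Concrete can be instantiated.
struct Concrete : AbstractBase {};

static_assert(std::is_abstract<AbstractBase>::value, "base is abstract");
static_assert(!std::is_abstract<Concrete>::value, "derived is concrete");

// Small self-check exercising the sketch.
void Example() {
  Concrete c;
  assert(c.Value() == 42);
}
```

A reader has to find the = 0 on the destructor to learn that the class is abstract, which is the discoverability problem a dedicated introducer avoids.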
We considered forbidding constructing extensible objects with non-virtual destructors. This was to avoid getting into a state where a type could be used as a local variable but not allocated on the heap. It was also identified as an advanced use case that didn't need to be as convenient to write, and so the overhead of using both a final and an abstract type in place of an extensible type might be more acceptable and would give much more clarity to what a given type represented.
However, this was a noticeable divergence from C++ where extensible objects are
the default. We decided that consistency with both C++ and extensible classes
with virtual methods was more valuable. The error when deleting a base class
with a non-virtual destructor would be very clear and offer useful alternatives:
making the destructor virtual, making a final class, or using unsafe_delete.
This matched the idea that
Carbon should "focus on encouraging appropriate usage of features rather than restricting misuse".
This topic was discussed in issue #652: Extensible classes with or without vtables and on Discord.
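
Two of the alternatives named above, making the destructor virtual and making the class final, have direct C++ counterparts, sketched below with hypothetical names. There is no C++ counterpart to unsafe_delete in this sketch, and nothing here describes how Carbon reports the error.

```cpp
#include <cassert>
#include <memory>

int derived_destroyed = 0;  // counts Derived destructor calls

struct Base {
  // Virtual destructor: deleting a Derived through a Base* is well defined.
  virtual ~Base() = default;
};

struct Derived : Base {
  ~Derived() override { ++derived_destroyed; }
};

// A final class cannot have derived types, so a non-virtual destructor is
// safe: a Leaf* always points to exactly a Leaf.
struct Leaf final {
  ~Leaf() = default;
};

// Owns a Derived through a pointer to Base; when p goes out of scope its
// deleter calls delete on a Base*, which runs ~Derived via the vtable.
void DestroyThroughBase() {
  std::unique_ptr<Base> p(new Derived);
}
```

Without the virtual keyword on ~Base, the same deletion would be undefined behavior in C++, which is the situation the Carbon error is meant to catch.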
Issue #652 considered many variations on ways to have two different types for values depending on whether they represented a value with an exact type, or a value that could be a derived type. We ultimately decided that asking users to use both types would be too much cognitive overhead, and would be a usability regression from C++.
Issue #652 considered instead having two kinds of pointers. One would point to a value of a specific known type, and the other would point to a value of some derived type. This has two disadvantages compared to having the variations be on the types of the values.