This documentation describes the design of the Carbon language, and the rationale for that design. This documentation is an overview of the Carbon project in its current state, written for the builders of Carbon and for those interested in learning more about Carbon.
This document is not a complete programming manual, nor does it provide detailed and comprehensive justification for design decisions. Those details are found in the linked dedicated designs.
This document includes much that is provisional or placeholder: parts of the syntax, language rules, standard library, and other aspects of the design have not yet been decided through the Carbon process. This preliminary material fills in gaps until those aspects of the design can be settled.
Here is a simple function showing some Carbon code:
import Console;

// Prints the Fibonacci numbers less than `limit`.
fn Fibonacci(limit: i64) {
  var (a: i64, b: i64) = (0, 1);
  while (a < limit) {
    Console.Print(a, " ");
    let next: i64 = a + b;
    a = b;
    b = next;
  }
  Console.Print("\n");
}
Carbon is a language that should feel familiar to C++ and C developers. This example has familiar constructs like imports, function definitions, typed arguments, and curly braces.
A few other features that are unlike C or C++ may stand out. First, declarations
start with introducer keywords: fn introduces a function declaration, and
var introduces a variable declaration. You can also see a tuple, a composite
type written as a comma-separated list inside parentheses. Unlike tuples in, say,
Python, Carbon tuples are statically typed.
All source code is UTF-8 encoded text. Comments, identifiers, and strings are allowed to have non-ASCII characters.
var résultat: String = "Succès";
Comments start with two slashes // and go to the end of the line. They are
required to be the only non-whitespace on the line.
// Compute an approximation of π
References:
- Source files
- lexical conventions
- Proposal #142: Unicode source files
- Proposal #198: Comments
The behavior of the Carbon compiler depends on the build mode:
References: Safety strategy
Expressions compute values in Carbon, and these values are always strongly typed much like in C++. However, an important difference from C++ is that types are themselves modeled as values; specifically, compile-time constant values. This means that the grammar for writing a type is the expression grammar. Expressions written where a type is expected must be able to be evaluated at compile-time and must evaluate to a type value.
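Primitive types fall into the following categories:
- the boolean type bool,
- signed and unsigned integer types,
- IEEE 754 floating-point types, and
- string types.
These are made available through the prelude.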
References: Primitive types
bool
The type bool is a boolean type with two possible values: true and false.
Comparison expressions produce bool values. The condition
arguments in control-flow statements, like if
and while, and
if-then-else conditional expressions take bool values.
The signed-integer type with bit width N may be written Carbon.Int(N). For
convenience and brevity, the common power-of-two sizes may be written with an
i followed by the size: i8, i16, i32, i64, i128, or i256.
Signed-integer
overflow is a
programming error:
The unsigned-integer types are: u8, u16, u32, u64, u128, u256, and
Carbon.UInt(N). Unsigned integer types wrap around on overflow; we strongly
advise that they not be used except when those semantics are desired. These
types are intended for bit manipulation or modular arithmetic, as often found in
hashing, cryptography, and PRNG use cases.
Values which can never be negative, like sizes, but for which wrapping does not
make sense
should use signed integer types.
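The wrap-around use case can be sketched in C++17, whose unsigned types also wrap modulo 2^N. This is a C++ illustration rather than Carbon code; the names Fnv1a, WrapIncrement, and Example are chosen here for illustration, and the constants are those of the public 32-bit FNV-1a hash.

```cpp
#include <cassert>
#include <cstdint>
#include <string_view>

// 32-bit FNV-1a hash. The multiplication is expected to overflow; because
// std::uint32_t is unsigned, the result wraps modulo 2^32 by definition.
constexpr std::uint32_t Fnv1a(std::string_view text) {
  std::uint32_t hash = 2166136261u;  // FNV offset basis
  for (char c : text) {
    hash ^= static_cast<std::uint8_t>(c);
    hash *= 16777619u;  // FNV prime; wraps on overflow
  }
  return hash;
}

// Wrap-around in isolation: the largest 32-bit value plus one is zero.
constexpr std::uint32_t WrapIncrement(std::uint32_t x) {
  return static_cast<std::uint32_t>(x + 1u);
}

// Small usage example.
void Example() {
  assert(WrapIncrement(0xFFFFFFFFu) == 0u);
  assert(Fnv1a("") == 2166136261u);   // No input: the offset basis.
  assert(Fnv1a("a") == 0xE40C292Cu);  // Published FNV-1a test vector.
}
```

Here the wrapping is the point of the algorithm, which is the kind of code the unsigned types are intended for. A size or a count would not want this behavior.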
References:
- Question-for-leads issue #543: pick names for fixed-size integer types
- Proposal #820: Implicit conversions
- Proposal #1083: Arithmetic expressions
Integers may be written in decimal, hexadecimal, or binary:
- 12345 (decimal)
- 0x1FE (hexadecimal)
- 0b1010 (binary)
Underscores _ may be used as digit separators, but for decimal and hexadecimal
literals, they can only appear in conventional locations. Numeric literals are
case-sensitive: 0x, 0b must be lowercase, whereas hexadecimal digits must be
uppercase. Integer literals never contain a period (.).
Unlike in C++, literals do not have a suffix to indicate their type. Instead, numeric literals have a type derived from their value, and can be implicitly converted to any type that can represent that value.
References:
- Integer literals
- Proposal #143: Numeric literals
- Proposal #144: Numeric literal semantics
- Proposal #820: Implicit conversions
Floating-point types in Carbon have IEEE 754 semantics, use the round-to-nearest
rounding mode, and do not set any floating-point exception state. They are named
with an f and the number of bits: f16, f32, f64, and f128.
BFloat16 is also provided.
References:
- Question-for-leads issue #543: pick names for fixed-size integer types
- Proposal #820: Implicit conversions
- Proposal #1083: Arithmetic expressions
Floating-point types along with user-defined types may be initialized from real-number literals. Decimal and hexadecimal real-number literals are supported:
- 123.456 (digits on both sides of the .)
- 123.456e789 (optional + or - after the e)
- 0x1.Ap123 (optional + or - after the p)
Real-number literals always have a period (.) and a digit on each side of the
period. When a real-number literal is interpreted as a value of a floating-point
type, its value is the representable real number closest to the value of the
literal. In the case of a tie, the nearest value whose mantissa is even is
selected.
References:
- Real-number literals
- Proposal #143: Numeric literals
- Proposal #144: Numeric literal semantics
- Proposal #820: Implicit conversions
- Proposal #866: Allow ties in floating literals
There are two string types:
- String - a byte sequence treated as containing UTF-8 encoded text.
- StringView - a read-only reference to a byte sequence treated as
  containing UTF-8 encoded text.
String literals may be written on a single line using a double quotation mark
(") at the beginning and end of the string, as in "example".
Multi-line string literals, called block string literals, begin and end with
three double quotation marks ("""), and may have a file type indicator after
the first """.
// Block string literal:
var block: String = """
The winds grow high; so do your stomachs, lords.
How irksome is this music to my heart!
When such strings jar, what hope of harmony?
I pray, my lords, let me compound this strife.
-- History of Henry VI, Part II, Act II, Scene 1, W. Shakespeare
""";
The indentation of a block string literal's terminating line is removed from all preceding lines.
Strings may contain
escape sequences
introduced with a backslash (\).
Raw string literals
are available for representing strings with \s and "s.
References:
- String literals
- Proposal #199: String literals
A tuple is a fixed-size collection of values that can have different types, where each value is identified by its position in the tuple. An example use of tuples is to return multiple values from a function:
fn DoubleBoth(x: i32, y: i32) -> (i32, i32) {
return (2 * x, 2 * y);
}
Breaking this example apart:
- The return type is a tuple of two i32 types.
- The return expression is a tuple of two i32 values.
Both of these are expressions using the tuple syntax
(<expression>, <expression>). The only difference is the type of the tuple
expression: one is a tuple of types, the other a tuple of values. In other
words, a tuple type is a tuple of types.
The components of a tuple are accessed positionally, so element access uses subscript syntax, but the index must be a compile-time constant:
fn DoubleTuple(x: (i32, i32)) -> (i32, i32) {
return (2 * x[0], 2 * x[1]);
}
Tuple types are structural.
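For readers coming from C++, a rough analogue of the two functions above can be written with std::tuple. This is a C++ sketch, not Carbon: i32 is approximated by int, the Example function is added here for illustration, and a C++ tuple type is a template instantiation rather than a tuple of types.

```cpp
#include <cassert>
#include <tuple>

// Returns both inputs doubled, packed into one tuple value.
std::tuple<int, int> DoubleBoth(int x, int y) {
  return std::make_tuple(2 * x, 2 * y);
}

// Element access is positional. The index is a template argument, so it
// must be a compile-time constant, much like the Carbon subscript.
std::tuple<int, int> DoubleTuple(const std::tuple<int, int>& x) {
  return std::make_tuple(2 * std::get<0>(x), 2 * std::get<1>(x));
}

// Small usage example.
void Example() {
  assert(DoubleBoth(1, 2) == std::make_tuple(2, 4));
  assert(std::get<1>(DoubleTuple(std::make_tuple(3, 5))) == 10);
}
```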
References: Tuples
Carbon also has structural types whose members are identified by name instead of position. These are called structural data classes, also known as struct types or structs.
Both struct types and values are written inside curly braces ({...}). In
both cases, they have a comma-separated list of members that start with a period
(.) followed by the field name.
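- In a struct type, the field name is followed by a colon (:) and the type,
  as in: {.name: String, .count: i32}.
- In a struct value, the field name is followed by an equal sign (=) and the
  value, as in {.key = "Joe", .count = 3}.
References: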
The type of pointers-to-values-of-type-T is written T*. Carbon pointers do
not support pointer arithmetic; the only pointer operations are:
- Dereference: given a pointer p, *p gives the value p points to as an
  l-value. p->m is syntactic sugar for (*p).m.
- Address-of: given an l-value x, &x returns a pointer to x.
There are no null pointers in
Carbon. To represent a pointer that may not refer to a valid object, use the
type Optional(T*).
Pointers are the main Carbon mechanism for allowing a function to modify a variable of the caller.
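The same pattern of passing an address so that the callee can modify the caller's variable looks like this in C++. This is a C++ sketch for comparison only: unlike Carbon pointers, C++ pointers may be null and support arithmetic. The names AddOne, Counter, Reset, and Example are hypothetical.

```cpp
#include <cassert>

// Writes through the pointer, so the caller's variable changes.
void AddOne(int* p) { *p += 1; }

struct Counter {
  int count;
};

// `c->count` is shorthand for `(*c).count`.
void Reset(Counter* c) { c->count = 0; }

// Small usage example.
void Example() {
  int n = 41;
  AddOne(&n);  // `&n` takes the address of the caller's variable.
  assert(n == 42);

  Counter counter{7};
  Reset(&counter);
  assert(counter.count == 0);
}
```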
References:
- Question-for-leads issue #520: should we use whitespace-sensitive operator fixity?
- Question-for-leads issue #523: what syntax should we use for pointer types?
The type of an array holding 4 i32 values is written [i32; 4]. There is
an implicit conversion from tuples to
arrays of the same length as long as every component of the tuple may be
implicitly converted to the destination element type. In cases where the size of
the array may be deduced, it may be omitted, as in:
var i: i32 = 1;
// `[i32;]` equivalent to `[i32; 3]` here.
var a: [i32;] = (i, i, i);
Elements of an array may be accessed using square brackets ([...]), as in
a[i]:
a[i] = 2;
Console.Print(a[0]);
TODO: Slices
Expressions describe some computed value. The simplest example would be a
literal number like 42: an expression that computes the integer value 42.
Some common expressions in Carbon include:
- Literals
- Names and member access
- Operators:
  - Arithmetic: -x, 1 + 2, 3 - 4, 2 * 5, 6 / 3, 5 % 3
  - Bitwise: 2 & 3, 2 | 4, 3 ^ 1, ^7
  - Bit shift: 1 << 3, 8 >> 1
  - Comparison: 2 == 2, 3 != 4, 5 < 6, 7 > 6, 8 <= 8, 8 >= 8
  - Conversion: 2 as i32
  - Logical: a and b, c or d, not e
  - Indexing: a[3]
  - Function call: f(4)
  - Pointer: *p, p->m, &x
- Conditionals: if c then t else f
- Parentheses: (7 + 8) * (3 - 1)
When an expression appears in a context in which an expression of a specific type is expected, implicit conversions are applied to convert the expression to the target type.
References:
- Expressions
- Proposal #162: Basic Syntax
- Proposal #555: Operator precedence
- Proposal #601: Operator tokens
- Proposal #680: And, or, not
- Proposal #702: Comparison operators
- Proposal #845: as expressions
- Proposal #911: Conditional expressions
- Proposal #1083: Arithmetic expressions
Declarations introduce a new name and say what that name represents. For some kinds of entities, like functions, there are two kinds of declarations: forward declarations and definitions. In this case, there should be exactly one definition for the name, but there can be additional forward declarations that introduce the name before it is defined. Forward declarations allow cyclic references, and can be used to declare a name in an api file that is defined in an impl file. A name that has been declared but not defined is called incomplete, and in some cases there are limitations on what can be done with an incomplete name.
A name is valid until the end of the innermost enclosing
scope. Except for
the outermost scope, scopes are enclosed in curly braces ({...}).
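A pattern says how to receive some data that is being matched against. There are two kinds of patterns: refutable patterns, which may fail to match depending on the runtime value being matched, and irrefutable patterns, which are guaranteed to match.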
Irrefutable patterns are used in function parameters,
variable var declarations, and
constant let declarations.
match statements can include both refutable patterns and irrefutable
patterns.
References:
- Pattern matching
- Proposal #162: Basic Syntax
The most common irrefutable pattern is a binding pattern, consisting of a new
name, a colon (:), and a type. It binds the matched value of that type to that
name. It can only match values that may be
implicitly converted to that type. An
underscore (_) may be used instead of the name to match a value but without
binding any name to it.
Binding patterns default to let bindings except inside a context where the
var keyword is used to make it a var binding:
- With a let binding, the name is bound to a non-l-value. This means the
  value cannot be modified, and its address cannot be taken.
- A var binding has dedicated storage, and so the name is an l-value
  which can be modified and has a stable address.
A generic binding uses :! instead of a
colon (:) and can only match compile-time values.
The keyword auto may be used in place of the type in a binding pattern, as
long as the type can be deduced from the type of a value in the same
declaration.
There are also irrefutable destructuring patterns, such as tuple destructuring. A tuple destructuring pattern looks like a tuple of patterns. It may only be used to match tuple values whose components match the component patterns of the tuple. An example use is:
// `Bar()` returns a tuple consisting of an
// `i32` value and 2-tuple of `f32` values.
fn Bar() -> (i32, (f32, f32));
fn Foo() -> i64 {
// Pattern in `var` declaration:
var (p: i64, _: auto) = Bar();
return p;
}
The pattern used in the var declaration destructures the tuple value returned
by Bar(). The first component pattern, p: i64, corresponds to the first
component of the value returned by Bar(), which has type i32. This is
allowed since there is an implicit conversion from i32 to i64. The result of
this conversion is assigned to the name p. The second component pattern,
_: auto, matches the second component of the value returned by Bar(), which
has type (f32, f32).
Additional kinds of patterns are allowed in match statements. These
may or may not match based on the runtime value of the match expression:
- An expression, such as 42, whose value must be equal to match.
See match for examples of refutable patterns.
References:
- Pattern matching
- Question-for-leads issue #1283: how should pattern matching and implicit conversion interact?
There are two kinds of name-binding declarations:
- constant declarations, introduced with let, and
- variable declarations, introduced with var.
There are no forward declarations of these; all name-binding declarations are definitions.
let declarations
A let declaration matches an irrefutable pattern to a value. In
this example, the name x is bound to the value 42 with type i64:
let x: i64 = 42;
Here x: i64 is the pattern, which is followed by an equal sign (=) and the
value to match, 42. The names from binding patterns are
introduced into the enclosing scope.
var declarations
A var declaration is similar, except with var bindings, so x here is an
l-value with
storage and an address, and so may be modified:
var x: i64 = 42;
x = 7;
Variables with a type that has an unformed state do not need to be initialized in the variable declaration, but do need to be assigned before they are used.
References:
- Variables
- Proposal #162: Basic Syntax
- Proposal #257: Initialization of memory and variables
- Proposal #339: Add var <type> <identifier> [ = <value> ]; syntax for variables
- Proposal #618: var ordering
auto
If auto is used as the type in a var or let declaration, the type is the
static type of the initializer expression, which is required.
var x: i64 = 2;
// The type of `y` is inferred to be `i64`.
let y: auto = x + 3;
// The type of `z` is inferred to be `bool`.
var z: auto = (y > 1);
References:
Functions are the core unit of behavior. For example, this is a forward declaration of a function that adds two 64-bit integers:
fn Add(a: i64, b: i64) -> i64;
Breaking this apart:
- fn is the keyword used to introduce a function.
- Its name is Add. This is the name added to the enclosing scope.
- The parameter list in parentheses ((...)) is a comma-separated list of
  irrefutable patterns.
- It returns an i64 result. Functions that return nothing omit the -> and
  return type.
You would call this function like Add(1, 2).
A function definition is a function declaration that has a body block instead of a semicolon:
fn Add(a: i64, b: i64) -> i64 {
return a + b;
}
The names of the parameters are in scope until the end of the definition or declaration.
The bindings in the parameter list default to
let bindings, and so the parameter names are treated as
r-values. If
the var keyword is added before the binding, then the arguments will be copied
to new storage, and so can be mutated in the function body. The copy ensures
that any mutations will not be visible to the caller.
References:
- Functions
- Proposal #162: Basic Syntax
- Proposal #438: Add statement syntax for function declarations
- Question-for-leads issue #476: Optional argument names (unused arguments)
auto return type
If auto is used in place of the return type, the return type of the function
is inferred from the function body. It is set to the common type of
the static types of the arguments to the return statements in the
function. This is not allowed in a forward declaration.
// Return type is inferred to be `bool`, the type of `a > 0`.
fn Positive(a: i64) -> auto {
return a > 0;
}
References:
A block is a sequence of statements. A block defines a
scope and, like other scopes, is
enclosed in curly braces ({...}). Each statement is terminated by a
semicolon or block. Expressions and
var and let are
valid statements.
Statements within a block are normally executed in the order they appear in the source code, except when modified by control-flow statements.
The body of a function is defined by a block, and some
control-flow statements have their own blocks of code. These
are nested within the enclosing scope. For example, here is a function
definition with a block of statements defining the body of the function, and a
nested block as part of a while statement:
fn Foo() {
Bar();
while (Baz()) {
Quux();
}
}
References:
- Blocks and statements
- Proposal #162: Basic Syntax
Assignment statements mutate the value of the l-value described on the left-hand side of the assignment.
- Assignment: x = y;. x is assigned the value of y.
- Increment and decrement: ++i;, --j;. i is set to i + 1, j is set
  to j - 1.
- Compound assignment: x += y;, x -= y;, x *= y;, x /= y;, x &= y;,
  x |= y;, x ^= y;, x <<= y;, x >>= y;. x @= y; is equivalent to
  x = x @ y; for each operator @.
Unlike C++, these assignments are statements, not expressions, and don't return a value.
Blocks of statements are generally executed sequentially. Control-flow statements give additional control over the flow of execution and which statements are executed.
Some control-flow statements include blocks. Those
blocks will always be within curly braces {...}.
// Curly braces { ... } are required.
if (condition) {
ExecutedWhenTrue();
} else {
ExecutedWhenFalse();
}
This is unlike C++, which allows control-flow constructs to omit curly braces around a single statement.
References:
- Control flow
- Proposal #162: Basic Syntax
- Proposal #623: Require braces
if and else
if and else provide conditional execution of statements. An if statement
consists of:
- An if introducer followed by a condition in parentheses. If the condition
  evaluates to true, the block following the condition is executed,
  otherwise it is skipped.
- Any number of else if clauses, whose conditions are
  evaluated if all prior conditions evaluate to false, with a block that is
  executed if its condition evaluates to true.
- An optional else clause, with a block that is executed if all
  conditions evaluate to false.
For example:
if (fruit.IsYellow()) {
Console.Print("Banana!");
} else if (fruit.IsOrange()) {
Console.Print("Orange!");
} else {
Console.Print("Vegetable!");
}
This code will:
- Print Banana! if fruit.IsYellow() is true.
- Print Orange! if fruit.IsYellow() is false and fruit.IsOrange() is
  true.
- Print Vegetable! if both of the above return false.
References:
- Control flow
- Proposal #285: if/else
References: Loops
while
while statements loop for as long as the passed expression returns true. For
example, this prints 0, 1, 2, then Done!:
var x: i32 = 0;
while (x < 3) {
Console.Print(x);
++x;
}
Console.Print("Done!");
References:
for
for statements support range-based looping, typically over containers. For
example, this prints all names in names:
for (var name: String in names) {
Console.Print(name);
}
This prints each String value in names.
References:
- for loops
- Proposal #353: Add C++-like for loops
break
The break statement immediately ends a while or for loop. Execution will
continue starting from the end of the loop's scope. For example, this processes
steps until a manual step is hit (if no manual step is hit, all steps are
processed):
for (var step: Step in steps) {
if (step.IsManual()) {
Console.Print("Reached manual step!");
break;
}
step.Process();
}
References:
- break
continue
The continue statement immediately goes to the next iteration of a while or
for loop. In a while, execution continues with the while expression. For
example, this prints all non-empty lines of a file, using continue to skip
empty lines:
var f: File = OpenFile(path);
// Carbon spells logical negation `not`, rather than `!`.
while (not f.EOF()) {
  var line: String = f.ReadLine();
  if (line.IsEmpty()) {
    continue;
  }
  Console.Print(line);
}
References:
- continue
return
The return statement ends the flow of execution within a function, returning
execution to the caller.
// Prints the integers 1 .. `n` and then
// returns to the caller.
fn PrintFirstN(n: i32) {
  var i: i32 = 0;
  while (true) {
    i += 1;
    if (i > n) {
      // None of the rest of the function is
      // executed after a `return`.
      return;
    }
    Console.Print(i);
  }
}
If the function returns a value to the caller, that value is provided by an expression in the return statement. For example:
fn Sign(i: i32) -> i32 {
if (i > 0) {
return 1;
}
if (i < 0) {
return -1;
}
return 0;
}
Assert(Sign(-3) == -1);
References:
- return
- return statements
- Proposal #415: return
- Proposal #538: return with no argument
returned var
To avoid a copy when returning a variable, add a returned prefix to the
variable's declaration and use return var instead of returning an expression,
as in:
fn MakeCircle(radius: i32) -> Circle {
returned var c: Circle;
c.radius = radius;
// `return c` would be invalid because `returned` is in use.
return var;
}
This takes the place of the "named return value optimization" of C++.
References:
match
match is a control-flow statement similar to switch in C/C++, and mirrors similar
constructs in other languages, such as Swift. The match keyword is followed by
an expression in parentheses, whose value is matched against the case
declarations, each of which contains a refutable pattern,
in order. The refutable pattern may optionally be followed by an if
expression, which may use the names from bindings in the pattern.
The code for the first matching case is executed. An optional default block
may be placed after the case declarations; it will be executed if none of the
case declarations match.
An example match is:
fn Bar() -> (i32, (f32, f32));

fn Foo() -> f32 {
  match (Bar()) {
    case (42, (x: f32, y: f32)) => {
      return x - y;
    }
    case (p: i32, (x: f32, _: f32)) if (p < 13) => {
      return p * x;
    }
    case (p: i32, _: auto) if (p > 3) => {
      return p * Pi;
    }
    default => {
      return Pi;
    }
  }
}
References:
- Pattern matching
- Question-for-leads issue #1283: how should pattern matching and implicit conversion interact?
TODO: Maybe rename to "nominal types"?
Nominal classes, or just classes, are a way for users to define their own data structures or record types.
This is an example of a class definition:
class Widget {
var x: i32;
var y: i32;
var payload: String;
}
Breaking this apart:
- This defines a class named Widget. Widget is the name added to the
  enclosing scope.
- The name Widget is followed by curly braces ({...}) containing the
  class body, making this a definition. A forward declaration would instead
  have a semicolon (;).
- Fields are defined using var declarations. Widget has
  two i32 fields (x and y), and one String field (payload).
The order of the field declarations determines the fields' memory-layout order.
Classes may have other kinds of members beyond fields declared in their scope:
- alias
- let to define class constants. TODO:
  Another syntax to define constants associated with the class like
  class let or static let?
- class, to define a member class or nested class
Within the scope of a class, the unqualified name Self can be used to refer to
the class itself.
Members of a class are accessed using the dot
(.) notation, so given an instance dial of type Widget, dial.payload
refers to its payload field.
Both structural data classes and nominal classes are considered class types, but they are commonly referred to as "structs" and "classes" respectively when that is not confusing. Like structs, classes refer to their members by name. Unlike structs, classes are nominal types.
References:
- Classes
- Proposal #722: Nominal classes and methods
- Proposal #989: Member access expressions
There is an implicit conversion defined between a struct literal and a class type with the same fields, in any scope that has access to all of the class' fields. This may be used to assign or initialize a variable with a class type, as in:
var sprocket: Widget = {.x = 3, .y = 4, .payload = "Sproing"};
sprocket = {.x = 2, .y = 1, .payload = "Bounce"};
References:
Classes may also contain class functions. These are functions that are accessed as members of the type, like static member functions in C++, as opposed to methods that are members of instances. They are commonly used to define a function that creates instances. Carbon does not have separate constructors like C++ does.
class Point {
// Class function that instantiates `Point`.
// `Self` in class scope means the class currently being defined.
fn Origin() -> Self {
return {.x = 0, .y = 0};
}
var x: i32;
var y: i32;
}
Note that if the definition of a function is provided inside the class scope, the body is treated as if it was defined immediately after the outermost class definition. This means that members such as the fields will be considered defined even if their definitions are later in the source than the class function.
The returned var feature can be used if the address of the
instance being created is needed in a factory function, as in:
class Registered {
fn Create() -> Self {
returned var result: Self = {...};
StoreMyPointerSomewhere(&result);
return var;
}
}
This approach can also be used for types that can't be copied or moved.
Class type definitions can include methods:
class Point {
  // Method defined inline
  fn Distance[me: Self](x2: i32, y2: i32) -> f32 {
    var dx: i32 = x2 - me.x;
    var dy: i32 = y2 - me.y;
    // Euclidean distance: the squared differences are added.
    return Math.Sqrt(dx * dx + dy * dy);
  }
  // Mutating method
  fn Offset[addr me: Self*](dx: i32, dy: i32);
  var x: i32;
  var y: i32;
}
// Out-of-line definition of method declared inline.
fn Point.Offset[addr me: Self*](dx: i32, dy: i32) {
  me->x += dx;
  me->y += dy;
}
var origin: Point = {.x = 0, .y = 0};
Assert(Math.Abs(origin.Distance(3, 4) - 5.0) < 0.001);
origin.Offset(3, 4);
Assert(origin.Distance(3, 4) == 0.0);
This defines a Point class type with two integer data members x and y and
two methods Distance and Offset:
- Methods are defined as class functions with a me parameter inside square
  brackets [...] before the regular explicit parameter list in parens
  (...).
- Methods are called using the member syntax, origin.Distance(...)
  and origin.Offset(...).
- Distance computes and returns the distance to another point, without
  modifying the Point. This is signified using [me: Self] in the method
  declaration.
- origin.Offset(...) does modify the value of origin. This is signified
  using [addr me: Self*] in the method declaration.
- Methods may be declared lexically inline like Distance, or lexically out
  of line like Offset.
References:
- Methods
- Proposal #722: Nominal classes and methods
Classes by default are
final,
which means they may not be extended. A class may be declared as allowing
extension using either the base class or abstract class introducer instead
of class. An abstract class is a base class that may not itself be
instantiated.
base class MyBaseClass { ... }
Either kind of base class may be extended to get a derived class. Derived
classes are final unless they are themselves declared base or abstract.
Classes may only extend a single class. Carbon only supports single inheritance,
and will use mixins instead of multiple inheritance.
base class MiddleDerived extends MyBaseClass { ... }
class FinalDerived extends MiddleDerived { ... }
// ❌ Forbidden: class Illegal extends FinalDerived { ... }
A base class may define virtual methods. These are methods whose implementation may be overridden in a derived class. By default, methods are non-virtual; the declaration of a virtual method must be prefixed by one of these three keywords:
- A method marked virtual has a definition in this class but not in any
  base.
- A method marked abstract does not have a definition in this class,
  but must have a definition in any non-abstract derived class.
- A method marked impl has a definition in this class, overriding any
  definition in a base class.
A pointer to a derived class may be cast to a pointer to one of its base
classes. Calling a virtual method through a pointer to a base class will use the
overridden definition provided in the derived class. Base classes with virtual
methods may use
run-time type information
in a match statement to dynamically test whether the dynamic type of a value is
some derived class, as in:
var base_ptr: MyBaseType* = ...;
match (base_ptr) {
case dyn p: MiddleDerived* => { ... }
}
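For comparison, the closest C++ counterparts are virtual member functions, override, and dynamic_cast. The sketch below is C++ rather than Carbon, and the class and function names (MyBase, NameOf, IsMiddleDerived, Example) are hypothetical; the mapping to Carbon's keywords in the comments is only approximate.

```cpp
#include <cassert>
#include <string>

class MyBase {
 public:
  virtual ~MyBase() = default;
  // Roughly Carbon's `virtual`: defined here, overridable in derived classes.
  virtual std::string Name() const { return "MyBase"; }
};

class MiddleDerived : public MyBase {
 public:
  // Roughly Carbon's `impl`: overrides the definition in the base class.
  std::string Name() const override { return "MiddleDerived"; }
};

// A call through a base pointer uses the override of the dynamic type.
std::string NameOf(const MyBase* base_ptr) { return base_ptr->Name(); }

// dynamic_cast is the C++ run-time type test; it yields nullptr on mismatch.
bool IsMiddleDerived(const MyBase* base_ptr) {
  return dynamic_cast<const MiddleDerived*>(base_ptr) != nullptr;
}

// Small usage example.
void Example() {
  MyBase base;
  MiddleDerived derived;
  assert(NameOf(&base) == "MyBase");
  assert(NameOf(&derived) == "MiddleDerived");
  assert(IsMiddleDerived(&derived));
  assert(!IsMiddleDerived(&base));
}
```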
For purposes of construction, a derived class acts like its first field is
called base with the type of its immediate base class.
class MyDerivedType extends MyBaseType {
fn Create() -> MyDerivedType {
return {.base = MyBaseType.Create(), .derived_field = 7};
}
var derived_field: i32;
}
Abstract classes can't be instantiated, so instead they should define class
functions returning partial Self. Those functions should be marked
protected so they may only be used by derived classes.
abstract class AbstractClass {
protected fn Create() -> partial Self {
return {.field_1 = 3, .field_2 = 9};
}
// ...
var field_1: i32;
var field_2: i32;
}
// ❌ Error: can't instantiate abstract class
var abc: AbstractClass = ...;
class DerivedFromAbstract extends AbstractClass {
fn Create() -> Self {
// AbstractClass.Create() returns a
// `partial AbstractClass` that can be used as
// the `.base` member when constructing a value
// of a derived class.
return {.base = AbstractClass.Create(),
.derived_field = 42 };
}
var derived_field: i32;
}
References:
- Inheritance
- Proposal #777: Inheritance
- Proposal #820: Implicit conversions
Class members are by default publicly accessible. The private keyword prefix
can be added to the member's declaration to restrict it to members of the class
or any friends. A private virtual or private abstract method may be
implemented in derived classes, even though it may not be called.
Friends may be declared using a friend declaration inside the class naming an
existing function or type. Unlike C++, friend declarations may only refer to
names resolvable by the compiler, and don't act like forward declarations.
protected is like private, but also gives access to derived classes.
References:
- Access control for class members
- Question-for-leads issue #665: private vs public syntax strategy, as well as other visibility tools like external/api/etc.
- Question-for-leads issue #971: Private interfaces in public API files
A destructor for a class is custom code executed when the lifetime of a value of
that type ends. Destructors are defined with the destructor keyword followed by
either [me: Self] or [addr me: Self*] (as is done with methods)
and the block of code in the class definition, as in:
class MyClass {
destructor [me: Self] { ... }
}
or:
class MyClass {
// Can modify `me` in the body.
destructor [addr me: Self*] { ... }
}
The destructor for a class is run before the destructors of its data members. The data members are destroyed in reverse order of declaration. Derived classes are destroyed before their base classes.
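C++ follows the same ordering, so the rule can be observed with a small C++17 program that logs each destruction step. This is a C++ sketch rather than Carbon; every type and function name in it (Log, Member, Base, Derived, DestroyOneDerived) is hypothetical.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Records destruction events so the order can be inspected afterwards.
std::vector<std::string>& Log() {
  static std::vector<std::string> log;
  return log;
}

struct Member {
  explicit Member(std::string n) : name(std::move(n)) {}
  ~Member() { Log().push_back("member " + name); }
  std::string name;
};

struct Base {
  ~Base() { Log().push_back("Base body"); }
  Member base_member{"base_member"};
};

struct Derived : Base {
  ~Derived() { Log().push_back("Derived body"); }
  Member first{"first"};
  Member second{"second"};
};

// Creates and destroys one Derived value, returning the recorded order.
std::vector<std::string> DestroyOneDerived() {
  Log().clear();
  { Derived d; }  // `d` is destroyed at the closing brace.
  return Log();
}
```

The returned log reads: Derived body, member second, member first, Base body, member base_member. The destructor body runs first, then the data members in reverse declaration order, then the base class.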
A destructor in an abstract or base class may be declared virtual, as with
methods. Destructors in classes derived from one with a virtual
destructor must be declared with the impl keyword prefix. It is illegal to
delete an instance of a derived class through a pointer to a base class unless
the base class's destructor is declared virtual or impl. To delete a pointer to a
non-abstract base class when it is known not to point to a value with a derived
type, use UnsafeDelete.
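A sketch of the virtual and impl prefixes on destructors, using hypothetical `Shape` and `Circle` classes and assuming the `base class` and `extends` syntax from the inheritance design:

```carbon
// Hypothetical hierarchy; names are for illustration only.
base class Shape {
  // Virtual, so a `Circle` may be deleted through a `Shape*`.
  virtual destructor [me: Self] { ... }
}

class Circle extends Shape {
  // Required form: the destructor of the base class is virtual.
  impl destructor [me: Self] { ... }
}
```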
References:
- Destructors
- Proposal #1154: Destructors
A choice type is a tagged union that can store different types of data in a
storage space large enough to hold the largest of them. A choice type has a
name, and a list of cases separated by commas
(,). Each case has a name and an optional parameter list.
choice IntResult {
Success(value: i32),
Failure(error: String),
Cancelled
}
The value of a choice type is one of the cases, plus the values of the parameters to that case, if any. A value can be constructed by naming the case and providing values for the parameters, if any:
fn ParseAsInt(s: String) -> IntResult {
var r: i32 = 0;
for (c: i32 in s) {
if (not IsDigit(c)) {
// Equivalent to `IntResult.Failure(...)`
return .Failure("Invalid character");
}
// ...
}
return .Success(r);
}
Choice type values may be consumed using a match statement:
match (ParseAsInt(s)) {
case .Success(value: i32) => {
return value;
}
case .Failure(error: String) => {
Display(error);
}
case .Cancelled => {
Terminate();
}
}
They can also represent an enumerated type, if no additional data is associated with the choices, as in:
choice LikeABoolean { False, True }
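An enumeration-style choice type is consumed with match in the same way. The sketch below uses a hypothetical `TrafficLight` type and `MayProceed` function, neither of which is part of Carbon:

```carbon
// Hypothetical enumeration; none of the cases carry data.
choice TrafficLight { Red, Yellow, Green }

fn MayProceed(light: TrafficLight) -> bool {
  var proceed: bool = false;
  match (light) {
    case .Green => { proceed = true; }
    case .Yellow => { proceed = false; }
    case .Red => { proceed = false; }
  }
  return proceed;
}
```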
References:
- Proposal #157: Design direction for sum types
- Proposal #162: Basic Syntax
Name paths in Carbon always start with the package name. Additional namespaces may be specified as desired.
For example, this code declares a class Geometry.Shapes.Flat.Circle in a
library Geometry/OneSide:
package Geometry library("OneSide") namespace Shapes;
namespace Flat;
class Flat.Circle { ... }
This type can be used from another package:
package ExampleUser;
import Geometry library("OneSide");
fn Foo(circle: Geometry.Shapes.Flat.Circle) { ... }
References:
- Code and name organization
- Proposal #107: Code and name organization
- Proposal #752: api file default public
- Question-for-leads issue #1136: what is the top-level scope in a source file, and what names are found there?
Various constructs introduce a named entity in Carbon. These can be functions, types, variables, or other kinds of entities. A name in Carbon is formed from a word, which is a sequence of letters, numbers, and underscores, and which starts with a letter. We intend to follow Unicode's Annex 31 in selecting valid identifier characters, but a concrete set of valid characters has not been selected yet.
References: Lexical conventions
TODO: References need to be evolved.
Our naming conventions are:
- UpperCamelCase will be used when the named entity cannot have a dynamically varying value. For example, functions, namespaces, or compile-time constant values.
- lower_snake_case will be used when the named entity's value won't be known until runtime, such as for variables.
References:
Carbon provides a facility to declare a new name as an alias for a value. This is a fully general facility because everything is a value in Carbon, including types.
For example:
alias MyInt = i32;
This creates an alias called MyInt for whatever i32 resolves to. Code
textually after this can refer to MyInt, and it will transparently refer to
i32.
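Because types are values, the same facility can shorten a long qualified name. This sketch reuses the Geometry package from the name-path example; the alias name `Circle` and the function `Area` are hypothetical:

```carbon
package ExampleUser;

import Geometry library("OneSide");

// Hypothetical alias: `Circle` now refers to the fully qualified class.
alias Circle = Geometry.Shapes.Flat.Circle;

// Equivalent to taking a `Geometry.Shapes.Flat.Circle` parameter.
fn Area(circle: Circle) -> f32 { ... }
```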
References:
- Aliases
- Question-for-leads issue #749: Alias syntax
TODO: References need to be evolved.
Unqualified name lookup will always find a file-local result, including aliases, or names that are defined as part of the prelude. There is no prioritization of scopes. This means that all relevant scopes are searched, and if the name is found multiple times referring to different entities, then it is an error. The error may be resolved by adding qualification to disambiguate the lookup.
When defining a member of a class, such as a method, the other members of the class's scope are searched as part of name lookup, even when that member is being defined out-of-line.
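As a sketch of the out-of-line case, the hypothetical `Counter` class below declares `Make` inside the class and defines it afterwards. The out-of-line definition syntax is assumed from the classes design.

```carbon
// Hypothetical class used only to illustrate lookup.
class Counter {
  fn Make() -> Self;
  fn Default() -> Self { return {.count = 0}; }
  var count: i32;
}

// Out-of-line definition: the unqualified name `Default` is still
// found, because the scope of `Counter` is searched.
fn Counter.Make() -> Self {
  return Default();
}
```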
References:
TODO: References need to be evolved.
Common types that we expect to be used universally will be provided for every
file, including i32 and bool. These will likely be defined in a special
"prelude" package.
References:
- Name lookup
- Question-for-leads issue #750: Naming conventions for Carbon-provided features
- Question-for-leads issue #1058: How should interfaces for core functionality be named?
TODO: References need to be evolved.
TODO:
References:
Generics allow Carbon constructs like functions and
classes to be written with compile-time parameters and apply
generically to different types using those parameters. For example, this Min
function has a type parameter T that can be any type that implements the
Ordered interface.
fn Min[T:! Ordered](x: T, y: T) -> T {
// Can compare `x` and `y` since they have
// type `T` known to implement `Ordered`.
return if x <= y then x else y;
}
var a: i32 = 1;
var b: i32 = 2;
// `T` is deduced to be `i32`
Assert(Min(a, b) == 1);
// `T` is deduced to be `String`
Assert(Min("abc", "xyz") == "abc");
Since the T type parameter is in the deduced parameter list in square brackets
([...]) before the explicit parameter list in parentheses ((...)), the
value of T is determined from the types of the explicit arguments instead of
being passed as a separate explicit argument.
References: TODO: Revisit
- Generics: Overview
- Proposal #524: Generics overview
- Proposal #553: Generics details part 1
- Proposal #950: Generic details 6: remove facets
The :! indicates that T is a checked parameter passed at compile time.
"Checked" here means that the body of Min is type checked when the function is
defined, independent of the specific type values T is instantiated with, and
name lookup is delegated to the constraint on T (Ordered in this case). This
type checking is equivalent to saying the function would pass type checking
given any type T that implements the Ordered interface. Then calls to Min
only need to check that the deduced type value of T implements Ordered.
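One consequence is that a mistake in the body is reported at the definition, before any call exists. A sketch, using a hypothetical function `PrintSmaller`:

```carbon
// Hypothetical function; only the `Ordered` constraint is available.
fn PrintSmaller[T:! Ordered](x: T, y: T) {
  // OK: comparison is provided by the `Ordered` constraint on `T`.
  if (x <= y) {
    // Error when `PrintSmaller` is type checked, even with no callers:
    // `Print` is not a member of `Ordered`.
    // x.Print();
  }
}
```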
The parameter could alternatively be declared to be a template parameter by
prefixing with the template keyword, as in template T:! Type.
fn Convert[template T:! Type](source: T, template U:! Type) -> U {
var converted: U = source;
return converted;
}
fn Foo(i: i32) -> f32 {
// Instantiates with the `T` implicit argument set to `i32` and the `U`
// explicit argument set to `f32`, then calls with the runtime value `i`.
return Convert(i, f32);
}
Carbon templates follow the same fundamental paradigm as C++ templates: they are instantiated when called, resulting in late type checking, duck typing, and lazy binding.
Member lookup into a template type parameter is done in the actual type value
provided by the caller, in addition to any constraints. This means member name
lookup and type checking for anything
dependent on the template parameter
can't be completed until the template is instantiated with a specific concrete
type. When the constraint is just Type, this gives semantics similar to C++
templates. Constraints can then be added incrementally, with the compiler
verifying that the semantics stay the same. Once all constraints have been
added, removing the word template to switch to a checked parameter is safe.
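The incremental migration might look like the following sketch. It assumes an interface `Printable` with a `Print` method, and the three function names are hypothetical, chosen only to show the steps side by side.

```carbon
// Step 1: unconstrained template. `x.Print()` is looked up in the
// actual type when the template is instantiated.
fn Show1[template T:! Type](x: T) { x.Print(); }

// Step 2: constraint added while still a template. The compiler can
// verify that the meaning of `x.Print()` stays the same.
fn Show2[template T:! Printable](x: T) { x.Print(); }

// Step 3: `template` removed. The body is now fully type checked at
// the definition against `Printable`.
fn Show3[T:! Printable](x: T) { x.Print(); }
```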
Although checked generics are generally preferred, templates enable translation of code between C++ and Carbon, and address some cases where the type checking rigor of generics is problematic.
References:
- Templates
- Proposal #553: Generics details part 1
- Question-for-leads issue #949: Constrained template name lookup
- Proposal #989: Member access expressions
Interfaces specify a set of requirements that a type might satisfy. Interfaces act both as constraints on types a caller might supply and capabilities that may be assumed of types that satisfy that constraint.
interface Printable {
// Inside an interface definition `Self` means
// "the type implementing this interface".
fn Print[me: Self]();
}
In addition to function requirements, interfaces can contain:
- final interface members
Types only implement an interface if there is an explicit impl declaration
that they do. Simply having a Print function with the right signature is not
sufficient.
class Circle {
var radius: f32;
impl as Printable {
fn Print[me: Self]() {
Console.WriteLine("Circle with radius: {0}", me.radius);
}
}
}
In this case, Print is a member of Circle. Interfaces may also be implemented externally, which means the members of the interface are not direct members of the type. Those methods may still be called using compound member access syntax to qualify the name of the member, as in x.(Printable.Print)(). External implementations don't have to be in the same library as the type definition, subject to the orphan rule (1, 2) for coherence.
Interfaces and implementations may be
forward declared
by replacing the definition scope in curly braces ({...}) with a semicolon.
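A sketch of an external implementation and the qualified call it requires. The `Square` class and `Show` function are hypothetical, and `Console.WriteLine` is used as in the `Circle` example.

```carbon
// Hypothetical type with no `Print` member of its own.
class Square {
  var side: f32;
}

// External implementation: `Print` does not become a member of `Square`.
external impl Square as Printable {
  fn Print[me: Self]() {
    Console.WriteLine("Square with side: {0}", me.side);
  }
}

fn Show(s: Square) {
  // `s.Print()` would not find a member here, so the call is
  // qualified with compound member access syntax.
  s.(Printable.Print)();
}
```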
References:
- Generics: Interfaces
- Generics: Implementing interfaces
- Proposal #553: Generics details part 1
- Proposal #731: Generics details 2: adapters, associated types, parameterized interfaces
- Proposal #624: Coherence: terminology, rationale, alternatives considered
- Proposal #990: Generics details 8: interface default and final members
- Proposal #1084: Generics details 9: forward declarations
A function can require calling types to implement multiple interfaces by
combining them using an ampersand (&):
fn PrintMin[T:! Ordered & Printable](x: T, y: T) {
// Can compare since type `T` implements `Ordered`.
if (x <= y) {
// Can call `Print` since type `T` implements `Printable`.
x.Print();
} else {
y.Print();
}
}
The body of the function may call functions that are in either interface, except for names that are members of both. In that case, use the compound member access syntax to qualify the name of the member, as in:
fn DrawTies[T:! Renderable & GameResult](x: T) {
if (x.(GameResult.Draw)()) {
x.(Renderable.Draw)();
}
}
References:
- Combining interfaces by anding type-of-types
- Question-for-leads issue #531: Combine interfaces with + or &
- Proposal #553: Generics details part 1
An associated type is a type member of an interface whose value is determined by
the implementation of that interface for a specific type. These values are set
to compile-time values in implementations, and so use the
:! generic syntax inside a
let declaration without an initializer. This
allows types in the signatures of functions in the interface to vary. For
example, an interface describing a
stack might use an
associated type to represent the type of elements stored in the stack.
interface StackInterface {
let ElementType:! Movable;
fn Push[addr me: Self*](value: ElementType);
fn Pop[addr me: Self*]() -> ElementType;
fn IsEmpty[addr me: Self*]() -> bool;
}
Then different types implementing StackInterface can specify different type
values for the ElementType member of the interface using a where clause:
class IntStack {
impl as StackInterface where .ElementType == i32 {
fn Push[addr me: Self*](value: i32);
// ...
}
}
class FruitStack {
impl as StackInterface where .ElementType == Fruit {
fn Push[addr me: Self*](value: Fruit);
// ...
}
}
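A generic function can then name the associated type through its type parameter. In this sketch, `PopOne` is a hypothetical helper:

```carbon
// Hypothetical helper: the return type depends on which
// `StackInterface` implementation is deduced for `StackType`.
fn PopOne[StackType:! StackInterface](s: StackType*)
    -> StackType.ElementType {
  return s->Pop();
}

// Given `var ints: IntStack;`, the call `PopOne(&ints)` has type `i32`.
// Given `var fruits: FruitStack;`, `PopOne(&fruits)` has type `Fruit`.
```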
References:
Many Carbon entities, not just functions, may be made generic by adding checked or template parameters.
Classes may be defined with an optional explicit parameter list. All parameters
to a class must be generic, and so defined with :!, either with or without the
template prefix. For example, to define a stack that can hold values of any
type T:
class Stack(T:! Type) {
fn Push[addr me: Self*](value: T);
fn Pop[addr me: Self*]() -> T;
var storage: Array(T);
}
var int_stack: Stack(i32);
In this example:
- Stack is a type parameterized by a type T.
- T may be used within the definition of Stack anywhere a normal type would be used.
- Array(T) instantiates generic type Array with its parameter set to T.
- Stack(i32) instantiates Stack with T set to i32.
The values of type parameters are part of a type's value, and so may be deduced in a function call, as in this example:
fn PeekTopOfStack[T:! Type](s: Stack(T)*) -> T {
var top: T = s->Pop();
s->Push(top);
return top;
}
// `int_stack` has type `Stack(i32)`, so `T` is deduced to be `i32`.
PeekTopOfStack(&int_stack);
References:
Choice types may be parameterized similarly to classes:
choice Result(T:! Type, Error:! Type) {
Success(value: T),
Failure(error: Error)
}
Interfaces are always parameterized by a Self type, but in some cases they
will have additional parameters.
interface AddWith(U:! Type);
Interfaces without parameters may only be implemented once for a given type, but
a type can have distinct implementations of AddWith(i32) and
AddWith(BigInt).
Parameters to an interface determine which implementation is selected for a type, in contrast to associated types which are determined by the implementation of an interface for a type.
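As a sketch, the two implementations mentioned above could be written as follows. This assumes that `AddWith(U)` has an associated type `Result` and a method `Op`, as in the operator overloading design, and treats `BigInt` as an already defined class.

```carbon
// Selected for `BigInt + i32`.
external impl BigInt as AddWith(i32) where .Result == BigInt {
  fn Op[me: Self](other: i32) -> BigInt { ... }
}

// Selected for `BigInt + BigInt`: a distinct implementation of the
// same interface with a different argument.
external impl BigInt as AddWith(BigInt) where .Result == BigInt {
  fn Op[me: Self](other: BigInt) -> BigInt { ... }
}
```

In both cases `Result` is an associated type chosen by the implementation, while the argument to `AddWith` is what distinguishes the two.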
References:
An impl declaration may be parameterized by adding forall [generic
parameter list] after the impl keyword introducer, as in:
external impl forall [T:! Printable] Vector(T) as Printable;
external impl forall [Key:! Hashable, Value:! Type]
HashMap(Key, Value) as Has(Key);
external impl forall [T:! Ordered] T as PartiallyOrdered;
external impl forall [T:! ImplicitAs(i32)] BigInt as AddWith(T);
external impl forall [U:! Type, T:! As(U)]
Optional(T) as As(Optional(U));
Generic implementations can create a situation where multiple impl definitions
apply to a given type and interface query. The
specialization rules
pick which definition is selected. These rules ensure:
Implementations may be marked final to
indicate that they may not be specialized, subject to
some restrictions.
References:
Carbon generics have a number of other features, including:
- where constraints.
- "dyn-safe".
References:
observe declarations
Determining whether two types must be equal in a generic context is in general undecidable, as has been shown in Swift.
To make compilation fast, the Carbon compiler will limit its search to a depth
of 1, only identifying types as equal if there is an explicit declaration that
they are equal in the code, such as in a
where constraint. There will be
situations where two types must be equal as the result of combining these facts,
but the compiler will return a type error since it did not realize they are
equal due to the limit of the search. An
observe...== declaration may be
added to describe how two types are equal, allowing more code to pass type
checking.
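A rough sketch of the idea follows. The `Chain` interface and `PassAlong` function are hypothetical, and the exact placement and spelling of the observe declaration is an assumption based on the generics design, not something specified in this overview.

```carbon
// Hypothetical interface with three associated types.
interface Chain {
  let A:! Type;
  let B:! Type;
  let C:! Type;
}

fn PassAlong[T:! Chain where .A == .B and .B == .C](a: T.A) {
  // `T.A == T.B` and `T.B == T.C` are each declared, but seeing that
  // `T.A == T.C` takes two steps, beyond the depth-1 search.
  observe T.A == T.B == T.C;
  // With the `observe` declaration, this initialization type checks.
  var c: T.C = a;
}
```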
An observe declaration showing types are equal can increase the set of
interfaces the compiler knows that a type implements. It is also possible that
knowing a type implements one interface implies that it implements another, from
an
interface requirement
or generic implementation. An observe...is
declaration may be used to
observe that a type implements an interface.
References:
Each use of an operator in an expression is translated into a call
to a method of an interface. For example, if x has type T and y has type
U, then x + y is translated into a call to x.(AddWith(U).Op)(y). So
overloading of the + operator is accomplished by implementing interface
AddWith(U) for type T. In order to support
implicit conversion of the first operand
to type T and the second argument to type U, add the like keyword to both
types in the impl declaration, as in:
external impl like T as AddWith(like U) where .Result == V {
// `Self` is `T` here
fn Op[me: Self](other: U) -> V { ... }
}
When the operand types and result type are all the same, this is equivalent to
implementing the Add interface:
external impl T as Add {
fn Op[me: Self](other: Self) -> Self { ... }
}
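For a concrete type, this might look like the sketch below, where `Vec2` and `Sum` are hypothetical names:

```carbon
// Hypothetical two-component vector type.
class Vec2 {
  var x: f32;
  var y: f32;
}

external impl Vec2 as Add {
  fn Op[me: Self](other: Self) -> Self {
    return {.x = me.x + other.x, .y = me.y + other.y};
  }
}

fn Sum(a: Vec2, b: Vec2) -> Vec2 {
  // Translated into a call to `a.(AddWith(Vec2).Op)(b)`, which the
  // `Add` implementation above provides.
  return a + b;
}
```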
The interfaces that correspond to each operator are given by:
- -x: Negate
- x + y: Add or AddWith(U)
- x - y: Sub or SubWith(U)
- x * y: Mul or MulWith(U)
- x / y: Div or DivWith(U)
- x % y: Mod or ModWith(U)
- ^x: BitComplement
- x & y: BitAnd or BitAndWith(U)
- x | y: BitOr or BitOrWith(U)
- x ^ y: BitXor or BitXorWith(U)
- x << y: LeftShift or LeftShiftWith(U)
- x >> y: RightShift or RightShiftWith(U)
- x == y, x != y overloaded by implementing Eq or EqWith(U)
- x < y, x > y, x <= y, x >= y overloaded by implementing Ordered or OrderedWith(U)
- x as U is rewritten to use the As(U) interface
- ImplicitAs(U)
- *p
- a[3]
- f(4)
The logical operators cannot be overloaded.
References:
- Operator overloading
- Proposal #702: Comparison operators
- Proposal #820: Implicit conversions
- Proposal #845: as expressions
- Question-for-leads issue #1058: How should interfaces for core functionality be named?
- Proposal #1083: Arithmetic expressions
- Proposal #1191: Bitwise operators
- Proposal #1178: Rework operator interfaces
There are some situations where the common type for two types is needed:
- A conditional expression like if c then t else f returns a value with the common type of t and f.
- If there are multiple parameters to a function with a type parameter, it will be set to the common type of the corresponding arguments, as in:
fn F[T:! Type](x: T, y: T);
// Calls `F` with `T` set to the
// common type of `G()` and `H()`:
F(G(), H());
- The inferred return type of a function with auto return type is the common type of its return statements.
The common type is specified by implementing the CommonTypeWith interface:
// Common type of `A` and `B` is `C`.
impl A as CommonTypeWith(B) where .Result == C { }
The common type is required to be a type that both types have an implicit conversion to.
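As a sketch of how this combines with a conditional expression, suppose hypothetical types `Feet`, `Meters`, and `Length` exist, and that `Feet` and `Meters` each have an implicit conversion to `Length` (an assumption that is not shown here):

```carbon
// Hypothetical: the common type of `Feet` and `Meters` is `Length`.
impl Feet as CommonTypeWith(Meters) where .Result == Length { }

fn Pick(use_feet: bool, f: Feet, m: Meters) -> Length {
  // Both branches are converted to the common type `Length`.
  return if use_feet then f else m;
}
```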
References:
- if expressions
- Proposal #911: Conditional expressions
- Question-for-leads issue #1077: find a way to permit impls of CommonTypeWith where the LHS and RHS type overlap
TODO: Needs a detailed design and a high level summary provided inline.
References:
TODO: References need to be evolved.
TODO:
References: Safety strategy
TODO: References need to be evolved. Needs a detailed design and a high level summary provided inline.
References: Pattern matching
TODO:
TODO: References need to be evolved. Needs a detailed design and a high level summary provided inline.
Carbon provides metaprogramming facilities that look similar to regular Carbon code. These are structured, and do not offer arbitrary inclusion or preprocessing of source text as C and C++ do.
References: Metaprogramming
Carbon provides some higher-order abstractions of program execution, as well as the critical underpinnings of such abstractions.
TODO:
TODO:
TODO: