`extern` keyword for forward declarations in libraries that don't provide the definition.

A forward declaration can be merged with a definition when they match. However, there is ambiguity about behavior.

Prior discussion covered `impl` and `interface`, `extern`, and whether `alias` would resolve forward declaration conflicts. That last discussion continued to forward declarations in general, and includes discussion about `extern` use.
Example rules for forward declarations in the current design:
C++'s One Definition Rule (ODR) requires each entity to have only one definition. C++ compilers have trouble detecting violations at compile time, and linkers have trouble catching them too. ODR violation detection is an active problem (https://maskray.me/blog/2022-11-13-odr-violation-detection).
This similarly applies to Carbon. In Carbon, only one library can define an
entity; other libraries can make forward declarations of it, but those don't
constitute a definition. Part of the goal in declaration syntax (extern in
particular) is to be able to better diagnose ODR violations based on
declarations.
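To make the C++ side of this concrete, the following is a minimal single-file C++ sketch of the distinction the ODR rests on: declarations may repeat, while each entity gets exactly one definition. The names `counter` and `NextId` are hypothetical and chosen only for illustration. A single file cannot demonstrate a violation directly, so the comments describe what would happen if a second translation unit added a competing definition.

```cpp
// Declarations: these introduce names but provide no storage or body.
// C++ allows them to be repeated, in this file or in other files.
extern int counter;
int NextId();
int NextId();

// Definitions: the ODR allows exactly one of each in the whole program.
int counter = 0;
int NextId() { return ++counter; }

// If another translation unit also contained `int counter = 5;`, the program
// would violate the ODR. The compiler processes one translation unit at a
// time, so it cannot see the conflict. A linker will often report a duplicate
// symbol for a case like this one, but for inline functions and templates a
// mismatch between definitions requires no diagnostic at all.
```

Carbon's `extern` corresponds to the declarations at the top of this sketch, with the addition that it states explicitly that the definition lives in a different library.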
When two declarations have the same name, either within the same file or through imports, they need to "match", or will be diagnosed as a conflict. This is laid out under Matching and agreeing in generics. This proposal does not significantly affect these type and name restrictions.
For example:
- `fn F(); fn F() {}` could merge because the function signatures match, forward declaring a function is valid, and only one definition is provided.
- `fn F(); fn F(x: i32);` won't merge because they differ in parameters, resulting in a diagnostic.

`extern` declarations.
extern is used to mark forward declarations of entities defined in a
different library.
- `api` file.
- `extern`.
- `extern` must only be used for declarations in libraries that don't contain the definition.
- `extern` comes immediately after access control keywords, and before other keywords.
- `extern` anywhere in a file means it must be used as `extern` throughout that file.
- `extern` declaration, even if there is an imported declaration.
- `extern`.
- `extern` in either the `api` or `impl`, it is invalid to declare the same entity as `extern` elsewhere within the library.

`extern` keyword

An `extern` modifier keyword is added. It modifies a forward declaration to
indicate that the entity is not defined by the declaring library. A library
using extern to declare an entity cannot define that entity. Only the library
which defines an entity can omit extern, and omitting extern means it must
provide a definition.
An `extern` declaration forms an entity which has no definition or storage. For example:

- `extern fn` forms a function that can be called.
- `extern class` forms an incomplete type.
- `extern interface` forms an undefined interface.
- `extern var` or `extern let` will bind the name without allocating storage. Initializers are disallowed.

The `extern` keyword is invalid on declarations that have only a declaration syntax and lack storage, such as `alias` or `namespace`. It is only valid on namespace-scoped names; it is invalid on type-scoped names (`class Foo { extern fn Member(); }`).
In declaration modifiers, `extern` comes immediately after access control keywords, and before other keywords. For example, `private extern <other modifiers> class B;`. At present, when `extern` is on a declaration, only access modifiers are valid (see Modifier keywords).
When considering whether a declaration is allowed, we apply the rules:

- `extern`.
- `api` files without affecting compilation of client libraries.

In a file, a forward declaration must never follow a forward declaration or definition for the same entity.
For example:
```carbon
class A { ... }
// Invalid: Disallowed after the definition.
class A;

class B;
// Invalid: Disallowed due to repetition.
class B;

class C;
// Valid: Allowed because the definition is added.
class C { ... }
```
In a file, if a declaration or definition of an entity is imported, the file must choose between either using that version or declaring its own. It cannot do both.
For example:
```carbon
package Foo library "a" api;

class C { ... }
```

```carbon
package Foo library "b" api;

import library "a";

extern class C;

// Valid: Uses the incomplete type of the extern declaration.
fn Foo(c: C*);
```

```carbon
package Foo library "c" api;

import library "a";

// Valid: Uses the complete type of the imported definition.
fn Foo(c: C);
```

```carbon
package Foo library "d" api;

import library "a";

fn Foo(c: C);

// Invalid: `Foo` used the imported `C`, so an `extern` declaration of `C` is now
// invalid.
extern class C;
```
In a library, if the `impl` defines an entity, the `api` must not use `extern` when declaring it.
For example:
```carbon
package Wiz library "a" api;

extern class C;
```

```carbon
package Wiz library "a" impl;

// Invalid: The `api` file declared `C` as `extern`.
class C { ... }
```
In a library, the api might make an extern declaration that the impl
imports and uses the definition of. This is consistent because the impl file
is not declaring the entity.
`impl` files with a forward declaration must contain the definition

In an `impl` file, if the `impl` file forward declares an entity, it must also
provide the definition. In libraries with multiple impl files, this means that
using an entity in one impl file when it's defined in a different impl file
requires a (possibly private) forward declaration in the api file. An
extern declaration cannot be used for this purpose because the library defines
the entity.
This allows Carbon to provide a compile-time diagnostic if an entity declared in
the impl is not defined locally. Note that an entity declared in the api may
still not get a compile-time diagnostic unless the compiler is told it's seeing
all available impl files.
For example:
```carbon
package Bar library "a" api;

class C;

class D { ... }
```

```carbon
package Bar library "a" impl;

// Invalid: Missing a definition in the `impl` file, but if one were added, then
// this would be valid.
class C;

// Invalid: The `api` defines `D`. As a consequence, there is no way to make
// this forward declaration valid.
class D;
```
This doesn't prevent an `impl` file from providing a forward declaration. It may do so when it also provides the definition, which can be useful to unravel dependency cycles:
```carbon
package Bar library "a" api;

class D;
```

```carbon
package Bar library "a" impl;

class D;

class E {
  fn F[self: Self](d: D*) { ... }
}

class D {
  fn G[self: Self](e: E) { ... }
}
```
Here, the `impl` could not use the imported forward declaration in the `api` because of the rule "Files must either use an imported declaration or declare their own". Without `class D;` present, the definition of `F` would be invalid.
The combination of a forward declaration and a definition is allowed in type scopes.
For example:
```carbon
class C {
  class D;

  fn F() -> D;

  class D {
    fn G() -> Self { return C.F(); }

    var x: i32;
  }

  fn F() -> D { return {.x = 42}; }
}
```
This is necessary because type bodies are not automatically moved out-of-line, unlike function bodies.
`extern` declarations in `extern impl`

It is invalid for a non-`extern` `impl` declaration to use an `extern` type in its type structure.

Consider two libraries, one defining `A` and declaring `B` as `extern`, and the other defining `B` and declaring `A` as `extern`. Neither should be able to define an `impl` involving both `A` and `B`, otherwise both could.
`impl` lookup involving `extern` types

If `impl` lookup involving `extern` types finds a non-`final` parameterized
impl, the result is that the lookup succeeds, but none of the values of the
associated entities of interface are known. This is because there may be another
more specialized impl that applies that is not visible (as can also happen
with constrained generics).
For example:
```carbon
library "a" api;

extern class C;
extern class D(T:! type);
extern impl forall [T:! type] D(T) as I where .Result = i32;
```
In the above, D(C) impls I, but with unknown .Result, since it might not
be i32.
When considering various modifiers on a forward declaration versus definition:
- `extern` is only valid on a forward declaration. Rules are detailed above.
- `extend` in `extend impl` is only on the declaration in the class body (whether that is a forward declaration or definition), as described at Forward impl declaration.
- Modifiers on types (`abstract`, `base`, `final`) exist only on the definition, not on the forward declaration.
- Modifiers on functions (`impl`, `virtual`, `default`, `abstract`, `final`) must match between forward declaration and definition.
    - `abstract` won't have a definition so is moot here.
- Access modifiers (`private` and `protected`) must match.
    - An `extern` name may be `private` when the actual name is public.
- `private extern` declarations.
- `extern` declaration when a non-`extern` declaration is imported allows adding the class to an already-imported library where `extern` declarations were previously present.
- `extern` declaration prevents accidental uses that might hinder removing the class from an imported library.
- `extern` declarations only when they affect calling conventions means that, in most cases, keywords can be added and removed from declarations in the defining library without breaking `extern` declarations in other libraries.
- `extern` will assist readers in understanding when a library is working with a type in order to avoid dependency loops, with minimal impact on writing code.
- `extern` declarations be clearly marked should improve our ability to diagnose ODR violations. This will help developers by improving detection of a subtle correctness issue.
- `extern` declarations are considered essential to supporting separate compilation of libraries, which in turn supports scaling compilation.
- `extern` is chosen for consistency with C++, and carries a similar -- albeit slightly different -- meaning.

There has been intermittent discussion about which modifiers to allow or require on forward declarations versus definitions. There are advantages and disadvantages about redundancy and being able to copy-paste declarations. There might be strict requirements for some modifiers to be present in order to correctly use a forward declaration.
This proposal suggests a partial decision here, at least for a reasonable starting point that we can implement towards. This will likely evolve in future proposals, particularly as more keywords are added. However, this still offers a baseline.
The trade-offs we consider are:
- `base` is added to `class C { ... }`, and disallowed on `class C;`.
- If `base` were optional to allow `base class C;`, then an adjacent `class D;` lends itself to being incorrectly interpreted as meaning "`D` is not a base class" when it actually means "`D` may or may not be a base class".
- `extern`, and should have more nuanced rules if the access control rules go beyond public and private. For example, a package-private type shouldn't be allowed to be made public through an `extern` declaration; but package-private isn't actually part of the language right now.
- Type modifiers (`abstract`, `base`, `final`, and `default`) have no effect on uses because the forward declared type is incomplete.
    - `extern` is a special-case where its presence is intrinsic to the keyword's semantics.

For type-scoped members, we are choosing to duplicate modifiers between the declaration and definition.
For example:
```carbon
class A {
  private class B;
}

// `private` is required here.
private class A.B { ... }
```
There is more emphasis on being able to copy-paste a function
declaration. This in particular may help developers more than classes
because all the parameters must also be copied. Things such as static
in C++ being declaration-only is also something we view as a source of
friction rather than a benefit.
Modifiers such as virtual affect the calling convention of functions,
and as a consequence must be on the first declaration.
A downside of this approach is that it means the class name is inserted in the middle of the out-of-scope definition, rather than near the front.
The most likely alternative would be to disallow most modifiers on out-of-line
definitions after a forward declaration, for type member functions in specific.
This would be because, unlike other situations, a member must have a forward
declaration if there is an out-of-line definition. We would probably want to
drop these collectively in order to maximize copy-paste ability (ideally,
everything before fn is dropped). However, it would shift understandability of
the definition in a way that may be harmful. For now this proposal suggests
adopting the more verbose approach and seeing how it goes.
Another alternative is that we could allow flexibility to choose which modifier keywords are provided where. For example, keywords on a function definition must be some subset of the keywords on the declaration; keywords on a type declaration must be some subset of the keywords on the definition. This would allow authors to choose when they expect keywords will be most relevant. However, it could also serve to detract from readability: two functions in the same file might be declared similarly, but have different keywords on the definition, implying a difference in behavior that would not exist. With consideration for the Principle: Prefer providing only one way to do a given thing, this proposal takes the more prescriptive approach, rather than offering flexibility.
Note a common aspect between types and functions in the proposed model is that
modifiers on the definition are typically a superset of modifiers on the
declaration (extern as an exception). While looking at the declaration always
gives an incomplete view, looking at the definition can give a complete view.
`extern` keyword

If we had no `extern` keyword, then a declaration wouldn't give any hint of
whether a library contains the definition. Given two forward declarations in
different api files, either both or neither could have a definition in their
impl file. Sometimes this would be detected during linking, particularly if
both are linked together. However, providing an extern keyword gives a hint
about the intended behavior, allowing us to evaluate more cases for warnings. It
also gives a hint to the reader, about whether an entity is expected to be
declared later in the library (even if not in the same file).
For example:
```carbon
package Foo library "a" api;

class C;
```

```carbon
package Foo library "a" impl;

class C {}
```

```carbon
package Foo library "b" api;

import library "a";

class C {}
```
Without the `extern` keyword, this code should be expected to compile. Ideally it would be caught during linking that there are two definitions of `class C`, but that relies on some tricks to catch issues. When `extern` is added, then the processing of library "b" results in a conflict with the `class C;` forward declared in the `api` of library "a".
Note this probably does not fundamentally alter the amount that can be diagnosed, but will mainly allow some diagnostics to occur during compilation that otherwise would either be linker diagnostics or missed.
The extern keyword is being added mainly for diagnostics and readability.
We are being restrictive with declarations and when they're allowed. Primarily, we want to avoid confusion with code such as:
```carbon
class C { ... }

// This declaration has no effect.
class C;
```
But, we could also allow code such as:
```carbon
package Foo library "a" api;

class C { ... }
```

```carbon
package Foo library "b" api;

import library "a";

fn F(c: C) { ... }

extern class C;

fn G(c: C*) { ... }
```
Here, F requires the imported definition of C. But, is G seeing an
incomplete type from the extern, or is the extern redundant and G sees the
imported definition of C? Would a library importing library "b" see C as an
incomplete type, or a complete type?
In order to eliminate potential understandability issues with the choices we may
make, we are choosing the more restrictive approaches which disallow both of
these. In the first case, the redeclaration after a definition in the same file
is simply disallowed. In the second case, library "b" cannot declare C as
extern after using the imported definition. Restrictions such as these should
make code clearer by helping developers catch redundant, and possibly incorrect,
code.
`extern` naming

Beyond `extern`, we also considered `external`. `extern` implies external linkage in C++. We're choosing `extern` mainly for the small consistency with C++.
`extern` to private

The `extern` keyword could have an access control implication equivalent to `private`. Then `extern` would need explicit work to export the symbol. The `export` keyword was proposed for this purpose, with the idea that `export import` syntax might also be provided to re-export all symbols of an imported library.
extern and export semantic consistency
This would mean that extern declarations have different access control
than other declarations, which is a different visibility model to
understand, and may also not be intuitive from the "external" meaning. The
use of export matches C++'s export keyword, which may also imply to
developers that other semantics match C++, such as export being necessary
for all declared names.
Interaction with additional access control features
We'll probably also want more granular access control than just public and
private for API names. Adding a package-private access modifier seems useful
(for example, Java provides this as package): in C++, this is sometimes
achieved through an "internal" or "details" namespace. If Carbon only
supports library-private symbols, that still addresses some of these
use-cases, but will sometimes require private implementation details to
exist in a single api file in order to get language-supported visibility
restrictions. In some cases this will result in an unwieldy amount of code
for a single file.
As an example of how package-private visibility might be used, gtest has
an internal header directory
that contains thousands of lines of code. If this needed to migrate to api
files in Carbon, it would be ideal if users could not access the names by
importing an internal library.
As an example of how this would interact:

| Default visibility | Public (proposed) | Private (alternative) |
|---|---|---|
| Public | `extern class C;` | `export extern class C;` |
| Library-private | `private extern class C;` | `extern class C;` |
| Package-private (hypothetical) | `package extern class C;` | `export package extern class C;` |
Risks of a public extern
When making an extern fn, it is callable. This creates a risk of a
function being declared as extern for internal use, but accidentally
allowing dependencies on the function. This risk could be mitigated by
requiring access control of an extern to be either equal to or more
restrictive than the original symbol (which might be hard to validate, but
could be validated when both symbols are seen together).
When making an extern class, it's an incomplete type. It cannot be
instantiated, but pointers and references may be declared. This is only
really useful if there are functions which take a pointer as a parameter,
but an instance of the pointer could only be created by either unsafe casts
or if there's a function that returns the pointer type.
Unsafe casts carry an inherent risk. In the case of a returned pointer, that
type could be captured by auto. Having extern class default to private
does not prevent the type's use.
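As a rough C++ analogy (not Carbon code), the sketch below shows why defaulting to private would not stop use of an incomplete type: a client can hold and pass along a pointer obtained from a function without ever naming the type. `Widget`, `MakeWidget`, `Describe`, and `UseWidget` are hypothetical names, and the definitions at the bottom stand in for what would normally live in the defining library.

```cpp
// Forward declaration only: `Widget` is an incomplete type at this point.
class Widget;

// Functions may still take and return pointers to the incomplete type.
Widget* MakeWidget();
int Describe(Widget* w);

// A client never has to spell `Widget`: `auto` captures the pointer type.
int UseWidget() {
  auto w = MakeWidget();  // Deduced as `Widget*`.
  return Describe(w);
}

// Stand-in for the defining library, where the type is complete.
class Widget {
 public:
  int id = 7;
};

Widget* MakeWidget() {
  static Widget w;
  return &w;
}

int Describe(Widget* w) { return w->id; }
```

Here `UseWidget` compiles before `Widget` is complete because it only moves the pointer around, which is the kind of use that restricting the name's visibility does not prevent.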
Considering the trade-offs involved in the combination of these three points, this proposal suggests using the regular access control semantics. That choice may be reconsidered later, based on how the semantics work out and any issues that arise, particularly around access control.
Omitting the extern modifier means a definition is required in a library.
There may be a use-case for opaque types which are "owned" by a library and have
no definition. If so, there are possible solutions such as a modifier keyword to
indicate an opaque type. For now, an empty definition in the impl file of a
library should have a similar effect: api users would not be able to provide a
definition.
For example:
```carbon
package Foo library "a" api;

// An opaque type which can be imported by other libraries.
class C;
```

```carbon
package Foo library "a" impl;

// An empty definition. This could be in its own file, or at the end after logic,
// to prevent misuse.
class C {}
```
This proposal requires a definition to exist. That choice may be reconsidered later, based on use-cases.
`extern` declarations

As proposed, any library can provide `extern` declarations for other libraries
in the same package. It was proposed that this should be restricted so that a
library would need to make extern declarations available, either through a
separate extern file (similar to api and impl) or through additional
markup in the api file which could be used to automatically generate an
extern-only subset of the api (still requiring entities which should be
extern to be explicitly marked).
Advantages:

- `extern` declarations.
- `extern` declarations. This goes a step further, requiring the `extern` declaration be provided by the same library that's defining the entity, providing a clear, central ownership.
- `extern` declarations agree with the library's declaration of the definition. This includes small details such as parameter names. A consequence of this is that changing these details in one declaration requires changing them in all declarations, atomically. There's a desire to limit the scope of atomic refactorings; for example, similar package-scope atomic refactoring requirements lead us to allow certain redundant forward declarations elsewhere in this proposal.

Disadvantages:
- `extern` declarations to use minimal imports for a declaration.
- `extern` declarations, the `extern` file's imports would be a superset of the imports for those declarations. If a client library only wants one of those `extern` declarations, it would still get the full set of dependencies; under the proposed syntax, only the single declaration's dependencies are required. This allows for fewer dependencies.
- `class C { ... }; extern fn F(c: C);`. Under the proposed syntax, this is supported; under the alternative syntax, another solution would need to be found. There are potential ways to fix this through refactoring, including moving one entity to a different library, changing the `extern` declaration to use only `extern` declarations, or using generic programming to create an indirection. However, each of these is a refactoring that may hinder adoption.

At present, the disadvantages are considered to outweigh the advantages. It's
possible that this may be revisited later if we get more code and libraries
using these tools and recognize some patterns that we can better or more
directly support. However, we should be hesitant to provide a second syntax for
extern if it's not adding substantial value, under
Principle: Prefer providing only one way to do a given thing.
`extern` declarations

We could choose a syntax for `extern` declarations that allows cross-package
extern declarations. These are effectively supported in C++, where there are
no package boundaries. Dropping support will create a migration barrier.
However, there is a strong desire to restrict the use of cross-package
declarations in order to reduce the difficult and complex refactoring costs that
result from cross-package declarations: requiring that a particular name not
change its declaration category (a class must remain a class, preventing
alias) and that parameters (either function or generic) must remain the same,
not even allowing implicit conversions.
The package boundaries serve an important purpose in balancing costs for refactoring.
This is not proposing a particular approach to migration. However, for consideration of the proposal, it can be helpful to consider how migration will work.
It's expected under this approach that migration of a forward declaration will require identifying the Carbon library that defines the entity. Then:
- `extern` is required.
- `extern` is added.
- `extern` declaration. This might be infeasible when the package is not owned by the package being migrated.
- `extern cpp <declaration>`).