Important Carbon goals for code and name organization are:
Tooling support is important for Carbon, including the possibility of a package manager.
Developer tooling, including both IDEs and refactoring tools, is expected to exist and be well-supported.
Software and language evolution:
We should support libraries adding new structs, functions or other identifiers without those new identifiers being able to shadow or break existing users that already have identifiers with conflicting names.
We should make it easy to refactor code, including moving code between files. This includes refactoring both by humans and by developer tooling.
Fast and scalable development:
It should be easy for developer tooling to parse code, without needing to parse imports for context.
Structure should be provided for large projects to opt into features which will help maintain scaling of their codebase, while not adding burdens to small projects that don't need it.
Carbon source files have a .carbon extension, such as
geometry.carbon. These files are the basic unit of compilation.
Each file begins with a declaration of which package
it belongs in. The package is the unit of distribution. The package name is a
single identifier, such as Geometry. An example API file in the Geometry
package would start with:
package Geometry api;
A tiny package may consist of a single library with a single file, and not use
any further features of the package keyword.
It is often useful to use separate files for the API and its implementation.
This may help organize code as a library grows, or to let the build system
distinguish between the dependencies of the API itself and its underlying
implementation. Implementation files allow for code to be extracted out from the
API file, while only being callable from other files within the library,
including both API and implementation files. Implementation files are marked by
both using a file extension of .impl.carbon and instead starting with:
package Geometry impl;
However, as a package adds more files, it will probably want to separate out
into multiple libraries.
A library is the basic unit of dependency. Separating code into multiple
libraries can speed up the overall build while also making it clear which code
is being reused. For example, an API file adding the library Shapes to the
Geometry package, or Geometry//Shapes in
shorthand, would start with:
package Geometry library "Shapes" api;
As code becomes more complex, and users pull in more code, it may also be
helpful to add namespaces
to give related entities consistently structured names. A namespace affects the
name path
used when calling code. For example, with no namespace, if a Geometry package
defines Circle then the name path will be Geometry.Circle. However, it can
be named Geometry.TwoDimensional.Circle with a namespace; for example:
package Geometry library "Shapes" api;
namespace TwoDimensional;
struct TwoDimensional.Circle { ... };
This scaling of packages into libraries and namespaces is how Carbon supports both small and large codebases.
A different way to think of the sizing of packages and libraries is:
Abseil package.Boost,
with dependencies on other libraries in Boost and potentially other
packages from Boost.
library "Algorithms", or include part of the path to reduce the
chance of name collisions, such as library "Geometry/Algorithms".
Packages may choose to expose libraries that expose unions of interfaces from other libraries within the package. However, doing so would also provide the transitive closure of build-time dependencies, and is likely to be discouraged in many cases.
The import keyword supports reusing code from other files and libraries.
For example, to use Geometry.Circle from the Geometry//Shapes library:
import Geometry library "Shapes";
fn Area(Geometry.Circle circle) { ... };
The library keyword is optional for import, and its use should parallel that
of library on the package of the code being imported.
Every source file will consist of, in order:
package statement.
import statements.
Comments and blank lines may be intermingled with these sections. Metaprogramming code may also be intermingled, so long as the outputted code is consistent with the enforced ordering. Other types of code must be in the source file body.
Name paths are defined above as sequences of identifiers separated by dots. This syntax may be loosely expressed as a regular expression:
IDENTIFIER(\.IDENTIFIER)*
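As a rough illustration, the loose regular expression above can be turned into a small validator. This is only a sketch in Python: the text does not define IDENTIFIER, so the ASCII pattern used here is an assumption, and the helper names `is_name_path` and `split_name_path` are hypothetical.

```python
import re

# Assumption: IDENTIFIER is an ASCII letter or underscore followed by letters,
# digits, or underscores. The text above does not define IDENTIFIER.
IDENTIFIER = r"[A-Za-z_][A-Za-z0-9_]*"
NAME_PATH = re.compile(rf"{IDENTIFIER}(\.{IDENTIFIER})*")


def is_name_path(text: str) -> bool:
    """Return True if the whole string is a dot-separated name path."""
    return NAME_PATH.fullmatch(text) is not None


def split_name_path(text: str) -> list[str]:
    """Split a valid name path into its identifiers."""
    if not is_name_path(text):
        raise ValueError(f"not a name path: {text!r}")
    return text.split(".")
```

Under these assumptions, `Geometry.TwoDimensional.Circle` is accepted, while a path with an empty segment such as `Geometry..Circle` is rejected.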
Name conflicts are addressed by name lookup.
package syntax
The package keyword's syntax may be loosely expressed as a regular expression:
package IDENTIFIER (library STRING)? (api|impl);
For example:
package Geometry library "Objects/FourSides" api;
Breaking this apart:
The identifier after the package keyword, Geometry, is the package
name and will prefix both library and namespace paths.
The package keyword also declares a package entity matching the
package name. A package entity is almost identical to a namespace
entity, except with some package/import-specific handling. In other
words, if the file declares struct Line, that may be used from within
the file as both Line directly and Geometry.Line
using the Geometry package entity created by the package keyword.
When the library keyword is specified, it sets the name of the
library within the package. In this example, the
Geometry//Objects/FourSides library will be used.
The api keyword indicates this is an API file as described
under libraries. If it instead had impl, this would be an
implementation file.
Because the package keyword must be specified exactly once in all files, there
are a couple of important and deliberate side effects:
package Geometry api;. This could be considered equivalent to
package Geometry library "" api;, although we should not allow that
specific syntax as error-prone.
package statement.
Files contributing to the Geometry//Objects/FourSides library must all start
with package Geometry library "Objects/FourSides", but will differ on
api/impl types.
Library names may also be referred to as PACKAGE//LIBRARY as shorthand in
text. PACKAGE//default will refer to the name of the library used when no
library argument is specified, although PACKAGE may also be used in
situations where it is unambiguous that it still refers to the default library.
It's recommended that libraries use a single / for separators where desired,
in order to distinguish between the // of the package and / separating
library segments. For example, Geometry//Objects/FourSides uses a single /
to separate the Objects/FourSides library name.
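To make the package statement and its PACKAGE//LIBRARY shorthand concrete, here is a minimal Python sketch that parses a statement and derives the shorthand. It assumes ASCII identifiers and double-quoted library names without escapes, since the syntax is only described loosely; `PackageStatement` and `parse_package_statement` are hypothetical names, not part of any Carbon tooling.

```python
import re
from typing import NamedTuple

# Assumption: identifiers are ASCII and library names are non-empty
# double-quoted strings without escapes.
PACKAGE_RE = re.compile(
    r'package\s+(?P<package>[A-Za-z_][A-Za-z0-9_]*)'
    r'(?:\s+library\s+"(?P<library>[^"]+)")?'
    r'\s+(?P<kind>api|impl)\s*;'
)


class PackageStatement(NamedTuple):
    package: str
    library: str  # "default" when no library argument is given
    kind: str     # "api" or "impl"

    def shorthand(self) -> str:
        """Return the PACKAGE//LIBRARY shorthand used in text."""
        return f"{self.package}//{self.library}"


def parse_package_statement(line: str) -> PackageStatement:
    """Parse one package statement, rejecting an empty library name."""
    match = PACKAGE_RE.fullmatch(line.strip())
    if match is None:
        raise ValueError(f"not a package statement: {line!r}")
    return PackageStatement(
        match["package"], match["library"] or "default", match["kind"]
    )
```

With this sketch, `package Geometry library "Objects/FourSides" api;` yields the shorthand `Geometry//Objects/FourSides`, `package Geometry api;` yields `Geometry//default`, and the disallowed `library ""` form raises an error.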
Because the package also declares a namespace entity with the same name, conflicts with the package name are possible. We do not support packages providing entities with the same name as the package.
For example, this is a conflict for DateTime:
package DateTime api;
struct DateTime { ... };
This declaration is important for implementation files, which implicitly import the library's API, because it keeps the package name as an explicit entity in source files.
Note that imported name conflicts are handled differently.
Every Carbon library consists of one or more files. Each Carbon library has a primary file that defines its API, and may optionally contain additional files that are implementation.
package will have api. For example,
package Geometry library "Shapes" api;
.carbon extension. They must not have a
.impl.carbon extension.
package will have impl. For example,
package Geometry library "Shapes" impl;.
.impl.carbon extension.
api imports.
The difference between API and implementation will act as a form of access control. API files must compile independently of implementation, only importing from APIs from other libraries. API files are also visible to all files and libraries for import. Implementation files only see API files for import, not other implementation files.
When any file imports a library's API, it should be expected that the transitive closure of imported files from the primary API file will be a compilation dependency. The size of that transitive closure affects compilation time, so libraries with complex implementations should endeavor to minimize their API imports.
Libraries also serve as a critical unit of compilation. Dependencies between libraries must be clearly marked, and the resulting dependency graph will allow for separate compilation.
In order to actually be part of a library's API, entities must both be in the
API file and explicitly marked as an API. This is done using the api keyword,
which is only allowed in the API file. For example:
package Geometry library "Shapes" api;
// Circle is marked as an API, and will be available to other libraries as
// Geometry.Circle.
api struct Circle { ... }
// CircleHelper is not marked as an API, and so will not be available to other
// libraries.
fn CircleHelper(Circle circle) { ... }
// Only entities in namespaces should be marked as an API, not the namespace
// itself.
namespace Operations;
// Operations.GetCircumference is marked as an API, and will be available to
// other libraries as Geometry.Operations.GetCircumference.
api fn Operations.GetCircumference(Circle circle) { ... }
This means that an API file can contain all implementation code for a library. However, separate implementation files are still desirable for a few reasons:
Use of the api keyword is not allowed within files marked as impl.
The compilation graph of Carbon will generally consist of api files depending
on each other, and impl files depending only on api files. Compiling a given
file requires compiling the transitive closure of api files first.
Parallelization of compilation is then limited by how large that transitive
closure is, in terms of total volume of code rather than quantity. This also
affects build cache invalidation.
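The transitive closure mentioned above can be sketched as a simple graph walk. This Python example assumes a hypothetical mapping from each library (in PACKAGE//LIBRARY shorthand) to the libraries its api file imports; the library names in the usage note are made up for illustration.

```python
def api_closure(library: str, api_imports: dict[str, set[str]]) -> set[str]:
    """Return every library whose api file must be compiled before `library`."""
    seen: set[str] = set()
    stack = list(api_imports.get(library, ()))
    while stack:
        dep = stack.pop()
        if dep in seen:
            continue  # already visited through another import path
        seen.add(dep)
        stack.extend(api_imports.get(dep, ()))
    return seen


# Hypothetical dependency graph between api files.
EXAMPLE_GRAPH = {
    "Caller//default": {"Geometry//Shapes"},
    "Geometry//Shapes": {"Math//default"},
    "Math//default": {"Math//Trigonometry"},
    "Math//Trigonometry": set(),
}
```

Here `api_closure("Caller//default", EXAMPLE_GRAPH)` contains all three other libraries, which is why a small change deep in the graph can invalidate many cached builds.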
In order to maximize opportunities to improve compilation performance, we will
encourage granular libraries. Conceptually, we want libraries to be very small,
possibly containing only a single class. The choice of only allowing a single
api file per library should help encourage developers to write small
libraries.
Any entity may be marked with api except for namespace and package entities.
That is, api namespace Sha256; is invalid code. Instead, namespaces are
implicitly exported based on the name paths of other entities marked as api.
For example, given this code:
package Checksums library "Sha" api;
namespace Sha256;
api fn Sha256.HexDigest(Bytes data) -> String { ... }
Calling code may look like:
package Caller api;
import Checksums library "Sha";
fn Process(Bytes data) {
...
var String digest = Checksums.Sha256.HexDigest(data);
...
}
In this example, the Sha256 namespace is exported as part of the API
implicitly.
The import keyword supports reusing code from other files and libraries. The
import keyword's syntax may be loosely expressed as a regular expression:
import IDENTIFIER (library STRING)?;
An import declares a package entity named after the imported package, and makes
api-tagged entities from the imported library available through it. The full name path
is a concatenation of the names of the package entity, any namespace entities
applied, and the final entity addressed. Child namespaces or entities may be
aliased if desired.
For example, given a library:
package Math api;
namespace Trigonometry;
api fn Trigonometry.Sin(...);
Calling code would import it and use it like:
package Geometry api;
import Math;
fn DoSomething() {
...
Math.Trigonometry.Sin(...);
...
}
Repeat imports from the same package reuse the same package entity. For example,
this produces only one Math package entity:
import Math;
import Math library "Trigonometry";
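The sharing of one package entity across repeated imports can be modeled as a mapping keyed by package name. The Python sketch below assumes ASCII identifiers and quoted library names as in the examples, and uses "default" for an import with no library argument; `package_entities` is a hypothetical helper.

```python
import re
from collections import defaultdict

# Assumption: library names are non-empty double-quoted strings, as in the examples.
IMPORT_RE = re.compile(
    r'import\s+(?P<package>[A-Za-z_][A-Za-z0-9_]*)'
    r'(?:\s+library\s+"(?P<library>[^"]+)")?\s*;'
)


def package_entities(import_lines: list[str]) -> dict[str, set[str]]:
    """Group imported libraries under one entry per package name."""
    entities: defaultdict[str, set[str]] = defaultdict(set)
    for line in import_lines:
        match = IMPORT_RE.fullmatch(line.strip())
        if match is None:
            raise ValueError(f"not an import statement: {line!r}")
        entities[match["package"]].add(match["library"] or "default")
    return dict(entities)
```

Passing the two imports above produces a single `Math` key whose value holds both the default library and `Trigonometry`.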
Entities defined in the current file may be used without mentioning the package prefix. However, other symbols from the package must be imported and accessed through the package namespace just like symbols from any other package.
For example:
package Geometry api;
// This is required even though it's still in the Geometry package.
import Geometry library "Shapes";
// Circle must be referenced using the Geometry namespace of the import.
fn GetArea(Geometry.Circle c) { ... }
Namespaces offer named paths for entities. Namespaces may be nested. Multiple
libraries may contribute to the same namespace. In practice, packages may have
namespaces such as Testing containing entities that benefit from an isolated
space but are present in many libraries.
The namespace keyword's syntax may loosely be expressed as a regular
expression:
namespace NAME_PATH;
The namespace keyword declares a namespace entity. The namespace is applied to
other entities by including it as a prefix when declaring a name. For example:
package Time;
namespace Timezones.Internal;
struct Timezones.Internal.RawData { ... }
fn ParseData(Timezones.Internal.RawData data);
A namespace declaration adds the first identifier in the name path as a name in
the file's namespace. In the above example, after declaring
namespace Timezones.Internal;, Timezones is available as an identifier and
Internal is reached through Timezones.
Namespaces may exist on imported package entities, in addition to being declared in the current file. However, even if the namespace already exists in an imported library from the current package, the namespace must still be declared locally in order to add symbols to it.
For example, if the Geometry//Shapes/ThreeSides library provides the
Geometry.Shapes namespace, this code is still valid:
package Geometry library "Shapes/FourSides" api;
import Geometry library "Shapes/ThreeSides";
// This does not conflict with the existence of `Geometry.Shapes` from
// `Geometry//Shapes/ThreeSides`, even though the name path is identical.
namespace Shapes;
// This requires the above 'namespace Shapes' declaration. It cannot use
// `Geometry.Shapes` from `Geometry//Shapes/ThreeSides`.
struct Shapes.Square { ... };
Carbon's alias keyword will support aliasing namespaces. For example, this would be valid code:
namespace Timezones.Internal;
alias TI = Timezones.Internal;
struct TI.RawData { ... }
fn ParseData(TI.RawData data);
Library name conflicts should not occur, because it's expected that a given package is maintained by a single organization. It's the responsibility of that organization to maintain unique library names within their package.
A package name conflict occurs when two different packages use the same name,
such as two packages named Stats. Compared with library names, package name
conflicts are more likely because two organizations may independently choose identical names.
We will encourage a unique package naming scheme, such as maintaining a name
server for open source packages. Conflicts can also be addressed by renaming one
of the packages, either at the source, or as a local modification.
We do need to address the case of package names conflicting with other entity
names. It's possible that a pre-existing api entity will conflict with a new
import, and that the api is infeasible to rename due to existing callers.
Alternately, the api entity may be using an idiomatic name that it would
contradict naming conventions to rename. In either case, this conflict may exist
in a single file without otherwise affecting users of the API. This will be
addressed by name lookup.
These are potential refactorings that we consider important to make it easy to automate.
Imports will frequently need to be updated as part of refactorings.
When code is deleted, it should be possible to parse the remaining code, parse the imports, and determine which entities in imports are referred to. Unused imports can then be removed.
When code is moved, it's similar to deletion in the originating file. For the destination file, the moved code should be parsed to determine which entities it referred to from the originating file's imports, and these will need to be included in the destination file: either reused if already present, or added.
When new code is added, existing imports can be checked to see if they provide the symbol in question. There may also be heuristics which can be implemented to check build dependencies for where imports should be added from, such as a database of possible entities and their libraries. However, adding references may require manually adding imports.
api and impl files
Move an implementation of an API from an api file to an impl file, while
leaving a declaration behind.
Split an api and impl file.
Move an implementation of an API from an impl file to an api file.
Combine an api and impl file.
Remove the api label from a declaration.
Add the api label to a declaration.
Move a non-api-labeled declaration from an api file to an impl file.
impl file that now contains
it. Search for other callers within the library, and fix them first.
Move a non-api-labeled declaration from an impl file to an api file.
Move a declaration and implementation from one impl file to another.
impl file, and either move
them too, or fix them first.
Rename a package.
Move an api-labeled declaration and implementation between different
packages.
Move an api-labeled declaration and implementation between libraries in
the same package.
Rename a library.
api-labeled
declaration and implementation between libraries in the same package.
Move a declaration and implementation from one namespace to another.
Rename a namespace.
Rename a file, or move a file between directories.
We expect that most code should use a package and library, but avoid specifying namespaces beneath the package. The package name itself should typically be sufficient distinction for names.
Child namespaces create longer names, which engineers will dislike typing. Based on experience, we expect to start seeing aliasing even at name lengths around six characters long. With longer names, we should expect more aliasing, which in turn will reduce code readability because more types will have local names.
We believe it's feasible for even large projects to collapse namespaces down to a top level, avoiding internal tiers of namespaces.
We understand that child namespaces are sometimes helpful, and will robustly support them for that. However, we will model code organization to encourage fewer namespaces.
We use a few possibly redundant markers for packages and libraries:
package keyword requires one of api and impl, rather than
excluding either or both.
api versus impl choice.
import keyword requires the full library.
These choices are made to assist human readability and tooling:
api versus impl makes it easier for both humans and
tooling to determine what to expect.
These open questions are expected to be revisited by future proposals.
Currently, we're using .carbon and .impl.carbon. In the future, we may want
to change the extension, particularly because Carbon may be renamed.
There are several other possible extensions / commands that we've considered in coming to the current extension:
.carbon: This is an obvious and unsurprising choice, but also quite long
for a file extension.
.6c: This sounds a little like 'sexy' when read aloud.
.c6: This seems a weird incorrect ordering of the atomic number and has a
bad, if obscure, Internet slang association.
.cb or .cbn: These collide with several acronyms and may not be
especially memorable as referring to Carbon.
.crb: This has a bad Internet slang association.
Currently, we do not support cross-language imports. In the future, we will likely want to support imports from other languages, particularly for C++ interoperability.
Although we're not designing this right now, it could fit into the proposed syntax. For example:
import Cpp file("myproject/myclass.h");
fn MyCarbonCall(Cpp.MyProject.MyClass x);
Currently, we don't support any kind of package management with imports. In the future, we may want to support tagging imports with a URL that identifies the repository where that package can be found. This can be used to help drive package management tooling and to support providing a non-name identity for a package that is used to enable handling conflicted package names.
Although we're not designing this right now, it could fit into the proposed syntax. For example:
import Carbon library "Utilities"
url("https://github.com/carbon-language/carbon-libraries");
Similar to api and impl, we may eventually want a type like test. This
should be part of a larger testing plan.
Right now, we only allow a single identifier for the package name. We could allow a full name path without changing syntax.
Advantages:
Database.Client and Database.Server.
carbon-language/carbon-lang could become carbon_language.carbon_lang.
Disadvantages:
winapi-build and winapi-util.
At present, we are choosing to use single-identifier package names because of the lack of clear advantage towards a more complex name path.
package
Right now, we plan to refer to the package containing the current file by name.
What's important in the below example is the use of Math.Stats:
package Math library "Stats" api;
api struct Stats { ... }
struct Quantiles {
fn Stats();
fn Build() {
...
var Math.Stats b;
...
}
}
We could instead use package as an identifier within the file to refer to the
package, giving package.Stats.
It's important to consider how this behaves for impl files, which expect an
implicit import of the API. In other words, for impl files, this can be
compared to an implicit import Math; versus an implicit
import Math as package;. However, there may also be explicit imports from
the package, such as import Math library "Trigonometry";, which may or may not
be referable to using package, depending on the precise option used.
Advantages:
DateTime,
package.DateTime is unambiguous, whereas DateTime.DateTime could be
confusing.
Disadvantages:
package keyword shifts the balance for long
package names by placing less burden on the package owner.
package keyword with a significantly different meaning,
changing from a prefix for the required declaration at the top of the file,
to an identifier within the file.
DateTime in the package
DateTime.
library keyword has been suggested to address concerns
with package. Given that library is an argument to package, it
does not significantly change the con.
package Math; import Geometry;, and imports from the current package, such
as package Math; import Math library "Stats";.
package to be used to refer to all imports from
Math, including the current file. This gives consistent treatment for
the Math package, but not for other imports. In other words,
developers will always write package.Stats from within Math, and
Math.Stats will only be written in other packages.
package be used for the current library's entities,
but not other imports. This gives consistent treatment for imports, but
not for the Math package as a whole. In other words, developers will
only write package.Stats when referring to the current library,
whether in api or impl files. Math.Stats will be used elsewhere,
including from within the Math package.
package or the full package name to refer to
the current package. This allows code to say either package or Math,
with no enforcement for consistency. In other words, both
package.Stats and Math.Stats are valid within the Math package.
Because name lookup can be expected to address the underlying issue differently, we will not add a feature to support name lookup. We also don't want package owners to name their packages things that even they find difficult to type. As part of pushing library authors to consider how their package will be used, we require them to specify the package by name where desired.
library keyword from package and import
Right now, we have syntax such as:
package Math library "Median" api;
package Math library "Median" namespace Stats api;
import Math library "Median";
We could remove library, resulting in:
package Math.Median api;
package Math.Median namespace Math.Stats api;
import Math.Median;
Advantages:
Disadvantages:
package Math.Median namespace Math.Stats, could instead use
Stats, or this.Stats to elide the package name.
Math.Median, with
namespace names, such as Math.Stats.
package Math.Median api; uses the Math namespace,
the presence of Median with the same namespace syntax obfuscates the
actual namespace.
package Math.Median namespace Math api is necessary to
use the Math namespace, requiring the namespace keyword makes it
difficult to put multiple libraries in the top-level namespace.
As part of avoiding confusion between libraries and namespaces, we are declining this alternative.
In other languages, a "package" is equivalent to what we call the name path
here, which includes the namespace. We may want to rename the package
keyword to avoid conflicts in meaning.
Alternative names could be 'bundle', 'universe', or something similar to Rust's 'crates'; perhaps 'compound' or 'molecule'.
Advantages:
Disadvantages:
package also overlaps a fair amount, and we would lose that
context.
Several languages create a strict association between the method for pulling in an API and the path to the file that provides it. For example:
#include refers to specific files without any abstraction.
#include "PATH/TO/FILE.h" means there's a file
PATH/TO/FILE.h.
package and import both reflect file system structure.
import PATH.TO.FILE; means there's a file
PATH/TO/FILE.java.
import requires matching file system structure.
import PATH.TO.FILE means there's a file
PATH/TO/FILE.py.
import refers to specific files.
import {...} from 'PATH/TO/FILE'; means there's a file
PATH/TO/FILE.ts.
For contrast:
package uses an arbitrary name.
import "PATH/TO/NAME" means there is a directory
PATH/TO that contains one or more files starting with package NAME.
In Carbon, we are using a strict association to say that
import PACKAGE library "PATH/TO/LIBRARY" means there is a file
PATH/TO/LIBRARY.carbon under some package root.
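The association between a library name and its API file can be illustrated with a small path computation. This Python sketch assumes the package root is a directory supplied by the build system, which the text does not specify; `api_file_for` is a hypothetical helper.

```python
from pathlib import PurePosixPath


def api_file_for(library: str, package_root: str) -> PurePosixPath:
    """Map a library name such as "Objects/FourSides" to its API file path.

    Assumption: `package_root` is the directory holding the package's sources.
    """
    if not library or library.startswith("/") or ".." in library.split("/"):
        raise ValueError(f"unsupported library name: {library!r}")
    return PurePosixPath(package_root) / f"{library}.carbon"
```

For instance, with a root of `geometry`, the library `Objects/FourSides` would map to `geometry/Objects/FourSides.carbon`.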
Advantages:
project.carbon and Project.carbon are
conflicting filenames. This is exacerbated by paths, wherein a file
config and a directory Config/ would conflict, even though this
would be a valid structure on Unix-based filesystems.
Disadvantages:
package keyword by inferring related information
from the file system path.
We are choosing to have some association between the file system path and
library for API files to make it easier to find a library's files. We are not
getting rid of the package keyword because we don't want to become dependent
on file system structures, particularly as it would increase the complexity of
distributed builds.
We propose to not allow exporting namespaces as part of library APIs. We could either allow or require exporting namespaces. For example:
package Checksums;
api namespace Sha256;
While this approach would mainly be syntactic, a more pragmatic use of this
would be in refactoring. It implies that an aliased namespace could be marked as
an api. For example, the below could be used to share an import's full
contents:
package Translator library "Interface" api;
import Translator library "Functions" as TranslatorFunctions;
api alias Functions = TranslatorFunctions;
Advantages:
api entities.
Disadvantages:
api entities.
api that doesn't
contain any api entities.
This alternative is declined because it's not sufficiently clear that it would be helpful, and it could impair refactoring.
The current proposal is that implementation files in a library implicitly import their API, and that they cannot import other implementation files in the same library.
We could instead allow importing implementation files from within the same library. There are two ways this could be done:
We could add a syntax for importing symbols from other files in the same library. This would make it easy to identify a directed acyclic graph between files in the library. For example:
package Geometry;
import file("point.6c");
We could automatically detect when symbols from elsewhere in the library are referenced, given an import of the same library. For example:
package Geometry;
import this;
Advantages:
Disadvantages:
api and impl, particularly with a single
api, has been to mirror C++ .h and .cc. Wherein a .cc #include-ing
other .cc files is undesirable, allowing an impl to import another impl
could be considered similarly.
The problems with these approaches, and encouragement towards small libraries, are how we reach the current approach of only importing APIs, and automatically.
Examples are using / to separate significant terms in library names, and //
to separate the package name in shorthand. For example,
package Time library "Timezones/Internal"; with shorthand
Time//Timezones/Internal.
Note that, because the library is an arbitrary string and shorthand is not a language semantic, this won't affect much. However, users should be expected to treat examples as best practice.
We could instead use . for library names and / for packages, such as
Time/Timezones.Internal.
Advantages:
Disadvantages:
People like /, so we're going with /.
We could stick to single word libraries in examples, such as replacing
library "Algorithms/Distance" with library "Distance".
Advantages:
Disadvantages:
We might list this as a best practice, and have Carbon only expose libraries following it. However, some hierarchy from users can be expected, and so it's worthwhile to include a couple examples to nudge users towards consistency.
We could remove the distinction between API and implementation files.
Advantages:
Disadvantages:
api/impl hierarchy gives a structure for compilation, if
there are multiple files we will likely need to provide a different
structure, perhaps explicit file imports, to indicate intra-library
compilation dependencies.
Requiring users to manage the api/impl split allows us to speed up
compilation for large codebases. This is important for large codebases, and
shouldn't directly affect small codebases that choose to only use api files.
We could try to address the problems with collapsing API and implementation files by automatically generating an API file from the input files for a library.
For example, it may preprocess files to split out an API, reducing the number of imports propagated for actual APIs. For example:
api declarations within the api file.
Even under the proposed model, compilation will do some of this work as an
optimization. However, determining which imports are referenced requires
compilation of all imports that may be referenced. When multiple libraries are
imported from a single package, it will be ambiguous which imports are used
until all have been compiled. This will cause serialization of compilation that
can be avoided by having a developer split out the impl, either manually or
with developer tooling.
The impl files may make it easier to read code, but they will also allow for
better parallelism than api files alone can. This does not mean the compiler
will or will not add optimizations -- it only means that we cannot wholly rely
on optimizations by the compiler.
Automatically generating the API separation would only partly mitigate the serialization of compilation caused by collapsing file and library concepts. Most of the build performance impact would still be felt by large codebases, and so the mitigation does not significantly improve the alternative.
We could collapse the file and library concepts. What this implies is:
This has similar advantages and disadvantages to collapse API and implementation file concepts. Differences follow.
Advantages:
api files, while others will
always use impl files.
Disadvantages:
As with collapse API and implementation file concepts, we consider the split to be important for large codebases. The additional advantages of a single-file restriction do not outweigh the disadvantages surrounding build performance.
We could only have packages, with no libraries. Some other languages do this; for example, in Node.JS, a package is often similar in size to what we currently call a library.
If packages became larger, that would lead to compile-time bottlenecks. Thus, if Carbon allowed large packages without library separation, we would undermine our goals for fast compilation. Even if we combined the concepts, we should expect it's by turning the "package with many small libraries" concept into "many small packages".
Advantages:
Disadvantages:
BoostGeometry and child libraries like
algorithms-distance under the proposed approach. Under the alternative
approach, it would use either a monolithic package that could create
compile-time bottlenecks, or packages like
BoostGeometryAlgorithmsDistance for uniqueness.
We prefer to keep the library separation to enable better hierarchy for large codebases, plus encouraging small units of compilation. It's still possible for people to create small Carbon packages, without breaking them into multiple libraries.
In contrast to the "collapse the library concept into packages" alternative, we could have libraries without packages. Under this model, we still have libraries of similar granularity to what's proposed. However, there is no package grouping for them: there are only libraries which happen to share a namespace.
References to imports from other top-level namespaces would need to be prefixed
with a '.' in order to make it clear which symbols were from imports.
For example, suppose Boost is a large system that cannot be distributed to
users in a single package. As a result, Random functionality is in its own
distribution package, with multiple libraries contained. The difference between
approaches looks like:
package vs library:
Proposed: package BoostRandom;
Alternative: library "Boost/Random" namespace Boost;
Proposed: package BoostRandom library "Uniform";
Alternative: library "Boost/Random.Uniform" namespace Boost;
Proposed: package BoostRandom namespace Distributions;
Alternative: library "Boost/Random.Uniform" namespace Boost.Random.Distributions;
Proposed: package BoostRandom library "Uniform" namespace Distributions;
Alternative: library "Boost/Random.Uniform" namespace Boost.Random.Distributions;
import changes:
Proposed: import BoostRandom;
Alternative: import "Boost/Random";
Proposed: import BoostRandom library "Uniform";
Alternative: import "Boost/Random.Uniform";
import under both approaches.
BoostRandom.UniformDistribution
Boost.Random namespace: Uniform
Boost package but a different namespace: Random.Uniform
Boost package: .Boost.Random.Uniform
We assume that the compiler will enforce that the root namespace must either
match or be a prefix of the library name, followed by a / separator. For
example, Boost in the namespace Boost.Random.Uniform must either match a
library "Boost" or prefix as library "Boost/..."; library "BoostRandom"
does not match because it's missing the / separator.
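The matching rule above can be sketched as a small check. This is only an illustration in Python, not part of any Carbon tooling; the function name `root_namespace_matches_library` is hypothetical, and the sketch assumes the root namespace is everything before the first `.` in the namespace path.

```python
def root_namespace_matches_library(namespace: str, library: str) -> bool:
    """Check the assumed rule: the root namespace must equal the library
    name, or be a prefix of it that is followed by a "/" separator.

    Hypothetical helper for illustration only.
    """
    # Assumption: the root namespace is the first dot-separated component.
    root = namespace.split(".", 1)[0]
    return library == root or library.startswith(root + "/")


# Examples taken from the text above.
print(root_namespace_matches_library("Boost.Random.Uniform", "Boost"))
print(root_namespace_matches_library("Boost.Random.Uniform", "Boost/Random.Uniform"))
print(root_namespace_matches_library("Boost.Random.Uniform", "BoostRandom"))
```

The first two calls satisfy the rule, while `"BoostRandom"` fails because the character after the `Boost` prefix is not `/`.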
There are several approaches which might remove this duplication, but each has been declined due to flaws:
library "Boost/Random.Uniform"; imply namespace Boost.
However, we want name paths to use things listed as identifiers in files. We
specifically do not want to use strings to generate identifiers in order to
support understandability of code.
namespace Boost; syntax imply
library "Boost" namespace Boost;.
namespace and
other namespace keyword use. We could then rename the namespace
argument for library to something like file-namespace.
namespace Boost.Random; does. It may
create library "Boost/Random" because library "Boost.Random" would
not be legal, but the change in characters may in turn lead to developer
confusion.
. instead of /
as a separator, but that may lead to broader confusion about the
difference between libraries and namespaces.
Advantages:
import and package.
. on imported name paths can help increase readability by
making it clear they're from imports, so long as those imports aren't from
the current top-level namespace.
. optional for imports from the current top-level namespace
eliminates the boilerplate character when calling within the same library.
Disadvantages:
. to mark absolute paths may conflict with other
important uses, such as designated initializers and named parameters.
package BoostRandom library "Uniform";, they know installing a package
BoostRandom will give them the library. Declining this means that
users seeing library "Boost/Random.Uniform" will still need to
do research as to what package contains Boost/Random.Uniform to figure
out how to install it because that package may not be named Boost.
BoostRandom only adds to a namespace of the
same name. If a user is editing libraries in a package
BoostCustom, then BoostRandom may be treated as unmodifiable. An
IDE could optimize cache invalidation of BoostRandom at the
package level. As a result, if a user types BoostRandom. and
requests a tab completion, the system need only ensure that
libraries from the BoostRandom package are loaded for an accurate
result.
Boost.Random similarly adds to
the namespace Boost. However, if a user is editing libraries, the
IDE needs to support them adding to both Boost and MyProject
simultaneously. As a result, if a user types Boost. and requests a
tab completion, the system must have all libraries from all packages
loaded for an accurate result.
library and namespace forces
duplication between both, which would otherwise be handled by package.
namespace keyword.
. on imported name paths will be repeated frequently through
code, increasing overall verbosity, versus the package approach which only
affects import verbosity.
. optional for imports from the current top-level namespace
hides whether an API comes from the current library or an import.
We are declining this approach because we desire package separation, and because
of concerns that this will lead to an overall increase in verbosity due to the
preference for few child namespaces,
whereas this alternative benefits when namespace is specified more often.
We're using api and impl for file types, and have test as an open
question.
We've considered using interface instead of api, but that introduces a
terminology collision with interfaces in the type system.
We've considered dropping api from naming, but that creates a definition from
absence of a keyword. It would also be unusual for api to be omitted if both
impl and test were required. We prefer the more explicit
name.
We could spell out impl as implementation, but are choosing the abbreviation
for ease of typing. We also don't think it's an unclear abbreviation.
We expect impl to be used for implementations of interface. This isn't quite
as bad as if we used interface instead of api because of the api export
syntax on entities, such as api fn DoSomething(), which could create
ambiguities as interface fn DoSomething(). It may still confuse people to see
an interface impl in an api file. However, we're touching on related
concepts and don't see a great alternative.
We could consider more function-like syntax for import, and possibly also
package.
For example, instead of:
import Math library "Stats";
import Algebra as A;
We could do:
import("Math", "Stats").Math;
alias A = import("Algebra").Algebra;
Or some related variation.
Advantages:
alias for language consistency.
library.
Disadvantages:
.Math and .Algebra above. However, this complicates the
resulting syntax.
The preference is for keywords.
An implicit reason for keeping code in an api file is that it makes it
straightforward to inline code from there into callers.
We could explicitly encourage inlining from impl files as well, making the
location of code unimportant during compilation. Alternately, we could add an
inline file type which explicitly supports separation of inline code from the
api file.
Advantages:
Disadvantages:
impl files to determine what can be inlined from
the api file, leading to the transitive closure dependency problems which
impl files are intended to avoid.
We expect to only support inlining from api files in order to avoid confusion
about dependency problems.
We currently have no special syntax for library-private APIs. However,
non-exported APIs are essentially library-private, and may be in the api file.
It's been suggested that we could either provide a special syntax or a new file
type, such as shared_impl, to support library-private APIs.
Advantages:
Disadvantages:
api file, the dependencies are still in the
transitive closure of client libraries, and any separation may confuse
users about the downsides of the extra dependencies.
impl files, then they could be in the impl file if
there's only one, or shared from a separate library.
At this point in time, we prefer not to provide specialized access controls for library-private APIs.
At present, we plan to have api versus impl as a file type, and also
.carbon versus .impl.carbon as the file extension. We chose to use both
together, rather than one or the other, because we expect some parties to
strongly want file content to be sufficient for compilation, while others will
want file extensions to be meaningful for the syntax split.
Instead of the file type split, we could drift further and instead have APIs in any file in a library, using the same kind of API markup.
Advantages:
Disadvantages:
The proposal also presently suggests a single API file. Under an explicit API file approach, we could still allow multiple API files.
Advantages:
Disadvantages:
We particularly want to discourage large libraries, and so we're likely to retain the single API file limit.
We're proposing strings for library names. We've discussed also using name paths
(My.Library) and also restricting to single identifiers (Library).
Advantages:
Disadvantages:
We've decided to use strings primarily because we want to draw the distinction that a library is not something that's used when referring to an entity in code.
Rather than requiring an import keyword per line, we could support block
imports, as can be found in languages like Go.
In other words, instead of:
import Math;
import Geometry;
We could have:
imports {
  Math,
  Geometry,
}
Advantages:
Disadvantages:
grep.
One concern has been that a mix of import and imports syntax would be
confusing to users: we should only allow one.
This alternative has been declined because retyping import statements is
low-cost, and grep is useful.
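The grep argument can be made concrete with a rough sketch. The Python below imitates a line-oriented search; the pattern `IMPORT_LINE` and the helper `grep_imports` are hypothetical and only model the two import spellings shown above, not a real Carbon parser.

```python
import re

# Hypothetical pattern for the one-import-per-line form shown above.
IMPORT_LINE = re.compile(r"^import\s+(\w+)")


def grep_imports(source):
    """Return imported package names found by matching one line at a time,
    which is how a grep-style search sees the file."""
    names = []
    for line in source.splitlines():
        match = IMPORT_LINE.match(line.strip())
        if match:
            names.append(match.group(1))
    return names


PER_LINE = "import Math;\nimport Geometry;\n"
BLOCK = "imports {\n  Math,\n  Geometry,\n}\n"

print(grep_imports(PER_LINE))  # both package names are found
print(grep_imports(BLOCK))     # nothing is found by the same line-based search
```

With one `import` per line, each dependency is visible to a single-line search. In the block form, the package names sit on lines that carry no keyword, so a search would need to track multi-line context.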
We could allow block imports of libraries from the same package. For example:
import Containers libraries({
  "FlatHashMap",
  "FlatHashSet",
})
The result of this api alias allowing Containers.HashSet() to work
regardless of whether HashSet is in "HashContainers" or "Internal" may be
clearer if both import Containers statements were a combined
import Containers libraries({"HashContainers", "Internal"});.
The advantages/disadvantages are similar to block imports. Additional advantages/disadvantages are:
Advantages:
alias of the package
Containers is easier to understand as affecting all libraries.
Disadvantages:
library and libraries syntax, it's two ways of doing the
same thing.
libraries, removing library,
but that diverges from package's library syntax.
This alternative has been declined for similar reasons to block imports; the additional advantages/disadvantages don't substantially shift the cost-benefit argument.
Carbon imports require specifying individual names to import. We could support
broader imports, for example by pulling in all names from a library. In C++, the
#include preprocessor directive even supports inclusion of arbitrary code. For
example:
import Geometry library "Shapes" names *;
// Triangle was imported as part of "*".
fn Draw(Triangle x) { ... }
Advantages:
Disadvantages:
We particularly value the parser benefits of knowing which identifiers are being imported, and so we require individual names for imports.
We could allow direct imports of names from libraries. For example, under the current setup we might see:
import Math library "Stats";
alias Median = Stats.Median;
alias Mean = Stats.Mean;
We could simplify this syntax by augmenting import:
import Math library "Stats" name Median;
import Math library "Stats" name Mean;
Or more succinctly with block imports of names:
import Math library "Stats" names {
  Median,
  Mean,
}
Advantages:
alias step.
Disadvantages:
We could allow a short syntax for imports from the current library. For example,
this code imports Geometry.Shapes:
package Geometry library "Operations" api;
import library "Shapes";
Advantages:
Disadvantages:
grep.
import Geometry library "Shapes" from within
Geometry, then we end up with a different inconsistency.
Overall, consistent with the decision to disallow block imports, we are choosing to require the package name.
We are providing entity-level namespaces. This is likely necessary to support migrating C++ code, at a minimum. It's been discussed whether we should also support file-level namespaces.
For example, this is the current syntax for defining Geometry.Shapes.Circle:
package Geometry library "Shapes" api;
namespace Shapes;
struct Shapes.Circle;
This is the proposed alternative syntax for defining Geometry.Shapes.Circle,
and would put all entities in the file under the Shapes namespace:
package Geometry library "Shapes" namespace Shapes api;
struct Circle;
Advantages:
Disadvantages:
package keyword.
namespace keyword
in multiple different ways.
We are choosing not to provide this for now because we want to provide the minimum necessary support, and then see if it works out. It may be added later, but it's easier to add features than to remove them.
Instead of including additional namespace information per-name, we could have scoped namespaces, similar to C++. For example:
namespace absl {
  namespace numbers_internal {
    fn SafeStrto32Base(...) { ... }
  }
  fn SimpleAtoi(...) {
    ...
    return numbers_internal.SafeStrto32Base(...);
    ...
  }
}
Advantages:
Disadvantages:
There are other ways to address the con, such as adding syntax to indicate the end of a namespace, similar to block comments. For example:
{ namespace absl
  { namespace numbers_internal
    fn SafeStrto32Base(...) { ... }
  } namespace numbers_internal
  fn SimpleAtoi(...) {
    ...
    return numbers_internal.SafeStrto32Base(...);
    ...
  }
} namespace absl
While we could consider such alternative approaches, we believe the proposed contextless namespace approach is better, as it reduces information that developers will need to remember when reading/writing code.
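To illustrate what "information that developers will need to remember" can mean in practice, here is a toy Python comparison. It is a sketch under simplifying assumptions: the two mini-notations are hypothetical stand-ins (the contextless lines follow the `struct Shapes.Circle;` style by writing the full name path on each declaration), and neither function is a real parser.

```python
def qualified_names_scoped(lines):
    """Resolve function names in a brace-scoped layout.

    The resolver must carry a stack of open blocks from line to line,
    which mirrors what a reader has to keep in mind.
    """
    stack = []  # one entry per open brace: a namespace name, or None
    names = []
    for raw in lines:
        line = raw.strip()
        if line.startswith("namespace ") and line.endswith("{"):
            stack.append(line[len("namespace "):-1].strip())
        elif line.startswith("fn "):
            scope = [name for name in stack if name is not None]
            names.append(".".join(scope + [line[3:].split("(")[0]]))
            if line.endswith("{"):
                stack.append(None)  # function body opened on this line
        elif line == "}":
            stack.pop()
    return names


def qualified_names_contextless(lines):
    """Each declaration states its own name path, so no state is carried."""
    return [
        line.strip()[3:].split("(")[0]
        for line in lines
        if line.strip().startswith("fn ")
    ]


SCOPED = [
    "namespace absl {",
    "namespace numbers_internal {",
    "fn SafeStrto32Base(...) { ... }",
    "}",
    "fn SimpleAtoi(...) {",
    "...",
    "}",
    "}",
]

# Hypothetical contextless spelling of the same two declarations.
CONTEXTLESS = [
    "fn absl.numbers_internal.SafeStrto32Base(...) { ... }",
    "fn absl.SimpleAtoi(...) {",
    "...",
    "}",
]

print(qualified_names_scoped(SCOPED))
print(qualified_names_contextless(CONTEXTLESS))
```

Both calls produce the same two qualified names, but only the scoped version needs the block stack to get there.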