Important Carbon goals for code and name organization are:
Tooling support is important for Carbon, including the possibility of a package manager.
Developer tooling, including both IDEs and refactoring tools, is expected to exist and be well-supported.
Software and language evolution:
We should support libraries adding new structs, functions or other identifiers without those new identifiers being able to shadow or break existing users that already have identifiers with conflicting names.
We should make it easy to refactor code, including moving code between files. This includes refactoring both by humans and by developer tooling.
Fast and scalable development:
It should be easy for developer tooling to parse code, without needing to parse imports for context.
Structure should be provided for large projects to opt into features which will help maintain scaling of their codebase, while not adding burdens to small projects that don't need it.
Carbon source files have a .carbon extension, such as
geometry.carbon. These files are the basic unit of compilation.
Each file begins with a declaration of which package it belongs in. The
package is the unit of distribution. The package name is a
single identifier, such as Geometry. An example API file in the Geometry
package would start with:
package Geometry api;
A tiny package may consist of a single library with a single file, and not use
any further features of the package keyword.
It is often useful to use separate files for the API and its implementation.
This may help organize code as a library grows, or to let the build system
distinguish between the dependencies of the API itself and its underlying
implementation. Implementation files allow for code to be extracted out from the
API file, while only being callable from other files within the library,
including both API and implementation files. Implementation files are marked
both by using the .impl.carbon file extension and by starting with:
package Geometry impl;
However, as a package adds more files, it will probably want to separate out
into multiple libraries.
A library is the basic unit of dependency. Separating code into multiple
libraries can speed up the overall build while also making it clear which code
is being reused. For example, an API file adding the library Shapes to the
Geometry package, or Geometry//Shapes in
shorthand, would start with:
package Geometry library "Shapes" api;
As code becomes more complex, and users pull in more code, it may also be
helpful to add namespaces to give related entities consistently structured
names. A namespace affects the name path
used when calling code. For example, with no namespace, if a Geometry package
defines Circle then the name path will be Geometry.Circle. However, it can
be named Geometry.TwoDimensional.Circle with a namespace; for example:
package Geometry library "Shapes" api;
namespace TwoDimensional;
struct TwoDimensional.Circle { ... };
This scaling of packages into libraries and namespaces is how Carbon supports both small and large codebases.
A different way to think of the sizing of packages and libraries is:
Abseil package.
Boost,
with dependencies on other libraries in Boost and potentially other
packages from Boost.
library "Algorithms", or include part of the path to reduce the
chance of name collisions, such as library "Geometry/Algorithms".
Packages may choose to expose libraries that expose unions of interfaces from other libraries within the package. However, doing so would also provide the transitive closure of build-time dependencies, and is likely to be discouraged in many cases.
The import keyword supports reusing code from other files and libraries.
For example, to use Geometry.Circle from the Geometry//Shapes library:
import Geometry library "Shapes";
fn Area(circle: Geometry.Circle) { ... };
The library keyword is optional for import, and its use should parallel that
of library on the package of the code being imported.
Every source file will consist of, in order:
A package directive.
import directives.
The source file body.
Comments and blank lines may be intermingled with these sections. Metaprogramming code may also be intermingled, so long as the generated code is consistent with the enforced ordering. Other types of code must be in the source file body.
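The ordering rule above can be sketched as a small line-based check. This is only an illustration, not part of any Carbon tool: `check_order` is a hypothetical helper, it assumes each directive sits on its own line, it treats `//` lines as comments, and it ignores metaprogramming entirely.

```python
def check_order(lines: list[str]) -> None:
    """Raise ValueError if package/import directives appear out of order."""
    seen_package = False
    in_body = False
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line or line.startswith("//"):
            continue  # comments and blank lines may appear in any section
        first = line.split()[0]
        if first == "package":
            if seen_package:
                raise ValueError(f"line {number}: second package directive")
            seen_package = True
        elif not seen_package:
            raise ValueError(f"line {number}: package directive must come first")
        elif first == "import":
            if in_body:
                raise ValueError(f"line {number}: import after the file body")
        else:
            in_body = True  # anything else belongs to the source file body
```

A file that imports before declaring its package, or imports after the first body declaration, is rejected; a file with only a package directive is accepted.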
Name paths are defined above as sequences of identifiers separated by dots. This syntax may be loosely expressed as a regular expression:
IDENTIFIER(\.IDENTIFIER)*
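As a rough illustration, the expression above can be tried out in Python. The definition of IDENTIFIER here (ASCII letters, digits, and underscores, not starting with a digit) is an assumption, since the text does not define it, and `split_name_path` is a hypothetical helper.

```python
import re

# Assumption: IDENTIFIER is an ASCII identifier; the design text does not say.
IDENTIFIER = r"[A-Za-z_][A-Za-z0-9_]*"
NAME_PATH = re.compile(rf"{IDENTIFIER}(?:\.{IDENTIFIER})*")


def split_name_path(text: str) -> list[str]:
    """Return the identifiers in a name path, or raise ValueError."""
    if NAME_PATH.fullmatch(text) is None:
        raise ValueError(f"not a name path: {text!r}")
    return text.split(".")
```

For instance, `split_name_path("Geometry.TwoDimensional.Circle")` yields the three identifiers in order, while `"Geometry..Circle"` is rejected.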
Name conflicts are addressed by name lookup.
package directives
The package directive's syntax may be loosely expressed as a regular
expression:
package IDENTIFIER (library STRING)? (api|impl);
For example:
package Geometry library "Objects/FourSides" api;
Breaking this apart:
The identifier following the package keyword, Geometry, is the package
name and will prefix both library and namespace paths.
The package keyword also declares a package entity matching the
package name. A package entity is almost identical to a namespace
entity, except with some package/import-specific handling. In other
words, if the file declares struct Line, that may be used from within
the file as both Line directly and Geometry.Line
using the Geometry package entity created by the package keyword.
When the library keyword is specified, it sets the name of the
library within the package. In this example, the
Geometry//Objects/FourSides library will be used.
The api keyword indicates this is an API file as described
under libraries. If it instead had impl, this would be an
implementation file.
Because every file must have exactly one package directive, there are a couple of
important and deliberate side effects:
A file with no library argument still belongs to a library, as in
package Geometry api;. This could be considered equivalent to
package Geometry library "" api;, although we should not allow that
specific syntax because it is error-prone.
package directive.
Files contributing to the Geometry//Objects/FourSides library must all start
with package Geometry library "Objects/FourSides", but will differ on
api/impl types.
Library names may also be referred to as PACKAGE//LIBRARY as shorthand in
text. PACKAGE//default will refer to the name of the library used when no
library argument is specified, although PACKAGE may also be used in
situations where it is unambiguous that it still refers to the default library.
It's recommended that libraries use a single / for separators where desired,
in order to distinguish between the // of the package and / separating
library segments. For example, Geometry//Objects/FourSides uses a single /
to separate the Objects/FourSides library name.
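To make the directive syntax and the PACKAGE//LIBRARY shorthand concrete, here is a minimal Python sketch. It is not a real Carbon parser: `PackageDirective` and `parse_package_directive` are hypothetical names, and it assumes IDENTIFIER is an ASCII identifier and STRING is a double-quoted string without escapes. An empty library string is rejected, in line with the note that this form should not be allowed.

```python
import re
from typing import NamedTuple, Optional

# Assumptions: ASCII identifiers, and quoted library names without escapes.
PACKAGE_DIRECTIVE = re.compile(
    r'package\s+(?P<package>[A-Za-z_][A-Za-z0-9_]*)'
    r'(?:\s+library\s+"(?P<library>[^"]+)")?'
    r'\s+(?P<kind>api|impl)\s*;'
)


class PackageDirective(NamedTuple):
    package: str
    library: Optional[str]  # None means no library argument was given
    kind: str               # "api" or "impl"

    def shorthand(self) -> str:
        """PACKAGE//LIBRARY form, with PACKAGE//default when unnamed."""
        return f"{self.package}//{self.library or 'default'}"


def parse_package_directive(line: str) -> PackageDirective:
    """Parse one package directive line, or raise ValueError."""
    match = PACKAGE_DIRECTIVE.fullmatch(line.strip())
    if match is None:
        raise ValueError(f"not a package directive: {line!r}")
    return PackageDirective(match["package"], match["library"], match["kind"])
```

Parsing `package Geometry library "Objects/FourSides" api;` gives the shorthand `Geometry//Objects/FourSides`, and `package Geometry impl;` gives `Geometry//default`.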
Because the package also declares a namespace entity with the same name, conflicts with the package name are possible. We do not support packages providing entities with the same name as the package.
For example, this is a conflict for DateTime:
package DateTime api;
struct DateTime { ... };
This declaration is important for implementation files, which implicitly import the library's API, because it keeps the package name as an explicit entity in source files.
Note that imported name conflicts are handled differently.
Every Carbon library consists of one or more files. Each Carbon library has a primary file that defines its API, and may optionally contain additional implementation files.
An API file's package directive will have api. For example,
package Geometry library "Shapes" api;
API files must have a .carbon extension. They must not have a
.impl.carbon extension.
An implementation file's package directive will have impl. For example,
package Geometry library "Shapes" impl;.
Implementation files must have a .impl.carbon extension.
api imports.
The difference between API and implementation will act as a form of access control. API files must compile independently of implementation, importing only the APIs of other libraries. API files are also visible to all files and libraries for import. Implementation files only see API files for import, not other implementation files.
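The pairing of file extension and directive keyword lends itself to a simple consistency check. The sketch below is illustrative only: `expected_kind` and `check_file` are hypothetical helpers, and the keyword is assumed to have been extracted from the file's package directive already.

```python
def expected_kind(filename: str) -> str:
    """Map a filename to the package directive keyword it should use."""
    # Check the longer suffix first, since .impl.carbon also ends in .carbon.
    if filename.endswith(".impl.carbon"):
        return "impl"
    if filename.endswith(".carbon"):
        return "api"
    raise ValueError(f"not a Carbon source file: {filename!r}")


def check_file(filename: str, directive_kind: str) -> None:
    """Raise ValueError if the extension and the api/impl keyword disagree."""
    kind = expected_kind(filename)
    if kind != directive_kind:
        raise ValueError(
            f"{filename!r} should use {kind!r}, but its directive says {directive_kind!r}"
        )
```

Because the extension repeats the api/impl choice, a tool can classify a file without opening it, then use a check like this to catch mismatches.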
When any file imports a library's API, it should be expected that the transitive closure of imported files from the primary API file will be a compilation dependency. The size of that transitive closure affects compilation time, so libraries with complex implementations should endeavor to minimize their API imports.
Libraries also serve as a critical unit of compilation. Dependencies between libraries must be clearly marked, and the resulting dependency graph will allow for separate compilation.
Entities in the API file are part of the library's public API by default. They
may be marked as private to indicate they should only be visible to other
parts of the library.
package Geometry library "Shapes" api;
// Circle is an API, and will be available to other libraries as
// Geometry.Circle.
struct Circle { ... }
// CircleHelper is private, and so will not be available to other libraries.
private fn CircleHelper(circle: Circle) { ... }
// Only entities in namespaces should be marked as an API, not the namespace
// itself.
namespace Operations;
// Operations.GetCircumference is an API, and will be available to
// other libraries as Geometry.Operations.GetCircumference.
fn Operations.GetCircumference(circle: Circle) { ... }
This means that an API file can contain all implementation code for a library. However, separate implementation files are still desirable for a few reasons:
Entities in the impl file should never have visibility keywords. If they are
forward declared in the api file, they use the declaration's visibility; if
they are only present in the impl file, they are implicitly private.
The compilation graph of Carbon will generally consist of api files depending
on each other, and impl files depending only on api files. Compiling a given
file requires compiling the transitive closure of api files first.
Parallelization of compilation is then limited by how large that transitive
closure is, in terms of total volume of code rather than quantity. This also
affects build cache invalidation.
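The transitive closure mentioned here can be sketched with a plain graph walk. The library names and the `api_imports` mapping below are hypothetical; a real build system would derive them from the import directives in each api file.

```python
def api_closure(library: str, api_imports: dict[str, set[str]]) -> set[str]:
    """Return every library whose api file must be compiled before `library`."""
    seen: set[str] = set()
    stack = list(api_imports.get(library, ()))
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(api_imports.get(dep, ()))
    return seen


# Hypothetical dependency graph, keyed by PACKAGE//LIBRARY shorthand.
example_imports = {
    "Geometry//Area": {"Geometry//Shapes"},
    "Geometry//Shapes": {"Math//default"},
    "Math//default": set(),
}
```

With this graph, compiling `Geometry//Area` requires both `Geometry//Shapes` and `Math//default` first, so a change to the Math api invalidates all three in a build cache.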
In order to maximize opportunities to improve compilation performance, we will
encourage granular libraries. Conceptually, we want libraries to be very small,
possibly containing only a single class. The choice of only allowing a single
api file per library should help encourage developers to write small
libraries.
Any entity may be marked with api except for namespace and package entities.
That is, api namespace Sha256; is invalid code. Instead, namespaces are
implicitly exported based on the name paths of other entities marked as api.
For example, given this code:
package Checksums library "Sha" api;
namespace Sha256;
fn Sha256.HexDigest(data: Bytes) -> String { ... }
Calling code may look like:
package Caller api;
import Checksums library "Sha";
fn Process(data: Bytes) {
...
var digest: String = Checksums.Sha256.HexDigest(data);
...
}
In this example, the Sha256 namespace is exported as part of the API
implicitly.
import directives support reusing code from other files and libraries. The
import directive's syntax may be loosely expressed as a regular expression:
import IDENTIFIER (library STRING)?;
An import declares a package entity named after the imported package, and makes API entities from the imported library available through it. The full name path is a concatenation of the names of the package entity, any namespace entities applied, and the final entity addressed. Child namespaces or entities may be aliased if desired.
For example, given a library:
package Math api;
namespace Trigonometry;
fn Trigonometry.Sin(...);
Calling code would import it and use it like:
package Geometry api;
import Math;
fn DoSomething() {
...
Math.Trigonometry.Sin(...);
...
}
Repeated imports from the same package reuse the same package entity. For example,
this produces only one Math package entity:
import Math;
import Math library "Trigonometry";
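A small Python sketch can show how several import directives collapse onto one package entity. It assumes library names are written as quoted strings, as in the examples, and that identifiers are ASCII; `package_entities` is a hypothetical helper, not part of any Carbon tooling.

```python
import re
from collections import defaultdict

# Assumptions: ASCII identifiers, and quoted library names without escapes.
IMPORT_DIRECTIVE = re.compile(
    r'import\s+(?P<package>[A-Za-z_][A-Za-z0-9_]*)'
    r'(?:\s+library\s+"(?P<library>[^"]+)")?\s*;'
)


def package_entities(lines: list[str]) -> dict[str, list[str]]:
    """Group imported libraries under one entry per imported package."""
    entities: dict[str, list[str]] = defaultdict(list)
    for line in lines:
        match = IMPORT_DIRECTIVE.fullmatch(line.strip())
        if match is None:
            raise ValueError(f"not an import directive: {line!r}")
        # An import without a library argument refers to the default library.
        entities[match["package"]].append(match["library"] or "default")
    return dict(entities)
```

Feeding it the two Math imports above produces a single `Math` key listing both the default library and `Trigonometry`.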
NOTE: A library must never import itself. Any impl files in a library
automatically import the api, so a self-import should never be required.
Entities defined in the current file may be used without mentioning the package prefix. However, other symbols from the package must be imported and accessed through the package namespace just like symbols from any other package.
For example:
package Geometry api;
// This is required even though it's still in the Geometry package.
import Geometry library "Shapes";
// Circle must be referenced using the Geometry namespace of the import.
fn GetArea(c: Geometry.Circle) { ... }
Namespaces offer named paths for entities. Namespaces may be nested. Multiple
libraries may contribute to the same namespace. In practice, packages may have
namespaces such as Testing containing entities that benefit from an isolated
space but are present in many libraries.
The namespace keyword's syntax may loosely be expressed as a regular
expression:
namespace NAME_PATH;
The namespace keyword declares a namespace entity. The namespace is applied to
other entities by including it as a prefix when declaring a name. For example:
package Time api;
namespace Timezones.Internal;
struct Timezones.Internal.RawData { ... }
fn ParseData(data: Timezones.Internal.RawData);
A namespace declaration adds the first identifier in the name path as a name in
the file's namespace. In the above example, after declaring
namespace Timezones.Internal;, Timezones is available as an identifier and
Internal is reached through Timezones.
Namespaces may exist on imported package entities, in addition to being declared in the current file. However, even if the namespace already exists in an imported library from the current package, the namespace must still be declared locally in order to add symbols to it.
For example, if the Geometry//Shapes/ThreeSides library provides the
Geometry.Shapes namespace, this code is still valid:
package Geometry library "Shapes/FourSides" api;
import Geometry library "Shapes/ThreeSides";
// This does not conflict with the existence of `Geometry.Shapes` from
// `Geometry//Shapes/ThreeSides`, even though the name path is identical.
namespace Shapes;
// This requires the above 'namespace Shapes' declaration. It cannot use
// `Geometry.Shapes` from `Geometry//Shapes/ThreeSides`.
struct Shapes.Square { ... };
Carbon's alias keyword will support aliasing namespaces. For example, this would be valid code:
namespace Timezones.Internal;
alias TI = Timezones.Internal;
struct TI.RawData { ... }
fn ParseData(data: TI.RawData);
Library name conflicts should not occur, because it's expected that a given package is maintained by a single organization. It's the responsibility of that organization to maintain unique library names within their package.
A package name conflict occurs when two different packages use the same name,
such as two packages named Stats. Package name conflicts are more likely than
library name conflicts because two organizations may independently choose identical names.
We will encourage a unique package naming scheme, such as maintaining a name
server for open source packages. Conflicts can also be addressed by renaming one
of the packages, either at the source, or as a local modification.
We do need to address the case of package names conflicting with other entity names. It's possible that a preexisting entity will conflict with a new import, and that renaming the entity is infeasible due to existing callers. Alternatively, the entity may be using an idiomatic name, and renaming it would contradict naming conventions. In either case, this conflict may exist in a single file without otherwise affecting users of the API. This will be addressed by name lookup.
These are potential refactorings that we consider important to make easy to automate.
Imports will frequently need to be updated as part of refactorings.
When code is deleted, it should be possible to parse the remaining code, parse the imports, and determine which entities in imports are referred to. Unused imports can then be removed.
When code is moved, it's similar to deletion in the originating file. For the destination file, the moved code should be parsed to determine which entities it referred to from the originating file's imports, and these will need to be included in the destination file: either reused if already present, or added.
When new code is added, existing imports can be checked to see if they provide the symbol in question. There may also be heuristics which can be implemented to check build dependencies for where imports should be added from, such as a database of possible entities and their libraries. However, adding references may require manually adding imports.
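The deletion case can be sketched as a set comparison, assuming a tool already knows which name paths each imported library exports and which name paths the remaining code refers to. Both inputs, and the `unused_imports` helper itself, are hypothetical here.

```python
def unused_imports(exports: dict[str, set[str]], used: set[str]) -> set[str]:
    """Return imported libraries none of whose exported names are still used.

    `exports` maps an imported library (PACKAGE//LIBRARY shorthand) to the
    name paths it provides; `used` holds name paths found in the file body.
    """
    return {
        library
        for library, provided in exports.items()
        if not (provided & used)
    }


# Hypothetical file state after some code has been deleted.
example_exports = {
    "Geometry//Shapes": {"Geometry.Circle", "Geometry.Square"},
    "Math//default": {"Math.Trigonometry.Sin"},
}
example_used = {"Geometry.Circle"}
```

Here only `Math//default` is reported as removable. The same comparison, run on code moved to a destination file, indicates which of the originating file's imports need to be carried over.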
api and impl files
Move an implementation of an API from an api file to an impl file, while
leaving a declaration behind.
Split an api and impl file.
Move an implementation of an API from an impl file to an api file.
Combine an api and impl file.
Remove the api label from a declaration.
Add the api label to a declaration.
Move a non-api-labeled declaration from an api file to an impl file.
impl file that now contains
it. Search for other callers within the library, and fix them first.
Move a non-api-labeled declaration from an impl file to an api file.
Move a declaration and implementation from one impl file to another.
impl file, and either move
them too, or fix them first.
Rename a package.
Move an api-labeled declaration and implementation between different
packages.
Move an api-labeled declaration and implementation between libraries in
the same package.
Rename a library.
api-labeled
declaration and implementation between libraries in the same package.
Move a declaration and implementation from one namespace to another.
Rename a namespace.
Rename a file, or move a file between directories.
We expect that most code should use a package and library, but avoid specifying namespaces beneath the package. The package name itself should typically be sufficient distinction for names.
Child namespaces create longer names, which engineers will dislike typing. Based on experience, we expect to start seeing aliasing even at name lengths around six characters long. With longer names, we should expect more aliasing, which in turn will reduce code readability because more types will have local names.
We believe it's feasible for even large projects to collapse namespaces down to a top level, avoiding internal tiers of namespaces.
We understand that child namespaces are sometimes helpful, and will robustly support them for that. However, we will model code organization to encourage fewer namespaces.
We use a few possibly redundant markers for packages and libraries:
The package keyword requires one of api and impl, rather than
excluding either or both.
api versus impl choice.
The import keyword requires the full library.
These choices are made to assist human readability and tooling:
api versus impl makes it easier for both humans and
tooling to determine what to expect.
These open questions are expected to be revisited by future proposals.
Currently, we're using .carbon and .impl.carbon. In the future, we may want
to change the extension, particularly because Carbon may be renamed.
There are several other possible extensions / commands that we've considered in coming to the current extension:
.carbon: This is an obvious and unsurprising choice, but also quite long
for a file extension.
.6c: This sounds a little like 'sexy' when read aloud.
.c6: This seems a weird incorrect ordering of the atomic number and has a
bad, if obscure, Internet slang association.
.cb or .cbn: These collide with several acronyms and may not be
especially memorable as referring to Carbon.
.crb: This has a bad Internet slang association.
Currently, we do not support cross-language imports. In the future, we will likely want to support imports from other languages, particularly for C++ interoperability.
To fit into the proposed import syntax, we are provisionally using a special
Cpp package to import headers from C++ code, as in:
import Cpp library "<map>";
import Cpp library "myproject/myclass.h";
fn MyCarbonCall(x: Cpp.std.map(Cpp.MyProject.MyClass));
Currently, we don't support any kind of package management with imports. In the future, we may want to support tagging imports with a URL that identifies the repository where that package can be found. This can be used to help drive package management tooling and to support providing a non-name identity for a package that is used to enable handling conflicted package names.
Although we're not designing this right now, it could fit into the proposed syntax. For example:
import Carbon library "Utilities"
url("https://github.com/carbon-language/carbon-libraries");
Similar to api and impl, we may eventually want a type like test. This
should be part of a larger testing plan.
api to private
impl to public