carbon compilation command set to use
compile, link, and build.
Core, but support it in a general
manner.
The current command line is still a prototype, and lacks support for regular use. For example:
carbon compile produces one object file per input file. When
--output-file is specified and there are multiple inputs, the output is
repeatedly overwritten.
carbon compile doesn't provide a trivial way to produce object files for
the prelude. The carbon_binary rule is, behind the scenes, separately
compiling all the prelude files individually and doing its own custom
linking with those.
carbon compile and carbon link must be used in combination.
Essentially, we have a decent setup for testing, but not one that's easy to use in real-world situations.
In C++, clang++ main.cpp -o program is a simple way to produce program. This
proposal aims for a similar experience, making it easy to build and test small
programs.
Key commands related to this proposal are carbon compile, carbon clang, and
carbon link. The end result will likely compose multiple command elements in
order to build the output.
Note the goal here is to align on look-and-feel of separate compilation.
Although the carbon CLI is important to the language, most details aren't
necessary to address through the proposal process. For example, we want to get
flag names right here, but also we wouldn't expect a proposal for flag name
changes.
This is a proposal for the command line. Bazel rules are mentioned because they help illustrate interactions with build systems. However, this proposal is not intended to decide Bazel design, and the existing Bazel rules have not been through the proposal process.
Restructure compilation into:
carbon compile: Take a single input to build, and produce a single output
.o.
carbon build: Take multiple inputs in order to produce a linked binary.
carbon compile and carbon link.
These are intended to accept flexible inputs:
carbon build in particular, it should not be necessary to pass in
Core files that are required.
Core and
directory structure. For example, prelude/types maps to
core/prelude/types.carbon.
At the end, it should be possible to:
carbon build program.carbon with non-prelude Core imports, and get
an executable program.
Have Bazel rules that mix C++ code and Carbon code. For example:
carbon_library(
    name = "foo",
    srcs = ["foo.cpp", "foo.impl.carbon"],
    apis = ["foo.carbon"],
)
carbon_binary(
    name = "bar",
    srcs = ["main.cpp"],
    deps = [":foo"],
)
The carbon compile command is intended to be a straightforward single input,
single output command. Dependencies will be provided through a combination of:
<filename>.carbon will default to
<filename>.o (including .impl.carbon becoming .impl.o).
As part of supporting a mix of C++ and Carbon files, we will support
carbon compile foo.cpp with results similar to carbon clang -- -c foo.cpp.
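The default output naming described above can be illustrated with a small sketch. The function name default_object_name is hypothetical, and placing the object file next to its source is an assumption; the proposal only states the suffix change. Defaults for .cpp inputs are not specified, so the sketch rejects them.

```python
from pathlib import PurePosixPath


def default_object_name(source: str) -> str:
    """Return the assumed default output name for a .carbon compile input."""
    path = PurePosixPath(source)
    if path.suffix != ".carbon":
        raise ValueError(f"expected a .carbon input, got {source!r}")
    # with_suffix replaces only the final suffix, so "foo.impl.carbon"
    # keeps its ".impl" component and becomes "foo.impl.o".
    return str(path.with_suffix(".o"))
```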
The carbon build command will be the new, simple way to compile, as a
replacement for carbon compile. It will:
Core, add all .carbon
files as inputs.
Core, we expect .o files to be produced in the same way as for
carbon link.
carbon compile
invocations.
carbon compile invocations.
carbon link over produced inputs.
While the build command will default to providing an executable program, we may
also want it to be capable of producing .a and .so files. However, we can
decide whether carbon build should be required for these kinds of outputs as
an implementation detail.
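As a rough sketch of how carbon build might compose the other commands, the following plans one carbon compile per input and a single carbon link over the results. The build_plan helper is hypothetical, the -o flag spelling is borrowed from the invocations shown in this proposal, and Core object files are left out on the assumption that carbon link supplies them. The commands are only constructed, not executed.

```python
from pathlib import PurePosixPath


def build_plan(inputs, output="program"):
    """Return the argv lists a build could run: one compile per input, then a link."""
    commands = []
    objects = []
    for source in inputs:
        # Swap only the final suffix for ".o" (main.carbon -> main.o).
        obj = str(PurePosixPath(source).with_suffix(".o"))
        commands.append(["carbon", "compile", source, "-o", obj])
        objects.append(obj)
    # A single link step consumes every object produced above.
    commands.append(["carbon", "link", *objects, "-o", output])
    return commands
```

Because each compile step depends only on its own input, a real implementation could run them in parallel before the link step.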
The carbon link command will change to make the following work:
carbon compile foo.carbon -o foo.o
carbon link foo.o -o program
It will be typical to link multiple object files into a single output file. The
output file flag will be optional, defaulting to program, possibly with a
target-specific extension; for example, program.exe for Windows.
This requires that Core files (not just the prelude) will have been compiled,
so that their object files can be included in output. It's expected that this
will be provided through on-demand runtimes. It should be possible to opt out of
including these, for example so that the Bazel carbon_binary rule can use
carbon link while also providing its own Core object files. However, it
should be on-by-default.
When we need a file for a packaging directive:
package Core ... could correspond to lib/carbon/core/.....carbon. For example, package Core library "prelude/types"; could
correspond to lib/carbon/core/prelude/types.carbon.
default.carbon. For example,
package Core; could correspond to lib/carbon/core/default.carbon.
Suppose we have some command line carbon compile a.carbon, and in a.carbon,
it does import Core library "map";. This needs to load core/map.carbon, but
without parsing every file matching core/**/*.carbon.
In order to achieve this:
compile command will have a built-in directory mapping for the Core
package, for example to /usr/share/carbon/core (when installed to the
/usr prefix).
map library name will need to match the filename, so
/usr/share/carbon/core/map.carbon.
map.carbon has other Core imports, they will be recursively loaded
once parsed.
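The lookup described above can be sketched as follows. This is only an illustration: CORE_ROOT reuses the example install path, the regular expression is a toy stand-in for the real parser, and the names core_api_path and load_core_imports are hypothetical. The read_file callback is an assumption that keeps the sketch independent of the real filesystem.

```python
import re
from pathlib import PurePosixPath

# Assumed install location, taken from the example path above.
CORE_ROOT = PurePosixPath("/usr/share/carbon/core")

# Toy import scanner: matches `import Core;` and `import Core library "x";`.
_IMPORT_RE = re.compile(r'import\s+Core(?:\s+library\s+"([^"]+)")?\s*;')


def core_api_path(library=None):
    """Map a Core library name (None for the default library) to its API file."""
    name = "default" if library is None else library
    return CORE_ROOT / f"{name}.carbon"


def load_core_imports(source, read_file):
    """Return every Core API file reachable from `source`, in load order."""
    loaded = []
    seen = set()
    pending = [m.group(1) for m in _IMPORT_RE.finditer(source)]
    while pending:
        path = core_api_path(pending.pop(0))
        if path in seen:
            continue
        seen.add(path)
        loaded.append(str(path))
        # Only files that are actually imported get read and scanned,
        # so nothing else under the Core directory is touched.
        text = read_file(str(path))
        pending.extend(m.group(1) for m in _IMPORT_RE.finditer(text))
    return loaded
```

The point of the design is visible in the loop: the name-to-path mapping lets the compiler open exactly the files named by imports instead of scanning the whole directory.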
We never need to map impl files by library name to a filename, or the other
way around; they cannot be discovered through an import, and we always need to
parse them in order to discover their imports. As a consequence, there is no
need to define rules mapping libraries to .impl.carbon files.
Because we'll build this for Core, it would probably be straightforward to
expose this for other packages, too. So for example, we could support
--package-path=MyPackage:/my/package for getting API files. However, that is
secondary to the Core behavior, so any support may become more of an
implementation detail for what makes sense.
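If a flag along the lines of --package-path=MyPackage:/my/package were supported, its value could be split as in this sketch. The helper name parse_package_path is hypothetical, and the choice to split on the first colon is an assumption that keeps later colons (for example, in a drive-letter path) inside the directory part.

```python
def parse_package_path(flag_value: str):
    """Split a NAME:DIRECTORY flag value into (package, directory)."""
    # partition splits on the first ":" only.
    package, sep, directory = flag_value.partition(":")
    if not sep or not package or not directory:
        raise ValueError(f"expected PACKAGE:DIRECTORY, got {flag_value!r}")
    return package, directory
```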
For imports which rely on the implicit mapping (not in general), we will
disallow ambiguous library names. This includes an explicit library "default"
string name, which can be ambiguous with the implicit default library (both
would map to default.carbon).
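A minimal sketch of that check, assuming the implicit default library is represented as None and that the check applies only when the implicit mapping is used. The function name implicit_api_filename is hypothetical.

```python
def implicit_api_filename(library=None):
    """Return the file name the implicit mapping would use for a library.

    `library` is None for the implicit default library.
    """
    if library == "default":
        # An explicit "default" name would collide with the file used for
        # the implicit default library, so it is rejected here.
        raise ValueError('library "default" is ambiguous with the default library')
    name = "default" if library is None else library
    return f"{name}.carbon"
```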
The Bazel build rules will expose carbon compile and carbon link behaviors
in a slightly more Bazel-idiomatic way. For example, given:
carbon_library(
    name = "lib",
    srcs = ["a.impl.carbon", "b.impl.carbon", "b.carbon"],
    apis = ["a.carbon"],
)
carbon_binary(
    name = "bin",
    srcs = ["main.carbon"],
    deps = [":lib"],
)
The way this will approximately work is:
carbon_library will have an implicit dependency on a set of Core
libraries (such as a build target //carbon/lang:core).
carbon_library rules, some of which may
look like lib.
lib:
carbon compile four times, producing a .o file for each
input.
impl file compilations.
bin:
lib.
deps means a.carbon and b.carbon will be additional
inputs, but it should ideally be an error if b.carbon is imported
directly. This is required because a.carbon can expose b.carbon
on the import boundary, meaning an indirect import of b.carbon
must work.
It's possible that we may use carbon build where carbon compile is
mentioned, but if so, it should not make a significant difference in the
user-visible behavior.
For both, there should be an implicit dependency on the full Core package, not just the prelude. This is because we want the Core package to be easy to access.
The apis attribute is suggested to support only direct dependencies. For
example:
carbon_library(
    name = "a",
    apis = ["a.carbon"],
)
carbon_library(
    name = "b",
    apis = ["b.carbon"],
    deps = [":a"],
)
carbon_library(
    name = "c",
    srcs = ["c.carbon"],
    deps = [":b"],
)
If c.carbon imports a.carbon, the build should error that a.carbon
requires a direct dependency. We should allow forwarding, so that the same could
compile without requiring c to have a direct dependency on a. This should
look like exports = [":a"], added to b (and superseding the need to list
:a in deps).
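The direct-dependency rule and the exports forwarding can be modeled with a short sketch. The dictionary layout and the visible_apis helper are hypothetical, and treating exports as chaining transitively through re-exporting targets is an assumption the proposal does not spell out.

```python
def visible_apis(target, targets):
    """Return the API files `target` may import directly.

    `targets` maps a target name to a dict with optional "apis", "deps",
    and "exports" lists. Visible files are the target's own apis, the apis
    of its deps and exports, and anything those targets export onward.
    """
    rule = targets[target]
    visible = set(rule.get("apis", []))
    pending = list(rule.get("deps", [])) + list(rule.get("exports", []))
    seen = set()
    while pending:
        name = pending.pop()
        if name in seen:
            continue
        seen.add(name)
        visible.update(targets[name].get("apis", []))
        # Follow exports only: plain deps of a dependency stay hidden.
        pending.extend(targets[name].get("exports", []))
    return visible
```

With the three rules above, c sees only b.carbon, so an import of a.carbon from c.carbon would be diagnosed. Replacing deps with exports on b makes a.carbon visible to c as well.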
This feature may see frequent use, for example in Core to allow writing it as
multiple libraries instead of one large glob. But it's probably also something
that can be delayed a little, because we can just use a big glob and force
direct dependencies.
In the core/ directory, we will set up corresponding carbon_library rules.
These will need to pass flags to opt out of normal behaviors, in particular the
dependency on the prelude library.
As designed, every time any of the build, compile, or link commands are
used, all prelude files and possibly more of the Core package will be
re-checked, along with C++ ASTs being reproduced.
Instead, Carbon could serialize checked IR, store produced C++ ASTs, and so on. C++ ASTs in particular could be substantially constructed based on parsed Carbon state, rather than checked Carbon state, allowing more build parallelism. In distributed or cached build systems, being able to reuse portions of the build may increase performance.
The specific build outputs we want to store may substantially affect how we would set up a build process. The absence of a decision may lead to the implementation diverging from what's actually needed, meaning parts will be reimplemented later. This isn't expected to be too high cost.
There are also ways to improve build performance without taking these steps. Clang modules might be used for improving Clang compile performance without significant support from Carbon.
For now we will rely on whatever caching Bazel does for the .a output of a
carbon_library. No other outputs will be made available. That may change, but
leads want to spend our limited development and review time on other features
for the 0.1 milestone.
carbon build should support easy experimentation with Carbon, and also
small projects.
clang can typically be replaced
with carbon clang, linking a binary becomes carbon link, and so on.
carbon_library and carbon_binary are important to us for
Bazel support and a migration from cc_library and cc_binary.
For carbon compile and carbon build, this is trying to split apart concepts.
Some considered alternatives are:
compile, and possibly also link, into build. Flags could be used
to differentiate between the versions desired, rather than subcommand names.
carbon build produce a.out
a.out is the default output of most C++ compilers, but it reflects a
legacy executable file format. Using the legacy name may reflect
backwards compatibility that Carbon doesn't plan.
The build command as proposed here is intended to be sufficient for quick
testing and simple tools. However, it's not intended to be flexible with custom
rules, plugins, and so on. These are features offered by systems such as CMake
or Bazel.
Instead, we could provide a full build system. Multiple other languages have gone in that direction:
cargo combines a
build system
and package manager.
cargo.
Carbon's project goal is migration of existing C++ developers, particularly "This means integrating into the existing C++ ecosystem by supporting incremental migration from C++ to Carbon."
The expectation is that C++ users will already be using a fully featured build system, such as CMake. Migration should be easier if users can retain their existing build system, particularly since a typical migration can be expected to mix both Carbon and C++ code.
While Carbon could provide both a separate compilation system and a fully featured build system, a build system is a substantial undertaking and we expect C++ developers to already have one.
Instead of making a mapping from packaging directives to filenames, we could
generate a list specific to the Core package, and not expose that for other
packages.
We shouldn't manually maintain a mapping for the Core package; it should be
automated. It's likely that whatever we do in this space, however we would
support a mapping, would be of interest to small projects. It will probably be
low cost for us to build support for things other than Core, so we should just
do that.
Instead of building object files for Core on demand, we could distribute them
as part of Carbon. The upside of this is it would make builds a little faster;
the downside is that we'd end up in more of a situation where supported target
platforms were enumerated, or perhaps where special platforms could be built
on-demand in a bespoke manner.
We can probably add limited caching where it'd help, and support all platforms using similar logic that way with little performance penalty.
The current package and library directive design means a given api file
may have 0 or more impl files.
We could make it clear from the declaration in an api file what impl files
exist. This would require a split to describe the possible situations. For
example:
library "foo";: The common case of 1 impl file.
library "foo" api_only;: Add a single keyword that indicates this is a
library with no impl file.
library "foo" multi_impl 3;: Indicates this is an unusual library with 3
impl files.
a.impl.carbon,
a.1.impl.carbon, a.2.impl.carbon), but even knowing how many exist
would allow compiles to do validation. If we didn't do this, then it may
be equivalent to not require specifying the number of impl files (in
the example, multi_impl; instead of multi_impl 3;).
Some advantages are:
api file,
then even if we find an impl file that is missing the definition we
don't know if there's another impl file that contains the definition.
With this feature, we could diagnose while compiling the common 0 or 1
impl file cases.
impl files, which can indicate a
developer mistake in the build.
impl filenames were constrained to be numbered, we could:
impl filenames.
\.\d+.
Some disadvantages are:
impl files.
This has been discussed in the past, but does not seem to be outlined in any proposals as a considered alternative, and this proposal adds new trade-offs for file mappings. Leads have declined this option in order to keep packaging directives simple.