C++ code in the Carbon project should use a consistent and well-documented style guide. Where possible, this should be enacted and enforced with tooling to avoid toil both for authors of C++ code in the Carbon project and for code reviewers.
However, we are not in the business of innovating significantly in the space of writing clean and maintainable C++ code, and so we work primarily to reuse existing best practices and guidelines.
The baseline style guidance is the Google C++ style guide.
We provide some local guidance beyond the baseline. These additions are typically motivated either by specific value provided to the project, or by a desire to give simpler and stricter guidance for Carbon's narrow use of C++.
Carbon's C++ code tries to match the proposed Carbon naming convention as closely as is reasonable in C++ in order to better understand and familiarize ourselves with the practice of using this convention. It happens that this is fairly similar to the naming convention in the Google style guide and largely serves to simplify it.
UpperCamelCase, referencing Proper Nouns.
constexpr variables, enumerators, etc.
Virtual member functions also use UpperCamelCase. The distinction between a virtual function and a non-virtual function should be invisible, especially at the call site, as that is an internal implementation detail. We want to be able to freely change that without updating the name.
Member functions may use snake_case names if they do nothing besides return a reference to a data member (or assign a value to a data member, in the case of set_ methods), or if their behavior (including performance) would be unsurprising to a caller who assumes they are implemented that way.
All other names use snake_case, including function parameters, and non-constant local and member variables.
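Private member variables have a trailing _.
For acronyms and initialisms, follow the capitalization style of the surrounding name (Api instead of API).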
The exceptions are LLVM and IR, which we capitalize.
Use snake_case for files, directories, and build system rules. Avoid -s in these as well.
Use .cpp for source files, which is the most common open source extension and matches other places where "C++" is written without punctuation.
These are minor issues where any of the options would be fine and we simply need to pick a consistent option. Where possible, clang-format should be used to enforce these.
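To make the naming rules above concrete, here is a small sketch. Every name in it (Example, ApiClient, MaxRetryCount, ParseState) is hypothetical and chosen only for illustration, and the trailing underscore on private members follows the pattern used in this guide's own class example. Under the file naming rule, a class like this would plausibly live in a file such as api_client.cpp, which is also an assumption.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical names throughout; none of these come from the Carbon codebase.
namespace Example {

// constexpr variables and enumerators use UpperCamelCase.
constexpr int MaxRetryCount = 3;

enum class ParseState { NotStarted, InProgress, Done };

// Acronyms follow the surrounding capitalization: Api, not API.
class ApiClient {
 public:
  explicit ApiClient(std::string endpoint_name)
      : endpoint_name_(std::move(endpoint_name)) {}

  virtual ~ApiClient() = default;

  // An accessor that only returns a reference to a member may use snake_case.
  auto endpoint_name() const -> const std::string& { return endpoint_name_; }

  // A setter that only assigns a member may use a snake_case set_ name.
  auto set_retry_count(int retry_count) -> void { retry_count_ = retry_count; }

  auto retry_count() const -> int { return retry_count_; }

  // Virtual functions are named like any other function, so making this
  // non-virtual later would not require a rename.
  virtual auto CanRetry(int attempts_so_far) const -> bool {
    assert(attempts_so_far >= 0);
    return attempts_so_far < retry_count_;
  }

 private:
  // Member variables use snake_case; private ones end with an underscore.
  std::string endpoint_name_;
  int retry_count_ = MaxRetryCount;
};

}  // namespace Example
```

A caller sees no difference between the virtual CanRetry and a non-virtual function, which is the point of naming them the same way.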
Use trailing return type syntax for functions, including -> void, for consistency with Carbon syntax.
Place the pointer * adjacent to the type: TypeName* variable_name.
Write const before the type when at the outer level: const int N = 42;
Only use line comments (with //, not /* ... */), on a line by themselves, except for argument name comments, closing namespace comments, and similar structural comments. In particular, don't append comments about a line of code to the end of its line:
int bad = 42; // Don't comment here.
// Instead comment here.
int good = 42;
// Closing namespace comments are structural, and both okay and expected.
} // namespace MyNamespace
This dogfoods our planned commenting syntax for Carbon. It also provides a single, consistent placement rule. It also provides more resilience against automated refactorings. Those changes often make code longer, which forces ever more difficult formatting decisions, and can easily spread one line across multiple lines, leaving it impossible to know where to place the comment. Comments on their own line preceding such code, while still imprecise, are at least less confusing over the course of such refactorings.
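The formatting rules above (trailing return types, pointer placement, const placement, and comments on their own line) can be seen together in a short sketch. The function names AppendSuffix, Longer, and CountVowels are hypothetical and exist only to give the rules something to attach to.

```cpp
#include <cassert>
#include <string>

// Trailing return type, even when the function returns nothing.
auto AppendSuffix(std::string& text, const std::string& suffix) -> void {
  text.append(suffix);
}

// Returns whichever string is longer, preferring `first` on a tie.
auto Longer(std::string& first, std::string& second) -> std::string* {
  // The `*` is adjacent to the type, not the variable name.
  std::string* longer = &first;
  if (second.size() > first.size()) {
    longer = &second;
  }
  return longer;
}

auto CountVowels(const std::string& text) -> int {
  // `const` is written before the type at the outer level.
  const std::string vowels = "aeiou";
  int count = 0;
  for (char c : text) {
    if (vowels.find(c) != std::string::npos) {
      ++count;
    }
  }
  // Sanity check: there cannot be more vowels than characters.
  assert(count <= static_cast<int>(text.size()));
  return count;
}
```

Each comment sits on the line before the code it describes, so a reformatting pass that wraps or splits a statement would not leave the comment stranded.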
Use the using-based type alias syntax instead of typedef.
Don't use using to support unqualified lookup on std types; for example,
using std::vector;. This also applies to other short namespaces,
particularly llvm and clang.
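As a rough illustration of the using rules in this part of the guide, the sketch below shows a using-based alias, fully qualified std names, and the one place where a using-declaration for a std name is expected: enabling an ADL call to swap. Token, NameList, SwapValues, and LastName are hypothetical names introduced for this example.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// A `using` alias instead of `typedef std::vector<std::string> NameList;`.
using NameList = std::vector<std::string>;

// Hypothetical type that provides its own swap, found through ADL.
struct Token {
  std::string text;
  int line = 0;
};

auto swap(Token& lhs, Token& rhs) -> void {
  // The members are standard types, so the calls are qualified with std::.
  std::swap(lhs.text, rhs.text);
  std::swap(lhs.line, rhs.line);
}

template <typename T>
auto SwapValues(T& first, T& second) -> void {
  // The intentional ADL pattern: a type-specific swap is picked up when one
  // exists, and std::swap is the fallback.
  using std::swap;
  swap(first, second);
}

auto LastName(const NameList& names) -> const std::string& {
  assert(!names.empty());
  return names.back();
}
```

With Token arguments, SwapValues resolves to the swap overload declared next to Token; with int arguments it falls back to std::swap.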
Writing std:: gives clearer diagnostics and avoids any possible ambiguity, particularly for ADL.
An exception is made for functions like std::swap that are intentionally called using ADL. This pattern should be written as { using std::swap; swap(thing1, thing2); }.
For initialization:
Use assignment syntax (=) when initializing directly with the intended value (or with a braced initializer directly specifying that value).
Use designated initializers ({.a = 1}) when possible for structs, but not for pairs or tuples. Prefer to only include the typename when required to compile (WizType{.a = 1}). This is analogous to how structs and tuples would be written in Carbon code.
Avoid braced initialization for types that define a constructor, except as an initializer list (llvm::SmallVector<int> v = {0, 1};), std::pair, or std::tuple. Never use it with auto (auto a = {0, 1}).
Prefer parenthesized initialization (FooType foo(10);) in most other cases.
Braced initialization without = (BarType bar{10}) should be treated as a fallback, preferred only when other constructor syntax doesn't compile.
Always mark constructors explicit unless there's a specific reason to support implicit or {} initialization.
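The initialization forms above might look like this side by side. This is a sketch under a few assumptions: WizType and FooType reuse the placeholder names from the text but their definitions here are invented, std::vector stands in for llvm::SmallVector so the example has no LLVM dependency, and designated initializers assume a C++20 compiler.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical aggregate, named after the placeholder used in the text.
struct WizType {
  int a = 0;
  int b = 0;
};

// Hypothetical class with a constructor, also named after the text.
class FooType {
 public:
  explicit FooType(int size) : size_(size) {}

  auto size() const -> int { return size_; }

 private:
  int size_;
};

// Because the constructor is explicit, an int does not convert implicitly.
static_assert(!std::is_convertible_v<int, FooType>);

auto SumOfParts() -> int {
  // Assignment syntax when directly giving the intended value.
  int count = 3;

  // Designated initializers for a struct; the type name is not needed here.
  WizType wiz = {.a = 1, .b = 2};

  // Braces for an initializer list and for a pair, without designators.
  std::vector<int> values = {0, 1};
  std::pair<int, int> range = {4, 5};
  assert(values.size() == 2);

  // Parenthesized initialization for an ordinary constructor call.
  FooType foo(10);

  return count + wiz.a + wiz.b + static_cast<int>(values.size()) +
         range.first + range.second + foo.size();
}
```

Writing FooType foo = 10; or FooType foo = {10}; would fail to compile here, which is the effect the explicit rule is after.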
When passing an object's address as an argument, use a reference unless one of the following cases applies:
If it is captured and must outlive the call expression itself, use a pointer and document that it must not be null (unless it is also optional).
When storing an object's address as a non-owned member, prefer storing a pointer. For example:
class Bar {
 public:
  // `foo` must not be null.
  explicit Bar(Foo* foo) : foo_(foo) {}

 private:
  Foo* foo_;
};
Always use braces for conditional, switch, and loop statements, even when the body is a single statement.
Within a switch statement, use braces after a case label when necessary to create a scope for a variable.
For internal linkage of definitions of functions and variables, prefer static over anonymous namespaces. Using static minimizes the context necessary to notice the internal linkage of a definition.
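A brief sketch of the brace and internal linkage rules above, using a hypothetical Shape enum and two hypothetical helper functions:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical enum used only for this example.
enum class Shape { Circle, Square, Unknown };

// `static` makes the internal linkage visible right at the definition.
static auto Describe(Shape shape) -> std::string {
  switch (shape) {
    case Shape::Circle: {
      // Braces after the label scope this variable to the case.
      std::string name = "circle";
      return name;
    }
    case Shape::Square: {
      std::string name = "square";
      return name;
    }
    case Shape::Unknown:
      break;
  }
  return "unknown";
}

static auto CountKnown(const std::vector<Shape>& shapes) -> int {
  int count = 0;
  for (Shape shape : shapes) {
    // Braces are used even though the body is a single statement.
    if (shape != Shape::Unknown) {
      ++count;
    }
  }
  assert(count <= static_cast<int>(shapes.size()));
  return count;
}
```

Without the braces after each case label, the two declarations of name would share one scope and fail to compile.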
For Access Control, specifically for test fixtures in .cpp files, we use public instead of protected. This is motivated by the misc-non-private-member-variables-in-classes tidy check.
We generally use auto for most local variables when a type can be inferred,
except for primitive types such as bool and int. It is not required to use
auto, and shorter type names such as SemIR::InstId are sometimes named even
though they could be inferred. Naming the type can be helpful in cases where the
type would be obscure and cannot be explained with the variable name. Function
parameters generally name the type of each parameter, though lambdas may use
auto if it's helpful.
When naming variables, we typically suffix _id for ID types. When needed, we
can also resolve ambiguity by referring to the full type name in the variable
name; for example, if there's a ClassId, InstId, and TypeId for the same
class entity, we might call these class_id, class_inst_id, and
class_type_id. Similarly, we might call an Inst class_inst.
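The auto and _id naming guidance above could look roughly like the following. The ClassId, InstId, and TypeId structs here are hypothetical stand-ins that only mimic the naming pattern; they are not the real SemIR types, and ClassInfo, MakeClassInfo, and SumIndices are likewise invented for the example. Designated initializers again assume C++20.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-ins for ID types; not the real SemIR definitions.
struct ClassId {
  int index;
};
struct InstId {
  int index;
};
struct TypeId {
  int index;
};

// Three IDs for the same class entity, disambiguated by the full type name.
struct ClassInfo {
  ClassId class_id;
  InstId class_inst_id;
  TypeId class_type_id;
};

static auto MakeClassInfo(int index) -> ClassInfo {
  assert(index >= 0);
  ClassInfo info = {.class_id = {index},
                    .class_inst_id = {index + 100},
                    .class_type_id = {index + 200}};
  return info;
}

static auto SumIndices(const std::map<std::string, ClassInfo>& classes)
    -> int {
  // A lambda parameter may use `auto` when that is helpful.
  auto index_of = [](const auto& id) -> int { return id.index; };

  // Primitive types are spelled out rather than deduced.
  int total = 0;
  for (const auto& [name, info] : classes) {
    // A short ID type may be named even though `auto` would work.
    InstId class_inst_id = info.class_inst_id;
    // `auto` is fine here because the variable name conveys the type.
    auto class_type_id = info.class_type_id;
    total += index_of(info.class_id) + index_of(class_inst_id) +
             index_of(class_type_id);
  }
  return total;
}
```

The _id suffix carries most of the type information, which is what makes auto readable for class_type_id.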
constexpr.
.clang-format contents
See this repository's .clang-format file.