
C++ Interop: API importing and semantics (#6358)

This proposal defines the concrete technical mechanisms for C++
interoperability. It specifies the precise syntax and semantics for
importing C++ APIs. This includes the `import Cpp library "..."`
directive, the implicit import of C++ built-in entities, and the
establishment of the `Cpp` package as the dedicated namespace for all
imported entities.

This PR also includes the high-level C++ interop design and the basics
of importing C++ APIs and calling C++ functions. It leaves plenty of
TODOs to make it easier to fill in more details in follow-ups.

Part of #4666.

---------

Co-authored-by: Richard Smith <richard@metafoo.co.uk>
Co-authored-by: Carbon Infra Bot <carbon-external-infra@google.com>
Boaz Brickner 2 months ago
parent
commit
9ff6b0a682
2 changed files with 324 additions and 1 deletions
  1. docs/design/interoperability/README.md (+184, -1)
  2. proposals/p6358.md (+140, -0)

+ 184 - 1
docs/design/interoperability/README.md

@@ -12,6 +12,26 @@ SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
 
 -   [Philosophy and goals](#philosophy-and-goals)
 -   [Overview](#overview)
+-   [C++ interoperability model: introduction and principles](#c-interoperability-model-introduction-and-principles)
+    -   [The successor language mandate](#the-successor-language-mandate)
+    -   [The C++ interop type](#the-c-interop-type)
+-   [Importing C++ APIs into Carbon](#importing-c-apis-into-carbon)
+    -   [Importing C++ libraries (header-based)](#importing-c-libraries-header-based)
+    -   [TODO: Importing C++ code (inline)](#todo-importing-c-code-inline)
+    -   [Accessing built-in C++ entities (file-less)](#accessing-built-in-c-entities-file-less)
+    -   [The `Cpp` package](#the-cpp-package)
+    -   [TODO: Importing C++ macros](#todo-importing-c-macros)
+-   [Calling C++ code from Carbon](#calling-c-code-from-carbon)
+    -   [Function call syntax and semantics](#function-call-syntax-and-semantics)
+    -   [TODO: Overload resolution](#todo-overload-resolution)
+    -   [TODO: Constructors](#todo-constructors)
+    -   [TODO: Struct literals](#todo-struct-literals)
+-   [TODO: Accessing C++ classes, structs, and members](#todo-accessing-c-classes-structs-and-members)
+-   [TODO: Accessing global variables](#todo-accessing-global-variables)
+-   [TODO: Bi-directional type mapping: primitives and core types](#todo-bi-directional-type-mapping-primitives-and-core-types)
+-   [TODO: Advanced type mapping: pointers, references, and `const`](#todo-advanced-type-mapping-pointers-references-and-const)
+-   [TODO: Bi-directional type mapping: standard library types](#todo-bi-directional-type-mapping-standard-library-types)
+-   [TODO: The operator interoperability model](#todo-the-operator-interoperability-model)
 
 <!-- tocstop -->
 
@@ -29,4 +49,167 @@ more detail.
 
 ## Overview
 
-TODO
+Carbon's bidirectional interoperability with C++ is
+[a cornerstone of its design](/docs/project/goals.md#interoperability-with-and-migration-from-existing-c-code),
+enabling a gradual transition from existing C++ codebases. The goal is not just
+a foreign function interface (FFI), but a seamless, high-fidelity integration
+that supports advanced C++ features, from templates to class hierarchies.
+
+C++ APIs are imported into Carbon using an `import Cpp` directive, which makes
+C++ declarations available within a dedicated `Cpp` package in Carbon. This
+prevents name collisions and makes the origin of symbols explicit. Carbon code
+can then call C++ functions, instantiate C++ classes, and use C++ types while
+respecting C++'s semantics, including its complex overload resolution rules.
+Carbon also preserves the nominal distinctions between C++ types such as `long`
+and `long long`, or `T*` and `T&`, which is critical for correct overload
+resolution and template instantiation.
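To see why these nominal distinctions matter, it helps to look at the C++ side alone. The sketch below uses hypothetical functions `Describe` and `Touch` (they are not part of this change) to show that `long` and `long long`, and `T*` and `T&`, select different overloads even on platforms where the two integer types have the same size.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical overload set: the parameter types differ only nominally on
// platforms where `long` and `long long` are both 64 bits wide.
inline std::string Describe(long) { return "long"; }
inline std::string Describe(long long) { return "long long"; }

// Hypothetical overload set distinguishing a pointer from a reference.
inline std::string Touch(int*) { return "pointer"; }
inline std::string Touch(int&) { return "reference"; }

// The two integer types are always distinct types in C++, whatever their size.
static_assert(!std::is_same<long, long long>::value,
              "long and long long are distinct types");

// Exercises each overload with an argument of exactly the matching type.
inline void CheckOverloadChoices() {
  int value = 0;
  assert(Describe(1L) == "long");
  assert(Describe(1LL) == "long long");
  assert(Touch(&value) == "pointer");
  assert(Touch(value) == "reference");
}
```

An importer that collapsed either pair into a single Carbon type could not reproduce these overload choices, which is presumably why the design keeps them apart.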
+
+Similarly, Carbon APIs can be designed to be callable from C++. The
+interoperability layer is designed to be zero-cost, avoiding unnecessary
+allocations or copies when calling between the two languages.
+
+## C++ interoperability model: introduction and principles
+
+### The successor language mandate
+
+The design of Carbon's C++ interoperability is governed by its foundational
+goal: [to be a successor language](/README.md), not merely a language with a
+foreign function interface (FFI). This mandate dictates a design that moves
+beyond the C-style FFI adopted by most modern languages and instead provides
+seamless, bidirectional interoperability. The objective is to support deep
+integration with existing C++ code, encompassing its most complex features, from
+inheritance to templates.
+
+This goal has profound implications for the Carbon compiler and language
+semantics. It requires that C++ is not treated as a foreign entity. Instead,
+Carbon's semantic model must be _co-designed_ to understand, map, and interact
+with C++'s semantic constructs—including templates, class hierarchies, and
+complex overload resolution—with high fidelity. The interoperability layer must,
+therefore, operate at the semantic analysis level, not just at the linking (ABI)
+level. This document specifies the design of this semantic contract.
+
+### The C++ interop type
+
+A core mechanism in this design is the C++ interop type. This concept defines
+the "trigger" that activates C++-specific semantic rules within the Carbon
+compiler. Any operation involving a type that is designated as a C++ interop
+type can invoke the specialized interoperability logic, such as C++ overload
+resolution or operator overload resolution that involves both Carbon and C++
+operator overloads.
+
+A type is considered a C++ interop type if it is any of the following:
+
+1.  An imported C++ type (for example, `Cpp.Widget`).
+2.  A pointer to a C++ interop type (for example, `Cpp.Widget*`).
+3.  A Carbon generic type parameterized with a C++ interop type (for example,
+    `MyCarbonVector(Cpp.Widget)`).
+
+More generally, a C++ interop type is any type for which Carbon's
+[orphan rule](https://docs.carbon-lang.dev/docs/design/generics/details.html#orphan-rule)
+would allow an impl to be provided by a library in `package Cpp`.
+
+This "pervasive" model of C++-awareness is a fundamental design choice. The C++
+semantics are not confined to a specific `unsafe` or `extern "C++"` block; they
+affect any Carbon type that composes them. For example, when the Carbon compiler
+instantiates a _Carbon_ generic type like `MyCarbonVector(Cpp.Widget)`, its type
+system must be aware that the `Cpp.Widget` parameter carries C++-specific rules.
+This requires Carbon's own generic system, struct layout logic, overload
+resolution, and operator lookup to query the type system for the presence of a
+C++ interop type. If one is present, Carbon must apply the relevant C++ rules
+when operating on it. This design prioritizes the goal of a seamless and
+intuitive user experience.
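The three listed rules are recursive, so one way to see how they compose is to model them as a C++ type trait. This is only a model written for illustration, not how the Carbon compiler implements the check: `Widget`, `MyCarbonVector`, `IsInteropType`, and `AnyInteropType` are hypothetical stand-ins, and the orphan-rule generalization is not covered.

```cpp
#include <type_traits>

// Hypothetical stand-ins: `Widget` plays the role of an imported C++ type
// (`Cpp.Widget`), and `MyCarbonVector` the role of a Carbon generic type.
struct Widget {};
template <typename T>
struct MyCarbonVector {};

// By default a type is not a C++ interop type.
template <typename T>
struct IsInteropType : std::false_type {};

// True if any type in the pack is a C++ interop type.
template <typename... Ts>
struct AnyInteropType : std::false_type {};
template <typename T, typename... Rest>
struct AnyInteropType<T, Rest...>
    : std::integral_constant<bool, IsInteropType<T>::value ||
                                       AnyInteropType<Rest...>::value> {};

// Rule 1: an imported C++ type.
template <>
struct IsInteropType<Widget> : std::true_type {};

// Rule 2: a pointer to a C++ interop type.
template <typename T>
struct IsInteropType<T*> : IsInteropType<T> {};

// Rule 3: a generic type parameterized with a C++ interop type.
template <template <typename...> class Generic, typename... Args>
struct IsInteropType<Generic<Args...>> : AnyInteropType<Args...> {};

// The rules compose: a pointer to a generic over an imported type qualifies.
static_assert(IsInteropType<MyCarbonVector<Widget>*>::value,
              "rules 2 and 3 applied on top of rule 1");
static_assert(!IsInteropType<MyCarbonVector<int>>::value,
              "no imported C++ type is involved");
```

The point of the model is that the property propagates outward through pointers and generic arguments, which matches the "pervasive" behavior described above.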
+
+## Importing C++ APIs into Carbon
+
+### Importing C++ libraries (header-based)
+
+The primary mechanism for importing existing, user-defined C++ code is through
+header file inclusion. Carbon must be able to parse and analyze C++ header files
+to make their declarations available within Carbon.
+
+**Syntax:** The syntax for this operation is `import Cpp library "header_name"`.
+This syntax is used for both standard library headers and user-defined headers:
+
+-   **Standard Library:**
+
+    ```carbon
+    import Cpp library "<cstdio>";
+    ```
+
+    This import makes entities like `putchar` available.
+
+-   **C++ User-Defined Header:**
+    ```carbon
+    import Cpp library "circle.h";
+    ```
+    This import makes user-defined declarations and definitions available.
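For illustration, a user-defined header such as `circle.h` might look like the sketch below. Its contents are assumptions made for this example: the design names a `Circle` class and its constructor but does not show the header, and `radius` and `Area` are hypothetical members.

```cpp
// circle.h (hypothetical contents, written for illustration only)
#ifndef CIRCLE_H_
#define CIRCLE_H_

#include <cassert>

class Circle {
 public:
  // Default constructor: a unit circle.
  Circle() : radius_(1.0) {}

  // Constructs a circle with the given non-negative radius.
  explicit Circle(double radius) : radius_(radius) { assert(radius >= 0.0); }

  double radius() const { return radius_; }

  // Area computed with a literal approximation of pi, so the header needs no
  // math library.
  double Area() const { return 3.141592653589793 * radius_ * radius_; }

 private:
  double radius_;
};

#endif  // CIRCLE_H_
```

After `import Cpp library "circle.h";`, Carbon code would refer to this class through the `Cpp` package rather than by its bare name.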
+
+### TODO: Importing C++ code (inline)
+
+### Accessing built-in C++ entities (file-less)
+
+Some C++ entities, particularly built-in primitive types, are not defined in any
+header file. They are "intrinsic" to the C++ language. These entities are
+available in Carbon without an explicit `import` declaration.
+
+### The `Cpp` package
+
+A critical design choice for managing C++ imports is the mandatory use of a
+containing package, `Cpp`. All imported C++ named entities (functions, types,
+namespaces) are contained in the `Cpp` package.
+
+-   **Functions:** `Cpp.putchar(...)`
+-   **Classes/Types:** `Cpp.Circle`, `Cpp.Point`
+-   **Constructors:** `Cpp.Circle.Circle()`
+
+The `Cpp.` prefix makes the _origin_ of every symbol explicit and unambiguous.
+It ensures that C++ entities cannot collide with Carbon code.
+
+### TODO: Importing C++ macros
+
+## Calling C++ code from Carbon
+
+### Function call syntax and semantics
+
+Once imported, C++ functions are invoked using standard Carbon function call
+syntax, qualified with the `Cpp` package name. The Carbon compiler is
+responsible for mapping the Carbon arguments to the types expected by the C++
+function's signature.
+
+This often requires explicit casting on the Carbon side, using the `as` keyword,
+to satisfy the C++ function's parameter types.
+
+**Example:** The following example imports `cstdio` and calls the C function
+`putchar`. The Carbon `Core.Char` variable `n` must be cast first to `u8` and
+then to `i32` to match the `int` parameter expected by `putchar`.
+
+```carbon
+import Cpp library "<cstdio>";
+
+fn Run() {
+  let hello: array(Core.Char, 6) = ('H', 'e', 'l', 'l', 'o', '!');
+  for (n: Core.Char in hello) {
+    // Carbon 'as' casting is used to match the C++ signature
+    Cpp.putchar((n as u8) as i32);
+  }
+}
+```
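For comparison, the same loop written directly in C++ makes the reason for the cast visible: `putchar` is declared as `int putchar(int)`, so each character is widened to `int` before the call. The helper name `PrintHello` is hypothetical, and the two `static_cast`s mirror `(n as u8) as i32` only approximately, since Carbon's conversion rules are not specified in this section.

```cpp
#include <cassert>
#include <cstdio>

// Writes "Hello!" one character at a time and returns how many characters
// `putchar` reported as written.
inline int PrintHello() {
  const char hello[] = {'H', 'e', 'l', 'l', 'o', '!'};
  int written = 0;
  for (char c : hello) {
    // Go through `unsigned char` first so the value passed is never
    // negative, then widen to `int` to match `int putchar(int)`.
    int result = std::putchar(static_cast<int>(static_cast<unsigned char>(c)));
    if (result == EOF) {
      break;  // Stop at the first write error.
    }
    // On success `putchar` returns the character it wrote, as an `int`.
    assert(result == static_cast<unsigned char>(c));
    ++written;
  }
  return written;
}
```

In C++ the widening from `char` to `int` would also happen implicitly; the explicit casts are written out here only to line up with the Carbon example.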
+
+### TODO: Overload resolution
+
+### TODO: Constructors
+
+### TODO: Struct literals
+
+## TODO: Accessing C++ classes, structs, and members
+
+## TODO: Accessing global variables
+
+## TODO: Bi-directional type mapping: primitives and core types
+
+## TODO: Advanced type mapping: pointers, references, and `const`
+
+## TODO: Bi-directional type mapping: standard library types
+
+## TODO: The operator interoperability model

+ 140 - 0
proposals/p6358.md

@@ -0,0 +1,140 @@
+# C++ Interop: API importing and semantics
+
+<!--
+Part of the Carbon Language project, under the Apache License v2.0 with LLVM
+Exceptions. See /LICENSE for license information.
+SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+-->
+
+[Pull request](https://github.com/carbon-language/carbon-lang/pull/6358)
+
+<!-- toc -->
+
+## Table of contents
+
+-   [Abstract](#abstract)
+-   [Problem](#problem)
+-   [Background](#background)
+-   [Proposal](#proposal)
+-   [Details](#details)
+    -   [The `Cpp` package and namespace mapping](#the-cpp-package-and-namespace-mapping)
+    -   [`import Cpp library` directive](#import-cpp-library-directive)
+    -   [C++ built-in types](#c-built-in-types)
+-   [Rationale](#rationale)
+-   [Alternatives considered](#alternatives-considered)
+
+<!-- tocstop -->
+
+## Abstract
+
+This proposal defines the concrete technical mechanisms for C++
+interoperability. It specifies the precise syntax and semantics for importing
+C++ APIs. This includes the `import Cpp library "..."` directive, the implicit
+import of C++ built-in entities, and the establishment of the `Cpp` package as
+the dedicated namespace for all imported entities.
+
+## Problem
+
+While Carbon has a stated goal of seamless C++ interoperability, and a
+high-level direction has been agreed upon, there is currently no concrete,
+specified mechanism for developers to actually import and use C++ APIs. This
+proposal aims to address that by defining the specific syntax and semantics for
+C++ interoperability.
+
+## Background
+
+One of Carbon's primary goals is to be a successor language. This strategy is
+entirely dependent on seamless, bidirectional interoperability with C++ to
+enable large-scale adoption and migration for existing C++ codebases.
+
+This proposal provides the necessary details on how C++ APIs should be imported.
+
+## Proposal
+
+We propose to formalize the following specific design elements for C++
+interoperability:
+
+1.  **The `Cpp` Package:** All imported C++ entities, whether from built-ins or
+    library headers (see below), will be nested within a dedicated `Cpp`
+    package. This prevents name collisions with Carbon code and makes the
+    language boundary explicit.
+
+    ```carbon
+    fn UseCppTypes() {
+      // Access C++ types and functions by way of the Cpp package
+      var circle: Cpp.Circle = Cpp.GenerateCircle();
+      Cpp.PrintCircle(circle);
+    }
+    ```
+
+2.  **Importing C++ Header-Defined APIs:** To import C++ APIs from a specific
+    library header file (for example, `<vector>` or `"my_library.h"`), Carbon
+    code will use the `import Cpp library "..."` directive.
+
+    ```carbon
+    import Cpp library "<vector>";
+    import Cpp library "circle.h";
+    ```
+
+3.  **Importing C++ Built-in Entities:** Fundamental C++ types (such as `int`
+    and `bool`) need no explicit import; `Cpp.int` and `Cpp.bool` are available
+    directly.
+
+## Details
+
+### The `Cpp` package and namespace mapping
+
+All C++ declarations will be imported into the `Cpp` package. C++ namespaces
+will be mapped to nested packages within `Cpp`. For example, the C++ member
+function `std::string::find` would be accessible in Carbon as
+`Cpp.std.string.find`. The C++ global namespace will be mapped to the `Cpp`
+package itself, so a function `MyGlobalFunction` in the C++ global namespace
+will be `Cpp.MyGlobalFunction` in Carbon.
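As a sketch of this mapping, the hypothetical C++ declarations below are annotated with the Carbon names they would receive if the rule is applied as stated. The `geometry` and `shapes` namespaces and the `Scale` and `UnitSquareArea` functions are invented for this example; only `MyGlobalFunction` is named in the text, and its body here is a placeholder.

```cpp
// Declared in the C++ global namespace.
// Carbon name under the stated mapping: Cpp.MyGlobalFunction
constexpr int MyGlobalFunction() { return 1; }

namespace geometry {

// Carbon name under the stated mapping: Cpp.geometry.Scale
constexpr double Scale(double value, double factor) { return value * factor; }

namespace shapes {

// Carbon name under the stated mapping: Cpp.geometry.shapes.UnitSquareArea
constexpr double UnitSquareArea() { return 1.0; }

}  // namespace shapes
}  // namespace geometry
```

Each `::` in the qualified C++ name becomes a `.` under `Cpp`, so `geometry::shapes::UnitSquareArea` and `Cpp.geometry.shapes.UnitSquareArea` name the same entity.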
+
+### `import Cpp library` directive
+
+The `import Cpp library "..."` directive will instruct the Carbon compiler to
+parse the specified C++ header file. The compiler will use the standard C++
+include paths to locate the header. Additional paths can be provided through
+compiler flags.
+
+The Carbon compiler will leverage a C++ front-end, like Clang, to parse the
+headers. This ensures a high degree of compatibility with existing C++ code.
+Only the declarations from the header will be imported, not the definitions
+(unless they are inline).
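The distinction between declarations and inline definitions can be seen in a small header. The sketch below is hypothetical (`Distance` and `Midpoint` are invented names): the first function is only declared, so a tool parsing the header sees its signature and nothing more, while the second carries its body in the header.

```cpp
// ranges.h (hypothetical header, for illustration only)
#include <cassert>

// Declaration only: the definition would live in a separate .cpp file, so
// parsing this header yields just the signature.
double Distance(double a, double b);

// Inline definition: the body is part of the header, so it is visible to
// whatever parses the header.
inline double Midpoint(double lo, double hi) {
  assert(lo <= hi);
  return lo + (hi - lo) / 2.0;
}
```

Calling `Distance` still requires linking against the object file that defines it, whereas `Midpoint` is usable from the header alone.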
+
+### C++ built-in types
+
+A set of fundamental C++ types will be available within the `Cpp` package
+without any explicit `import` directive. Mapping examples:
+
+| C++ Type       | Carbon Type        |
+| -------------- | ------------------ |
+| `int`          | `Cpp.int`          |
+| `unsigned int` | `Cpp.unsigned_int` |
+| `double`       | `Cpp.double`       |
+| `float`        | `Cpp.float`        |
+| `bool`         | `Cpp.bool`         |
+| `char`         | `Cpp.char`         |
+
+This automatic availability of built-in types is designed to make basic
+interoperability tasks as smooth as possible.
+
+## Rationale
+
+-   [Code that is easy to read, understand, and write](/docs/project/goals.md#code-that-is-easy-to-read-understand-and-write)
+    -   **Explicitness and Clarity:** The `import Cpp library "..."` directives
+        make all dependencies on C++ headers explicit.
+    -   **Preventing Name Collisions:** The `Cpp` package is a critical design
+        element. It provides a clean, unambiguous namespace for all imported C++
+        code.
+-   [Interoperability with and migration from existing C++ code](/docs/project/goals.md#interoperability-with-and-migration-from-existing-c-code)
+    -   This proposal defines a foundation for seamless C++ interoperability.
+
+## Alternatives considered
+
+-   **Alternative: Explicitly importing built-ins:** We considered requiring an
+    explicit directive, such as `import Cpp;`, before C++ built-in types (like
+    `int`) can be used.
+    -   **Reason for Rejection:** Since `Cpp` is a special package, it should be
+        implicitly imported, similar to Carbon's prelude.