
Language-level safety strategy (#196)

-   Based on [#130](https://github.com/carbon-language/carbon-lang/pull/130) from chandlerc
-   [RFC](https://forums.carbon-lang.dev/t/rfc-language-level-safety-strategy-196/182)
-   [Decision](https://forums.carbon-lang.dev/t/request-for-decision-language-level-safety-strategy/196)
-   [Approval announcement](https://forums.carbon-lang.dev/t/accepted-language-level-safety-strategy/201)
Jon Meow, 5 years ago
parent
commit
1edfb1786e
4 files changed, with 840 additions and 3 deletions
  1. 3 3
      docs/project/goals.md
  2. 770 0
      docs/project/principles/safety_strategy.md
  3. 2 0
      proposals/README.md
  4. 65 0
      proposals/p0196.md

+ 3 - 3
docs/project/goals.md

@@ -19,7 +19,7 @@ SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
         -   [Performance-critical software](#performance-critical-software)
         -   [Software and language evolution](#software-and-language-evolution)
         -   [Code that is easy to read, understand, and write](#code-that-is-easy-to-read-understand-and-write)
-        -   [Practical safety guarantees and testing mechanisms](#practical-safety-guarantees-and-testing-mechanisms)
+        -   [Practical safety and testing mechanisms](#practical-safety-and-testing-mechanisms)
         -   [Fast and scalable development](#fast-and-scalable-development)
         -   [Modern OS platforms, hardware architectures, and environments](#modern-os-platforms-hardware-architectures-and-environments)
         -   [Interoperability with and migration from existing C++ code](#interoperability-with-and-migration-from-existing-c-code)
@@ -157,7 +157,7 @@ We believe Carbon must support:
 1.  Performance-critical software.
 2.  Software and language evolution.
 3.  Code that is easy to read, understand, and write.
-4.  Practical safety guarantees and testing mechanisms.
+4.  Practical safety and testing mechanisms.
 5.  Fast and scalable development.
 6.  Modern OS platforms, hardware architectures, and environments.
 7.  Interoperability with and migration from existing C++ code.
@@ -376,7 +376,7 @@ make the features easier to understand. It's also often the best way to ensure
 predictable performance, although supporting peak performance may require
 options for more complex implementation behavior.
 
-#### Practical safety guarantees and testing mechanisms
+#### Practical safety and testing mechanisms
 
 Our goal is to add as much language-level safety and security to Carbon as
 possible, using a hybrid strategy to balance other goals. We will do as many

+ 770 - 0
docs/project/principles/safety_strategy.md

@@ -0,0 +1,770 @@
+# Safety strategy
+
+<!--
+Part of the Carbon Language project, under the Apache License v2.0 with LLVM
+Exceptions. See /LICENSE for license information.
+SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+-->
+
+<!-- toc -->
+
+## Table of contents
+
+-   [Background](#background)
+    -   [What "safety" means in Carbon](#what-safety-means-in-carbon)
+    -   [Safety guarantees versus hardening](#safety-guarantees-versus-hardening)
+-   [Philosophy](#philosophy)
+-   [Principles](#principles)
+-   [Details](#details)
+    -   [Incremental work when safety requires work](#incremental-work-when-safety-requires-work)
+    -   [Using build modes to manage safety checks](#using-build-modes-to-manage-safety-checks)
+        -   [Debug](#debug)
+        -   [Performance](#performance)
+        -   [Hardened](#hardened)
+    -   [Managing bugs without compile-time safety](#managing-bugs-without-compile-time-safety)
+-   [Caveats](#caveats)
+    -   [Probabilistic techniques likely cannot stop attacks](#probabilistic-techniques-likely-cannot-stop-attacks)
+-   [Alternatives considered](#alternatives-considered)
+    -   [Guaranteed memory safety programming models](#guaranteed-memory-safety-programming-models)
+        -   [Guaranteed compile-time memory safety using borrow checking](#guaranteed-compile-time-memory-safety-using-borrow-checking)
+        -   [Guaranteed run-time memory safety using reference counting](#guaranteed-run-time-memory-safety-using-reference-counting)
+        -   [Guaranteed run-time memory safety using garbage collection](#guaranteed-run-time-memory-safety-using-garbage-collection)
+    -   [Build mode names](#build-mode-names)
+    -   [Performance versus safety in the hardened build mode](#performance-versus-safety-in-the-hardened-build-mode)
+    -   [Add more build modes](#add-more-build-modes)
+
+<!-- tocstop -->
+
+## Background
+
+Carbon's goal is to provide
+[practical safety and testing mechanisms](../goals.md#practical-safety-and-testing-mechanisms).
+
+### What "safety" means in Carbon
+
+Safety is protection from software bugs, whether the protection is required by
+the language or merely an implementation option. Application-specific logic
+errors can be prevented by testing, but can lead to security vulnerabilities in
+production. Safety categories will be referred to using names based on the type
+of
+[security vulnerability](<https://en.wikipedia.org/wiki/Vulnerability_(computing)#Software_vulnerabilities>)
+they protect against.
+
+Key safety categories that Carbon should address include:
+
+-   [**Memory safety**](https://en.wikipedia.org/wiki/Memory_safety) protects
+    against invalid memory accesses. Carbon uses
+    [two main subcategories](https://onlinelibrary.wiley.com/doi/full/10.1002/spe.2105)
+    for memory safety:
+
+    -   _Spatial_ memory safety protects against accessing an address that's out
+        of bounds for the source. This includes array boundaries, as well as
+        dereferencing invalid pointers such as uninitialized pointers, `NULL` in
+        C++, or manufactured pointer addresses.
+
+    -   _Temporal_ memory safety protects against accessing an address that has
+        been deallocated. This includes use-after-free for heap and
+        use-after-return for stack addresses.
+
+-   [**Type safety**](https://en.wikipedia.org/wiki/Type_safety) protects
+    against accessing valid memory with an incorrect type, also known as "type
+    confusion".
+
+-   [**Data race safety**](https://en.wikipedia.org/wiki/Race_condition#Data_race)
+    protects against racing memory access: when a thread accesses (read or
+    write) a memory location concurrently with a different writing thread and
+    without synchronizing.
+
+### Safety guarantees versus hardening
+
+In providing safety, the underlying goal is to prevent attacks from turning a
+_logic error_ into a _security vulnerability_. The three ways of doing this can
+be thought of in terms of how they prevent attacks:
+
+-   **Safety guarantees** prevent bugs. They offer a strong requirement that a
+    particular security vulnerability cannot exist. Compile-time safety checks
+    are always a safety guarantee, but safety guarantees may also be done at
+    runtime. For example:
+
+    -   At compile-time, range-based for loops offer a spatial safety guarantee
+        that out-of-bounds issues cannot exist in the absence of concurrent
+        modification of the sequence.
+
+    -   At runtime, garbage collected languages offer a temporal safety
+        guarantee because objects cannot be freed while they're still
+        accessible.
+
+-   **Error detection** checks for common logic errors at runtime. For example:
+
+    -   An array lookup function might offer spatial memory error detection by
+        verifying that the passed index is in-bounds.
+
+    -   A program can implement reference counting to detect a temporal memory
+        error by checking whether any references remain when memory is freed.
+
+-   **Safety hardening** mitigates bugs, typically by minimizing the feasibility
+    of an attack. For example:
+
+    -   [Control Flow Integrity (CFI)](https://en.wikipedia.org/wiki/Control-flow_integrity)
+        monitors for behavior which can subvert the program's control flow. In
+        [Clang](http://clang.llvm.org/docs/ControlFlowIntegrity.html), it is
+        optimized for use in release builds. Typically CFI analysis will only
+        detect a subset of attacks because it can't track each possible code
+        path separately. It should still reduce the feasibility of spatial
+        memory, temporal memory, and type attacks.
+
+    -   [Memory tagging](https://llvm.org/devmtg/2018-10/slides/Serebryany-Stepanov-Tsyrklevich-Memory-Tagging-Slides-LLVM-2018.pdf)
+        makes each attempt at an invalid read or write operation have a high
+        probability of trapping, while still not detecting the underlying bug in
+        every case. Realistic attacks require many such operations, so memory
+        tagging may stop attacks in some environments. Alternatively, the trap
+        might be asynchronous, leaving only a tiny window of time prior to the
+        attack being detected and the program terminated. This is probabilistic
+        hardening and reduces the feasibility of both spatial and temporal
+        memory attacks.
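
As a rough illustration of the first two categories above, the following C++ sketch contrasts runtime error detection on an array lookup with a range-based loop that never forms an index. `CheckedLookup` and `Sum` are hypothetical names chosen for this example. They are not part of Carbon or of the commit shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical helper (an assumption for illustration): looks up `index` in
// `values` and reports an out-of-bounds index instead of reading past the end
// of the storage. This is error detection, since the check runs at runtime.
inline std::optional<int> CheckedLookup(const std::vector<int>& values,
                                        std::size_t index) {
  if (index >= values.size()) {
    return std::nullopt;  // Spatial memory error detected.
  }
  return values[index];
}

// Range-based iteration never computes an index, so an out-of-bounds access
// cannot occur as long as the sequence is not modified during the loop.
inline int Sum(const std::vector<int>& values) {
  int total = 0;
  for (int value : values) {
    total += value;
  }
  return total;
}
```

The checked lookup still leaves a bug in the caller when it returns `std::nullopt`. It only keeps that bug from becoming an invalid memory access.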
+
+Under both error detection and safety hardening, even if a safety category is
+protected, the underlying bugs will still exist and will need to be fixed. For
+example, program termination could be used for a denial-of-service attack.
+
+## Philosophy
+
+Carbon's
+[practical safety and testing mechanisms](../goals.md#practical-safety-and-testing-mechanisms)
+will emphasize guaranteed safety where feasible without creating barriers to
+Carbon's [other goals](../goals.md), particularly performance and
+interoperability. This limits Carbon's options for guaranteed safety, and as a
+result there will be more reliance upon error detection and safety hardening.
+The language's design should incentivize safe programming, although it will not
+be required.
+
+When writing code, Carbon developers should expect to receive safety without
+needing to add safety annotations. Carbon will have optional safety annotations
+for purposes such as optimizing safety checks or providing information that
+improves coverage of safety checks.
+
+Carbon will favor compile-time safety checks because catching issues early will
+make applications more reliable. Runtime checks, either error detection or
+safety hardening, will be enabled where safety cannot be proven at compile-time.
+
+There will be three high-level use cases or directions that Carbon addresses
+through different build modes that prioritize safety checks differently:
+
+-   A [debug](#debug) oriented build mode that prioritizes detecting bugs and
+    reporting errors helpfully.
+-   A [performance](#performance) oriented build mode that skips any dynamic
+    safety checks to reduce overhead.
+-   A [hardened](#hardened) oriented build mode that prioritizes ensuring
+    sufficient safety to prevent security vulnerabilities, although it may not
+    allow detecting all of the bugs.
+
+These high-level build modes may be tuned, either to select a specific nuanced
+approach for achieving the high level goal, or to configure orthogonal
+constraints such as whether to prioritize binary size or execution speed.
+However, there is a strong desire to avoid requiring more fundamental build
+modes to achieve the necessary coverage of detecting bugs and shipping software.
+These build modes are also not expected to be interchangeable or compatible with
+each other within a single executable -- they must be a global selection.
+
+Although expensive safety checks could be provided through additional build
+modes, Carbon will favor safety checks that can be combined into these three
+build modes rather than adding more.
+
+Over time, safety should [evolve](../goals.md#software-and-language-evolution)
+using a hybrid compile-time and runtime safety approach to eventually provide a
+similar level of safety to a language that puts more emphasis on guaranteed
+safety, such as [Rust](#guaranteed-compile-time-memory-safety-using-borrow-checking). However,
+while Carbon may _encourage_ developers to modify code in support of more
+efficient safety checks, it will remain important to improve the safety of code
+for developers who cannot invest in safety-specific code modifications.
+
+## Principles
+
+-   Safety must be
+    [easy to ramp up with](../goals.md#code-that-is-easy-to-read-understand-and-write),
+    even if it means new developers don't receive the full safety that Carbon
+    can offer.
+
+    -   Developers should benefit from Carbon's safety without needing to learn
+        and apply Carbon-specific design patterns. Some safety should be enabled
+        by default, without safety-specific work, although some safety will
+        require work to opt in. Developers concerned with performance should
+        only need to disable safety in rare edge-cases.
+
+    -   Where there is a choice between safety approaches, the safe option
+        should be incentivized by making it equally easy or easier to use. If
+        there is a default, it should be the safe option. It should be
+        identifiable when the unsafe option is used. Incentives will prioritize,
+        in order:
+
+        1.  Guaranteed safety.
+        2.  Error detection.
+        3.  Safety hardening.
+        4.  Unsafe and unmitigated code.
+
+    -   Language design choices should favor more efficient implementations of
+        safety checks. They should also favor automation of testing and
+        fuzzing.
+
+-   Safety in Carbon must work with
+    [interoperable or migrated C++ code](../goals.md#interoperability-with-and-migration-from-existing-c-code),
+    so that C++ developers can readily take advantage of Carbon's improvements.
+
+    -   Safety mechanisms will ideally be designed to apply to automatically
+        migrated C++ code. Providing immediate safety improvements to Carbon
+        adopters will help motivate adoption.
+
+    -   In the other direction, safety mechanisms must not force manual
+        rewriting of C++ code in order to migrate, either by creating design
+        incompatibilities or performance degradations. Automated migration of
+        C++ code to Carbon must work for most developers, even if it forces
+        Carbon's safety design to take a different approach.
+
+    -   Carbon's safety should degrade gracefully when Carbon code calls C++
+        code, although this may require use of the Carbon toolchain to compile
+        the C++ code. Applications should be expected to use interoperability.
+        Although some safety features will be Carbon-specific, safety should not
+        stop at the language boundary.
+
+-   The rules for determining whether code will pass compile-time safety
+    checking should be articulable, documented, and easy to understand.
+
+    -   Compile-time safety checks should not change significantly across
+        different build modes. The purpose of the build modes is to determine
+        code generation.
+
+-   Each build mode will prioritize performance and safety differently:
+
+    -   The [debug build mode](#debug) will produce development-focused binaries
+        that prioritize fast iteration on code with safety checks that assist in
+        identification and debugging of errors.
+
+    -   The [performance build mode](#performance) will produce release-focused
+        binaries that prioritize performance over safety.
+
+    -   The [hardened build mode](#hardened) will produce release-focused
+        binaries that prioritize safety that is resistant to attacks at the cost
+        of performance.
+
+-   Safety checks should try to be identical across build modes.
+
+    -   There will be differences, typically due to performance overhead and
+        detection rate trade-offs of safety check algorithms.
+
+-   The number of build modes will be limited, and should be expected to remain
+    at the named three.
+
+    -   Most developers will use two build modes in their work: debug for
+        development, and either performance or hardened for releases.
+
+    -   It's important to focus on checks that are cheap enough to run as part
+        of normal development. Users are not expected to want to run additional
+        development build modes for additional sanitizers.
+
+    -   Limiting the number of build modes simplifies support for both Carbon
+        maintainers, who can focus on a more limited set of configurations, and
+        Carbon developers, who can easily choose which is better for their
+        use-case.
+
+-   Each distinct safety-related build mode (debug, performance, and hardened)
+    cannot be combined with others in the same binary.
+
+    -   Cross-binary interfaces will exist in Carbon, and will need to be used
+        by developers interested in combining libraries built under different
+        build modes.
+
+-   Although runtime safety checks should prevent logic errors from turning into
+    security vulnerabilities, the underlying logic errors will still be bugs.
+    For example, some safety checks would result in application termination;
+    this prevents execution of unexpected code, but the bug still needs to be
+    fixed.
+
+-   Developers need a strong testing methodology to engineer correct software.
+    Carbon will encourage testing and then leverage it with the checking build
+    modes to find and fix bugs and vulnerabilities.
+
+## Details
+
+### Incremental work when safety requires work
+
+Carbon is prioritizing usability of the language, particularly minimizing
+retraining of C++ developers and easing migration of C++ codebases, over the
+kind of provable safety that some other languages pursue, particularly Rust.
+
+A key motivation of Carbon is to move C++ developers to a better, safer
+language. However, if Carbon requires manually rewriting or redesigning C++ code
+in order to maintain performance, it creates additional pressure on C++
+developers to learn and spend time on safety. Safety will often not be the top
+priority for developers; as a result, Carbon must be thoughtful about how and
+when it forces developers to think about safety.
+
+Relying on multiple build modes to provide safety should fit into normal
+development workflows. Carbon can also have features to enable additional
+safety, so long as developers can start using Carbon in their applications
+_without_ learning new paradigms.
+
+Where possible, safety checks shouldn't require work on the part of Carbon
+developers. A safety check that requires no code edits or can be handled by
+automated migration may be opt-out, as there is negligible cost to developers.
+One which requires local code changes should be opt-in because costs will scale
+with codebase size. Safety check approaches which would require substantial
+redesign by developers will be disfavored based on adoption cost, even if the
+alternative is a less-comprehensive approach.
+
+### Using build modes to manage safety checks
+
+Carbon will likely start in a state where most safety checks are done at
+runtime. However, runtime detection of safety violations remains expensive. In
+order to make as many safety checks as possible available to developers, Carbon
+will adopt a strategy based on three build modes that target key use-cases.
+
+#### Debug
+
+The debug build mode targets developers who are iterating on code and running
+tests. It will emphasize detection and debuggability, especially for safety
+issues.
+
+It needs to perform well enough to be run frequently by developers, but will
+make performance sacrifices to catch more safety issues. This mode should have
+runtime checks for the most common safety issues, but it can make trade-offs
+that improve performance in exchange for less frequent, but still reliable,
+detection. Developers should do most of their testing in this build mode.
+
+The debug build mode will place a premium on the debuggability of safety
+violations. Where safety checks rely on hardening instead of guaranteed safety,
+violations should be detected with a high probability per single occurrence of
+the bug. Detected bugs will be accompanied by a detailed diagnostic report to
+ease classification and root cause identification.
+
+#### Performance
+
+The performance build mode targets the typical application that wants high
+performance from Carbon code, where performance considers processing time,
+memory, and disk space. Trade-offs will be made that maximize the performance.
+
+Only safety techniques that don't measurably impact application hot path
+performance will be enabled by default. This is a very high bar, but is crucial
+for meeting Carbon's performance goals, as well as allowing migration of
+existing C++ systems which may not have been designed with Carbon's safety
+semantics in mind.
+
+#### Hardened
+
+The hardened build mode targets applications where developers want strong safety
+against attacks in exchange for worse performance. It will work to prevent
+attacks in ways that
+[attackers cannot work around](#probabilistic-techniques-likely-cannot-stop-attacks),
+even if it means using techniques that create significant performance costs.
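
One way to picture the three build modes is as a single compile-time setting that decides what code is generated for a safety check. The C++ sketch below is only an illustration under that assumption. `BuildMode`, `Outcome`, and `CheckIndex` are hypothetical names, and the text above does not specify how Carbon would implement any of this. Treating the performance mode as fully unchecked is a simplification.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical names: the strategy document does not define these.
enum class BuildMode { kDebug, kPerformance, kHardened };
enum class Outcome { kPassed, kUnchecked, kDiagnosed, kTrapped };

// Models what a bounds check might compile to under each build mode. The mode
// is a template parameter to mirror the idea that it is fixed for the whole
// binary rather than chosen per call at runtime.
template <BuildMode Mode>
Outcome CheckIndex(std::size_t index, std::size_t size, std::string& report) {
  if constexpr (Mode == BuildMode::kPerformance) {
    // Performance: the dynamic check is omitted to avoid hot path overhead.
    return Outcome::kUnchecked;
  } else {
    if (index < size) {
      return Outcome::kPassed;
    }
    if constexpr (Mode == BuildMode::kDebug) {
      // Debug: produce a detailed diagnostic to ease root cause analysis.
      report = "index " + std::to_string(index) +
               " is out of bounds for size " + std::to_string(size);
      return Outcome::kDiagnosed;
    } else {
      // Hardened: stop the invalid access; a real build would terminate.
      return Outcome::kTrapped;
    }
  }
}
```

For example, `CheckIndex<BuildMode::kDebug>(5, 3, report)` fills `report` with a diagnostic, while the same call under `BuildMode::kHardened` only signals that execution would stop.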
+
+### Managing bugs without compile-time safety
+
+Carbon's reliance on runtime checks will allow developers to manage their
+security risk. Developers will still need to reliably find and fix the
+inevitable bugs, including both safety violations and regular business logic
+bugs. The cornerstone of managing bugs will be strong testing methodologies,
+with built-in support from Carbon.
+
+Strong testing is more than good test coverage. It means a combination of:
+
+-   Ensuring unsafe or risky operations and interfaces can easily be recognized
+    by developers.
+
+-   Using static analysis tools to detect common bugs, and ensuring they're
+    integrated into build and code review workflows. These could be viewed as
+    static testing of code.
+
+-   Writing good test coverage, including unit, integration, and system tests.
+
+-   Implementing coverage-directed fuzz testing to discover bugs outside of
+    manually authored test coverage, especially for interfaces handling
+    untrusted data. Fuzz testing is a robust way to catch bugs when APIs may be
+    used in ways developers don't consider.
+
+-   Running continuous integration, including automatic and continuous running
+    of these tests. The checked development build mode should be validated, as
+    well as any additional build modes necessary to cover different forms of
+    behavior checking.
+
+-   Easing automated testing and fuzzing through language features. For example,
+    if the language encourages value types and pure functions of some sort, they
+    can be automatically fuzzed.
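
To make the last point concrete, here is a small C++ sketch of a fuzz target for a pair of pure functions. `LLVMFuzzerTestOneInput` is the entry point used by libFuzzer. `HexEncode` and `HexDecode` are hypothetical functions written only for this example. The property being fuzzed, that decoding an encoded buffer returns the original bytes, is an assumption chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <optional>
#include <string>
#include <vector>

// Hypothetical pure function: encodes bytes as lowercase hexadecimal text.
inline std::string HexEncode(const std::vector<std::uint8_t>& bytes) {
  static constexpr char kDigits[] = "0123456789abcdef";
  std::string out;
  out.reserve(bytes.size() * 2);
  for (std::uint8_t b : bytes) {
    out.push_back(kDigits[b >> 4]);
    out.push_back(kDigits[b & 0x0F]);
  }
  return out;
}

// Hypothetical pure function: decodes lowercase hexadecimal text, or returns
// std::nullopt if the text is malformed.
inline std::optional<std::vector<std::uint8_t>> HexDecode(
    const std::string& text) {
  if (text.size() % 2 != 0) return std::nullopt;
  auto nibble = [](char c) -> int {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
  };
  std::vector<std::uint8_t> out;
  for (std::size_t i = 0; i < text.size(); i += 2) {
    const int hi = nibble(text[i]);
    const int lo = nibble(text[i + 1]);
    if (hi < 0 || lo < 0) return std::nullopt;
    out.push_back(static_cast<std::uint8_t>(hi * 16 + lo));
  }
  return out;
}

// libFuzzer entry point: aborts if the round-trip property is violated, which
// the fuzzer reports as a crash together with the offending input.
extern "C" int LLVMFuzzerTestOneInput(const std::uint8_t* data,
                                      std::size_t size) {
  const std::vector<std::uint8_t> input(data, data + size);
  const auto decoded = HexDecode(HexEncode(input));
  if (!decoded || *decoded != input) std::abort();
  return 0;
}
```

Because both functions depend only on their arguments, the harness needs no setup or mocking.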
+
+These practices are necessary for reliable, large-scale software engineering.
+Maintaining correctness of business logic over time requires continuous and
+thorough testing. Without it, such software systems cannot be changed and
+evolved over time reliably. Carbon will re-use these practices in conjunction
+with checking build modes to mitigate the limitations of Carbon's safety
+guarantees without imposing overhead on production systems.
+
+When a developer chooses to use Carbon, adhering to this kind of testing
+methodology is essential for maintaining safety. As a consequence, Carbon's
+ecosystem, including the language, tools, and libraries, will need to directly
+work to remove barriers and encourage the development of these methodologies.
+
+The reliance on testing may make Carbon a poor choice in some environments; in
+environments where such testing rigor is infeasible, a language with a greater
+degree of static checking may be better suited.
+
+## Caveats
+
+### Probabilistic techniques likely cannot stop attacks
+
+It's expected that probabilistic techniques that can be applied at the language
+level are attackable through a variety of techniques:
+
+-   The attacker might be able to attack repeatedly until it gets through.
+-   The attacker may be able to determine when the attack would be detected and
+    only run the attack when it would not be.
+-   The attacker might be able to control the test condition to make detection much
+    less likely or avoid detection completely. For example, if detection is
+    based on the last 4 bits of a memory address, an attacker may be able to
+    generate memory allocations, viewing the address and only attacking when
+    there's a collision.
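
The 4-bit example above can be put into rough numbers. The C++ sketch below assumes that each attempt is independent, that a check based on `tag_bits` uniformly random bits misses with probability 2^-`tag_bits`, and that the attacker can retry after a detected attempt. These are simplifying assumptions made for illustration, and both function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Probability that one invalid access goes undetected, assuming detection
// compares `tag_bits` uniformly random bits.
inline double MissProbability(int tag_bits) {
  return std::ldexp(1.0, -tag_bits);  // 2^-tag_bits
}

// Probability that at least one of `attempts` independent tries goes
// undetected, assuming the attacker is able to retry after each failure.
inline double EvasionProbability(int tag_bits, int attempts) {
  return 1.0 - std::pow(1.0 - MissProbability(tag_bits), attempts);
}
```

Under these assumptions, a 4-bit check misses a single attempt with probability 1/16, and an attacker who can make 100 attempts evades detection at least once with probability above 99%.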
+
+Hardware vulnerabilities may make these attacks easier than they might otherwise
+appear. Future hardware vulnerabilities are difficult to predict.
+
+Note this statement focuses on what can be applied at the language level. A
+secure hash algorithm, such as SHA-256, may be used to offer probabilistic
+defense in other situations. However, the overhead of a secure hash algorithm's
+calculation is significant in the context of most things that Carbon may do at
+the language level.
+
+Combining these issues, although it may seem like a probabilistic safety check
+could be proven to reliably detect attackers, it's likely infeasible to do so.
+For the various build modes, this means:
+
+-   The debug build mode will not typically be accessible to attackers, so where
+    a probabilistic technique provides a better developer experience, it will be
+    preferred.
+-   The performance build mode will often avoid safety checks in order to reach
+    peak performance. As a consequence, even the weak protection of a
+    probabilistic safety check may be used in order to provide _some_
+    protection.
+-   The hardened build mode will prefer non-probabilistic techniques that
+    _cannot_ be attacked.
+
+## Alternatives considered
+
+### Guaranteed memory safety programming models
+
+Multiple approaches that would offer guaranteed memory safety have been
+considered, mainly based on other languages which offer related approaches.
+Carbon will likely rely more on error detection and hardening because of what
+the models would mean for Carbon's performance and C++ migration language goals.
+
+#### Guaranteed compile-time memory safety using borrow checking
+
+Rust offers a good example of an approach for compile-time safety based on
+borrow checking, which provides guaranteed safety. For code which can't
+implement borrow checking, runtime safety using reference counting is available
+and provides reliable error detection. This approach still allows for
+[`unsafe` blocks](https://doc.rust-lang.org/rust-by-example/unsafe.html), as
+well as types that offer runtime safety while wrapping `unsafe` interfaces.
+
+Carbon could use a similar approach for guaranteed safety by default.
+
+Advantages:
+
+-   Guaranteed safety, including against data races, is provided for the
+    binaries.
+    -   The emphasis on compile-time safety limits the scope of the runtime
+        memory safety costs.
+    -   With Rust, there is early evidence that there's a significant impact in
+        reducing bugs generally.
+-   Imitating Rust's techniques would allow building on the huge work of the
+    Rust community, reducing the risks of implementing something similar in
+    Carbon.
+-   Careful use of narrow `unsafe` escape hatches can be effectively
+    encapsulated behind otherwise safe APIs.
+
+Disadvantages:
+
+-   Rust's approach to compile-time safety requires use of
+    [design patterns and idioms](https://github.com/rust-unofficial/patterns)
+    that are substantially different from C++.
+    -   Conversion of C++ code to Rust results in either rewrites of code, or
+        use of runtime safety checks that impair performance.
+    -   Requires fully modeling lifetime and exclusivity in the type system.
+    -   Data structures must be redesigned to avoid sharing mutable state.
+    -   Increases complexity of node and pointer based data structures, such as
+        linked lists.
+-   Imitating Rust's techniques may prove insufficient for achieving Carbon's
+    [compiler performance goals](../goals.md#fast-and-scalable-development).
+    Rust compilation performance suggests its borrow checking performance is
+    slow, although it's difficult to determine how significant this is or
+    whether it could be improved.
+    -   The Rust compiler
+        [is slow](https://pingcap.com/blog/rust-compilation-model-calamity),
+        although
+        [much has been done to improve it](https://blog.mozilla.org/nnethercote/2020/09/08/how-to-speed-up-the-rust-compiler-one-last-time/).
+    -   Details of type checking, particularly requiring parsing of function
+        bodies to type check signatures, as well as wide use of
+        [monomorphization](https://doc.rust-lang.org/book/ch10-01-syntax.html)
+        are likely significant contributors to Rust compilation performance.
+    -   LLVM codegen is also a significant cost for Rust compilation
+        performance.
+    -   With
+        [Fuchsia](https://fuchsia.dev/fuchsia-src/development/languages/rust) as
+        an example, in December 2020, borrow checking and type checking combined
+        account for around 10% of Rust compile CPU time, or 25% of end-to-end
+        compile time. The current cost of borrow checking is obscured both
+        because of the combination with type checking, and because Fuchsia
+        disables some compiler parallelization due to build system
+        incompatibility.
+-   The complexity of using Rust's compile-time safety may incentivize
+    unnecessary runtime checking of safety properties. For example, using
+    [`RefCell`](https://doc.rust-lang.org/std/cell/struct.RefCell.html) or
+    [`Rc`](https://doc.rust-lang.org/std/rc/struct.Rc.html) to avoid changing
+    designs to fit compile-time safety models.
+-   Some of the most essential safety tools that ease the ergonomic burden of
+    the Rust-style lifetime model (`Rc`) introduce _semantic_ differences that
+    cannot then be eliminated in a context where performance is the dominant
+    priority.
+
+It's possible to modify the Rust model in several ways in order to reduce the
+burden on C++ developers:
+
+-   Don't offer safety guarantees for data races, eliminating `RefCell`.
+    -   This would likely not avoid the need for `Rc` or `Arc`, and wouldn't
+        substantially reduce the complexity.
+-   Require manual destruction of `Rc`, allowing safety checks to be disabled in
+    the performance build mode to eliminate overhead.
+    -   This still requires redesigning C++ code to take advantage of `Rc`.
+    -   The possibility of incorrect manual destruction means that the safety
+        issue is being turned into a bug, which means it is hardening and no
+        longer a safety guarantee.
+    -   Carbon can provide equivalent hardening through techniques such as
+        [MarkUs](https://www.cl.cam.ac.uk/~tmj32/papers/docs/ainsworth20-sp.pdf),
+        which does not require redesigning C++ code.
+
+Overall, Carbon is making a compromise around safety in order to give a path for
+C++ to evolve. C++ developers must be comfortable migrating their codebases, and
+able to do so in a largely automated manner. In order to achieve automated
+migration, Carbon cannot require fundamental redesigns of migrated C++ code.
+While a migration tool could in theory mark all migrated code as `unsafe`,
+Carbon should use a safety strategy that degrades gracefully and offers
+improvements for C++ code, whether migrated or not.
+
+That does not mean Carbon will never adopt guaranteed safety by default, only
+that performance and migration of C++ code takes priority, and any design will
+need to be considered in the context of other goals. It should still be possible
+to adopt guaranteed safety later, although it will require identifying a
+migration path.
+
+#### Guaranteed run-time memory safety using reference counting
+
+[Reference counting](https://en.wikipedia.org/wiki/Reference_counting) is a
+common memory safety model, with Swift as a popular example.
+
+Advantages:
+
+-   Simple model for safety, particularly as compared with Rust.
+-   Safe for all of the most common and important classes of memory safety bugs.
+
+Disadvantages:
+
+-   Safety based on reference counting introduces significant performance costs,
+    and tools for controlling these costs are difficult.
+    -   Safety based on garbage collection has less direct performance overhead,
+        but has a greater unpredictability of performance.
+-   Significant design differences versus C++ still result, as the distinction
+    between value types and "class types" becomes extremely important.
+    -   Class types are held by a reference counted pointer and are thus
+        lifetime safe.
+
+In order to mitigate the performance overhead, Swift does have a proposal to add
+an option for unique ownership, although the specifics are not designed yet. The
+unique ownership approach is expected to require unowned and unsafe access, so
+it would not be considered to improve the safety trade-offs.
+
+Swift was designated by Apple as the replacement for Objective-C. The safety
+versus performance trade-offs that it makes fit Apple's priorities. Carbon's
+performance goals should lead to different trade-off decisions with a higher
+priority on peak performance, which effectively rules out broad use of reference
+counting.
+
+#### Guaranteed run-time memory safety using garbage collection
+
+[Garbage collection](<https://en.wikipedia.org/wiki/Garbage_collection_(computer_science)>)
+is a common memory safety model, with Java as a popular example.
+
+Advantages:
+
+-   This approach is among the most robust and well studied models, with decades
+    of practical usage and analysis for security properties.
+-   Extremely suitable for efficient implementation on top of a virtual machine,
+    such as the JVM.
+
+Disadvantages:
+
+-   Extremely high complexity to fully understand the implications of complex
+    cases like data races.
+-   Performance overhead is significant relative to what Carbon would be
+    willing to consider.
+    -   Garbage collection remains a difficult performance problem, even for the
+        JVM and its extensive optimizations.
+    -   The complexity of the implementation makes it difficult to _predict_
+        performance; for example, Java applications experience latency spikes
+        when garbage collection runs.
+
+Java is a good choice for many applications, but Carbon is working to focus on a
+set of performance priorities that would be difficult to achieve with a garbage
+collector.
+
+### Build mode names
+
+The build mode concepts are difficult to name. Other names that were evaluated,
+and are ultimately similar, are:
+
+-   "Debug" is a common term for the intended use of this build mode. Also,
+    tooling, including Visual Studio, frequently uses the term "debug" to
+    describe similar build modes.
+
+    -   "Development" was also considered, but this term is less specific and
+        would be better for describing all non-release builds together. For
+        example, a "fast build" mode might be added that disables safety checks
+        to improve iteration time, much as defining C++'s `NDEBUG` macro
+        disables `assert` checks.
+
+-   "Performance" aligns with the phrasing of the language performance goal.
+
+    -   "Optimized" implies that other modes would not be fully optimized, but
+        hardened should be optimized.
+
+    -   "Fast" would suggest that speed is the only aspect of performance being
+        optimized, but "performance" also optimizes for memory usage and
+        binary size.
+
+-   "Hardened" is the choice for succinctly describing the additional safety
+    measures that will be taken, and is a well-known term in the safety space.
+    It could be incorrectly inferred that "performance" has no hardening, but
+    the preference is to clearly indicate the priority of the "hardened" build
+    mode.
+
+    -   "Safe" implies something closer to guaranteed safety. However, safety
+        bugs should be expected to result in program termination, which can
+        still be used in other attacks, such as Denial-of-Service.
+
+    -   "Mitigated" is an overloaded term, and it may not be succinctly clear
+        that it's about security mitigations.
+
+-   Some terms which were considered and don't fit well into the above groups
+    are:
+
+    -   "Release" is avoided because both "performance" and "hardened" could be
+        considered to be "release" build modes.
+
+The names "performance" and "hardened" may lead to misinterpretations, with some
+developers who should use "hardened" using "performance" because they are
+worried about giving up too much performance, and the other way around. The
+terms try to balance the utility of well-known terminology with the succinctness
+of a short phrase for build modes, and that limits the expressivity. Some
+confusion is expected, and documentation as well as real-world experience (for
+example, a developer who cares about latency benchmarking both builds) should be
+expected to help mitigate mix-ups.
+
+### Performance versus safety in the hardened build mode
+
+The performance cost of safety techniques is expected to be non-linear with
+respect to detection rates. For example, a particular vulnerability such as heap
+use-after-free may be detectable with 99% accuracy at 20% performance cost, but
+100% accuracy at 50% performance cost. At present, build modes should be
+expected to evaluate such a scenario as:
+
+-   The debug build mode would choose the 99% accurate approach.
+    -   Detecting safety issues is valuable for debugging.
+    -   The probabilistic detection rate won't meaningfully affect accuracy of
+        tests.
+    -   The lower performance cost improves developer velocity.
+-   The performance build mode would decline detection.
+    -   Safety checks with a measurable performance cost should be declined.
+-   The hardened build mode would choose the 100% accurate approach.
+    -   Safety must be non-probabilistic in order to reliably prevent attacks.
+    -   Significant performance hits are acceptable.
+    -   This means the hardened build mode may be slower than the debug build
+        mode.
+
+In order to achieve better performance, the hardened build mode could make
+trade-offs closer to the debug build mode. Rather than relying on
+non-probabilistic techniques, it could instead offer a probability-based chance
+of detecting a given attack.
+
+Advantages:
+
+-   Probabilistic safety should come at lower performance cost (including CPU,
+    memory, and disk space).
+    -   This will sometimes be significant, and as a result of multiple checks,
+        could be the difference between the hardened build mode being 50% slower
+        than the performance build mode and being 200% slower.
+
+Disadvantages:
+
+-   [Probabilistic techniques likely cannot stop attacks](#probabilistic-techniques-likely-cannot-stop-attacks).
+    -   Attackers may be able to repeat attacks until they succeed.
+    -   The variables upon which the probability is based, such as memory
+        addresses, may be manipulable by the attacker. As a consequence, a
+        determined attacker may be able to manipulate probabilities and not even
+        be detected.
+
+Although performance is
+[Carbon's top goal](../goals.md#language-goals-and-priorities), the hardened
+build mode exists to satisfy developers and environments that value safety more
+than performance. The hardened build mode will rely on non-probabilistic safety
+at significant performance cost because other approaches will be insufficient to
+guard against determined attackers.
+
+### Add more build modes
+
+More build modes could be added to this principle, or the principle could
+encourage the idea that specific designs may add more.
+
+To explain why there are three build modes:
+
+-   The concepts of debug and release (sometimes called opt) build modes are
+    common; for example, in
+    [Visual Studio](https://docs.microsoft.com/en-us/visualstudio/debugger/how-to-set-debug-and-release-configurations?view=vs-2019).
+    In Carbon, these could be considered to translate to the "debug" and
+    "performance" build modes by default.
+
+-   The hardened build mode is added in order to emphasize security. Although
+    hardened could be implemented as a set of options passed to the standard
+    release build mode, the preference is to focus on it as an important
+    feature.
+
+An example of why another build mode may be needed is
+[ThreadSanitizer](https://clang.llvm.org/docs/ThreadSanitizer.html), which is
+noted as having 5-15x slowdown and 5-10x memory overhead. This is infeasible for
+normal use, but could be useful for some users in a separate build mode. A
+trade-off that's possible for Carbon is instead using an approach similar to
+[KCSAN](https://github.com/google/ktsan/wiki/KCSAN) which offers relatively
+inexpensive but lower-probability race detection.
+
+Although options to these build modes may be supported to customize deployments,
+the preference is to focus on a small set and make them behave well. For
+example, if a separate build mode is added for ThreadSanitizer, it should be
+considered a temporary solution until it can be merged into the debug build
+mode.
+
+Advantages:
+
+-   Grants more flexibility for using build modes as a solution to problems.
+    -   For safety, this would allow checks that have high overhead but also a
+        high detection rate to be provided as separate build modes.
+    -   With other systems, there could be non-safety performance versus
+        behavior trade-offs.
+
+Disadvantages:
+
+-   Having standard modes simplifies validation of interactions between various
+    safety checks.
+    -   Safety is the only reason that's been considered for adding build modes.
+-   As more build modes are added, the chance of developers being confused and
+    choosing the wrong build mode for their application increases.
+
+Any long-term additions to the set of build modes will need to update this
+principle, raising the visibility and requiring more consideration of such an
+addition. If build modes are added for non-safety-related reasons, this may lead
+to moving build modes out of the safety strategy.
+
+**Experiment**: This can be considered an experiment. Carbon may eventually add
+more than the initial three build modes, although the reticence to add more is
+likely to remain.

+ 2 - 0
proposals/README.md

@@ -57,6 +57,8 @@ request:
     -   [0175 - Decision](p0175_decision.md)
 -   [0179 - Create a toolchain team.](p0179.md)
     -   [0179 - Decision](p0179_decision.md)
+-   [0196 - Language-level safety strategy](p0196.md)
+    -   [0196 - Decision](p0196_decision.md)
 -   [0198 - Comments](p0198.md)
 -   [0199 - String literals](p0199.md)
 

+ 65 - 0
proposals/p0196.md

@@ -0,0 +1,65 @@
+# Language-level safety strategy
+
+<!--
+Part of the Carbon Language project, under the Apache License v2.0 with LLVM
+Exceptions. See /LICENSE for license information.
+SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+-->
+
+[Pull request](https://github.com/carbon-language/carbon-lang/pull/196)
+
+<!-- toc -->
+
+## Table of contents
+
+-   [Problem](#problem)
+-   [Background](#background)
+-   [Proposal](#proposal)
+-   [Open question: probabilistic checks in the hardened build mode](#open-question-probabilistic-checks-in-the-hardened-build-mode)
+
+<!-- tocstop -->
+
+## Problem
+
+Carbon needs to have a clear and consistent strategy for approaching the
+problems of language-level safety. These problems have been persistent and
+growing sources of both bugs and security vulnerabilities in C and C++ software.
+Failure to effectively and carefully address safety concerns is likely to
+undermine any hope of Carbon being a successful path forward for today's C++
+users.
+
+## Background
+
+-   [Fearless Security: Memory Safety](https://hacks.mozilla.org/2019/01/fearless-security-memory-safety/)
+-   [A proactive approach to more secure code](https://msrc-blog.microsoft.com/2019/07/16/a-proactive-approach-to-more-secure-code/)
+-   [Chromium memory safety](https://www.chromium.org/Home/chromium-security/memory-safety)
+-   [MemSafe](https://doi.org/10.1002/spe.2105)
+    -   Notably introduces the terms "_spatial_" and "_temporal_" safety.
+
+## Proposal
+
+We propose a safety strategy for Carbon that aims for incrementally increasing
+the compile-time proven safety while allowing for dynamic checks to cover what
+remains. It also prioritizes dynamic safety checks that are amenable to being
+optimized away or being manually disabled for performance-critical use cases
+where the added dynamic protections are not a viable trade-off.
+
+## Open question: probabilistic checks in the hardened build mode
+
+This proposal explicitly discourages probabilistic checks in the hardened build
+mode because they won't reliably prevent security attacks. Does the core team
+believe that we should put a cap on how much performance should be sacrificed
+for safety, putting more emphasis on probabilistic methods that would allow more
+attacks through?
+
+For example,
+[heap use-after-free detection](https://docs.google.com/document/d/14-_JAMoMBIVgKOUi3iZIRMHJlAyT23kqSVR8wxJPk9M/edit)
+with 100% accuracy is expected to be a significant expense for hardened builds.
+MarkUs is estimated to cost 10-50% CPU and 25% RAM in order to catch 100% of
+issues. For comparison, MTE is estimated to cost 0-20% CPU and 3-6% RAM in order
+to catch 93% of issues.
+
+The CPU and RAM cost of MarkUs is significant, even by comparison with other
+techniques, and costs will add up as more safety is added. 93% is a reasonably
+high detection rate for a performance-efficient, probabilistic technique. Would
+the core team expect to use MTE instead of MarkUs in hardened builds?