Replace the toolchain README with a docs link. (#2482)

The doc is more up-to-date than the readme right now, and I'm not ready to migrate it back for the moment, but this should at least make it clear what the status is.
Jon Ross-Perkins 3 years ago
parent
commit
6accdfff77
1 changed file with 3 additions and 195 deletions

+ 3 - 195
toolchain/README.md

@@ -6,198 +6,6 @@ Exceptions. See /LICENSE for license information.
 SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
 -->
 
-The toolchain represents the production portion of Carbon. At a high level, the
-toolchain's top priorities are:
-
--   Correctness.
--   Quality of generated code, including its performance.
--   Compilation performance.
--   Quality of diagnostics for incorrect or questionable code.
-
-TODO: Add an expanded document that fully explains the goals and priorities and
-link to it here.
-
-The compiler is organized into a collection of libraries that can be used
-independently. This includes the `//toolchain/driver` libraries that orchestrate
-the typical and expected compilation flow using the other libraries. The driver
-also includes the primary command-line tool: `//toolchain/driver:carbon`.
-
-The typical compilation flow of data is:
-
-1. Load the file into a [SourceBuffer](source/source_buffer.h).
-2. Lex a `SourceBuffer` into a [TokenizedBuffer](lexer/tokenized_buffer.h).
-3. Parse a `TokenizedBuffer` into a [ParseTree](parser/parse_tree.h).
-4. Transform a `ParseTree` into a [SemanticsIR](semantics/semantics_ir.h).
-5. This flow is still incomplete: code generation, using LLVM, is still
-   required.
-
-## Lexing
-
-The [TokenizedBuffer](lexer/tokenized_buffer.h) is the central point of lexing.
-
-The entire source buffer is converted into tokens before parsing begins. Tokens
-are referred to by an opaque handle, `TokenizedBuffer::Token`, which is
-represented as a dense integer index into the buffer. The tokenized buffer can
-be queried to discover information about a token, such as its token kind, its
-location in the source file, and its spelling.
-
-The lexer ensures that all forms of brackets are matched, and is intended to
-recover from missing brackets based on contextual cues such as indentation
-(although this is not yet implemented), inserting matching close bracket tokens
-where it thinks they belong. After the lexer completes, every opening bracket
-token has a matching closing bracket token.
-
-## Parsing
-
-The [ParseTree](parser/parse_tree.h) is the output of parsing, but most logic is
-in [Parser](parser/parser.h).
-
-The parse tree faithfully represents the tree structure of the source program,
-interpreted according to the Carbon grammar. No semantics are associated with
-the tree structure at this level, and no name lookup is performed.
-
-Each parse tree node has an expected structure, corresponding to the grammar of
-the Carbon language, and the parser ensures that a valid parse tree node always
-has a valid structure. However, any parse tree node can be marked as invalid,
-and an invalid parse tree node can contain child nodes of any kind in any order.
-This is intended to model the situation where parsing failed because the code
-did not match the grammar, but we were still able to parse some subexpressions,
-as an aid for non-compiler tools such as syntax highlighters or refactoring
-tools.
-
-The produced `ParseTree` is in postorder. For example, given the code:
-
-```carbon
-fn foo() -> f64 {
-  return 42;
-}
-```
-
-The node order is (with indentation to indicate nesting):
-
-```
-    {node_index: 0, kind: 'FunctionIntroducer', text: 'fn'}
-    {node_index: 1, kind: 'DeclaredName', text: 'foo'}
-      {node_index: 2, kind: 'ParameterListEnd', text: ')'}
-    {node_index: 3, kind: 'ParameterList', text: '(', subtree_size: 2}
-      {node_index: 4, kind: 'Literal', text: 'f64'}
-    {node_index: 5, kind: 'ReturnType', text: '->', subtree_size: 2}
-  {node_index: 6, kind: 'FunctionDefinitionStart', text: '{', subtree_size: 7}
-    {node_index: 7, kind: 'Literal', text: '42'}
-    {node_index: 8, kind: 'StatementEnd', text: ';'}
-  {node_index: 9, kind: 'ReturnStatement', text: 'return', subtree_size: 3}
-{node_index: 10, kind: 'FunctionDefinition', text: '}', subtree_size: 11}
-{node_index: 11, kind: 'FileEnd', text: ''}
-```
-
-This ordering is focused on efficient translation into the SemanticsIR.
-Non-template code should be type-checked as soon as nodes are encountered,
-decreasing SemanticsIR mutations.
-
-While sometimes the beginning of the grammatical construct will be the parent,
-where introducer keywords are used, it will often be the _end_ of the
-grammatical construct that is the parent: this is so that a postorder traversal
-of the tree can see the kind of grammatical construct being built first, and
-handle child nodes taking that into account.
-
-TODO: Document flow.
-
-## Semantics
-
-The [SemanticsIR](semantics/semantics_ir.h) is the output of semantic
-processing.
-
-The intent is that a `SemanticsIR` looks closer to a series of instructions than
-a tree. This is in order to better align with the LLVM IR structure which will
-be used for code generation.
-
-This phase should eventually include semantic checking of the SemanticsIR, but
-it's a work in progress.
-
-## Diagnostics
-
-### DiagnosticEmitter
-
-[DiagnosticEmitters](diagnostics/diagnostic_emitter.h) handle the main
-formatting of a message. It's parameterized on a location type, for which a
-`DiagnosticLocationTranslator` must be provided that can translate the location
-type into a standardized `DiagnosticLocation` of file, line, and column.
-
-When emitting, the resulting formatted message is passed to a
-`DiagnosticConsumer`.
-
-### DiagnosticConsumers
-
-`DiagnosticConsumers` handle output of diagnostic messages after they've been
-formatted by an `Emitter`. Important consumers are:
-
--   [ConsoleDiagnosticConsumer](diagnostics/diagnostic_emitter.h): prints
-    diagnostics to console.
--   [ErrorTrackingDiagnosticConsumer](diagnostics/diagnostic_emitter.h): counts
-    the number of errors produced, particularly so that it can be determined
-    whether any errors were encountered.
--   [SortingDiagnosticConsumer](diagnostics/sorting_diagnostic_consumer.h):
-    sorts diagnostics by line so that diagnostics are seen in terminal based on
-    their order in the file rather than the order they were produced.
--   [NullDiagnosticConsumer](diagnostics/null_diagnostics.h): suppresses
-    diagnostics, particularly for tests.
-
-### Producing diagnostics
-
-Diagnostics are used to surface issues from compilation. A simple diagnostic
-looks like:
-
-```cpp
-CARBON_DIAGNOSTIC(InvalidCode, Error, "Code is invalid");
-emitter.Emit(location, InvalidCode);
-```
-
-Here, `CARBON_DIAGNOSTIC` defines a static instance of a diagnostic named
-`InvalidCode` with the associated severity (`Error` or `Warning`).
-
-The `Emit` call produces a single instance of the diagnostic. When emitted,
-`"Code is invalid"` will be the message used. The type of `location` depends on
-the `DiagnosticEmitter`.
-
-A diagnostic with an argument looks like:
-
-```cpp
-CARBON_DIAGNOSTIC(InvalidCharacter, Error, "Invalid character `{0}`.", char);
-emitter.Emit(location, InvalidCharacter, invalid_char);
-```
-
-Here, the additional `char` argument to `CARBON_DIAGNOSTIC` specifies the type
-of an argument to expect for message formatting. The `invalid_char` argument to
-`Emit` provides the matching value. It's then passed along with the diagnostic
-message format to `llvm::formatv` in order to produce the final diagnostic
-message.
-
-#### Diagnostic registry
-
-There is a [registry](diagnostics/diagnostic_registry.def) which all diagnostics
-must be added to. Each diagnostic has a line like:
-
-```cpp
-CARBON_DIAGNOSTIC_KIND(InvalidCode)
-```
-
-This produces a central enumeration of all diagnostics. The eventual intent is
-to require tests for every diagnostic that can be produced, but that isn't
-currently implemented.
-
-#### `CARBON_DIAGNOSTIC` placement
-
-Idiomatically, `CARBON_DIAGNOSTIC` will be adjacent to the `Emit` call. However,
-this is only because many diagnostics can only be produced in one code location.
-If they can be produced in multiple locations, they will be at a higher scope so
-that multiple `Emit` calls can reference them. When in a function,
-`CARBON_DIAGNOSTIC` should be placed as close as possible to the usage so that
-it's easier to see the associated output.
-
-### Diagnostic context
-
-In the future, we'll want to provide additional context for errors. For example,
-if there's a function parameter mismatch, it may be useful to point both at the
-caller and function signature compared. However, at present the emitter only
-produces errors on one location. This is something that we need to consider
-further, and will probably involve further changes to diagnostic handling.
+A design is currently maintained in
+[Google Drive](https://docs.google.com/document/d/1RRYMm42osyqhI2LyjrjockYCutQ5dOf8Abu50kTrkX0/edit).
+It'll be migrated to markdown once we are confident in its stability.
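
For readers of the removed Parsing section: the postorder layout with `subtree_size` means a parent's direct children can be recovered by walking backwards from the parent and skipping over each child's subtree. The following is a minimal sketch of that idea, not the toolchain's actual API. The `Node` struct and the `ChildrenOf` and `ExampleNodes` helpers are hypothetical names introduced only for illustration. It assumes that nodes shown without a `subtree_size` in the listing have a size of 1.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified node record; the real ParseTree stores more.
struct Node {
  std::string kind;
  int subtree_size;  // Counts the node itself plus all of its descendants.
};

// Returns the indices of the direct children of `parent`, last child first.
// In postorder, the subtree of `parent` occupies the contiguous index range
// [parent - subtree_size + 1, parent], with `parent` itself at the end.
auto ChildrenOf(const std::vector<Node>& nodes, int parent)
    -> std::vector<int> {
  std::vector<int> children;
  int first = parent - nodes[parent].subtree_size + 1;
  // Each step jumps over one whole child subtree.
  for (int child = parent - 1; child >= first;
       child -= nodes[child].subtree_size) {
    children.push_back(child);
  }
  return children;
}

// The node sequence from the listing in the removed README text, with an
// assumed subtree_size of 1 where the listing omits it.
auto ExampleNodes() -> std::vector<Node> {
  return {
      {"FunctionIntroducer", 1},       // 0
      {"DeclaredName", 1},             // 1
      {"ParameterListEnd", 1},         // 2
      {"ParameterList", 2},            // 3
      {"Literal", 1},                  // 4
      {"ReturnType", 2},               // 5
      {"FunctionDefinitionStart", 7},  // 6
      {"Literal", 1},                  // 7
      {"StatementEnd", 1},             // 8
      {"ReturnStatement", 3},          // 9
      {"FunctionDefinition", 11},      // 10
      {"FileEnd", 1},                  // 11
  };
}
```

With this sketch, `ChildrenOf(ExampleNodes(), 10)` yields the `ReturnStatement` node (9) followed by the `FunctionDefinitionStart` node (6), which matches the nesting shown by the indentation in the listing.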