
Improve correctness of our Clang tooling infrastructure. (#392)

This restructures the `compile_flags.txt` to use the downloaded libc++ system
headers and avoid needing a virtual include directory to be built. It still
needs _some_ Bazel build to complete before working in order to have the libc++
system headers downloaded and the symlink to the Bazel tree created.

One (very) tricky part of making this work is to work around bugs in Clang's
tooling layer that incorrectly handle `..` path components after traversing
symlinks. To avoid this, we add custom symlinks (`bazel-execroot` and
`bazel-clang-toolchain`) that hide the relevant traversal of the Bazel layout to
find build artifacts and the downloaded toolchain. These symlinks will be broken
until a build with Bazel downloads the toolchain and creates the basic output
tree structure.
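
To illustrate why `..` after a symlink is ambiguous, here is a minimal Python sketch. It is not a reproduction of the Clang bug itself; the directory names `real`, `deep`, and `link` are made up for illustration, and it assumes a POSIX system where creating symlinks is permitted. It contrasts purely textual path normalization with resolution that follows the symlink first.

```python
import os
import tempfile


def compare_parent_resolution(base):
    """Return (lexical, physical) results for `link/..` under `base`."""
    # Resolve the base first so both results are comparable even when the
    # temporary directory itself sits behind a symlink.
    base = os.path.realpath(base)
    target = os.path.join(base, "real", "deep")
    os.makedirs(target)
    link = os.path.join(base, "link")
    os.symlink(target, link)

    path = os.path.join(link, "..")
    lexical = os.path.normpath(path)  # Collapses `link/..` textually.
    physical = os.path.realpath(path)  # Follows the symlink, then applies `..`.
    return lexical, physical


with tempfile.TemporaryDirectory() as tmp:
    lexical, physical = compare_parent_resolution(tmp)
    print("lexical: ", lexical)
    print("physical:", physical)
```

The two results name different directories. Paths that go through a dedicated symlink contain no `..` for a tool to collapse textually, since the operating system resolves the symlink target itself.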

It also adds a `create_compdb.py` script. Running this script improves the
tooling fidelity by taking a few steps:

1. It queries Bazel to find all the relevant files and adds them to a
   `compile_commands.json` database that allows `clangd` and other tools to
   index the entire project for improved cross-references, etc.
2. It builds all the generated files with Bazel so that they can be included
   successfully. This is very fast in my testing, taking only tens of seconds,
   and is also very likely to be cached effectively.
3. It translates the arguments from `compile_flags.txt` to make them
   persistently use the built generated files include paths so that nothing
   breaks even as different targets are built potentially with different
   configurations.
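
As a rough sketch of what one entry in the resulting database looks like, the following Python mirrors how the script assembles it. The workspace path `/work/carbon`, the shortened flag list, and the helper name `make_entry` are assumptions for illustration. Unlike the script, this sketch also skips blank lines.

```python
import json


def make_entry(directory, flags_text, source_file):
    """Build one compile_commands.json entry from compile_flags.txt text."""
    # compile_flags.txt holds one argument per line.
    arguments = [
        line.strip() for line in flags_text.splitlines() if line.strip()
    ]
    # The compiler path goes first, and the file being compiled goes last.
    arguments = ["bazel-clang-toolchain/bin/clang"] + arguments
    return {
        "directory": directory,
        "file": source_file,
        "arguments": arguments + [source_file],
    }


flags = "-iquote\nbazel-bin\n-std=c++17\n"
entry = make_entry("/work/carbon", flags, "common/check.h")
print(json.dumps([entry], indent=2))
```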

There are still some limitations.

- It still requires running Bazel before anything works, even if only a fast
  run.
- It will require re-running if new generated files are added and needed but not
  built.
- It assumes that the standard Bazel symlink names are used and available.

Much of the Python here was written by @geoffromer in #384 -- I've adapted it
here after discussing to try to fill in some of the blanks and use a slightly
different approach to querying Bazel. I use the normal `bazel query` rather than
`bazel aquery`. This, for example, allows the index to reliably cover header
files in header-only libraries more directly (rather than relying on transitive
inclusion). It also seems a bit simpler to parse, but that is a pretty minor
difference.
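
To show how little parsing the `bazel query --output=location` approach needs, here is a small sketch that applies the same splitting and filtering as the script to two sample lines. The sample lines, the workspace path `/work/carbon`, and the helper name `split_sources` are invented for illustration, and the exact line format is assumed to be `path:line:column: kind label`.

```python
from pathlib import PurePosixPath

# Hypothetical query output; real output depends on the machine and Bazel.
SAMPLE_OUTPUT = (
    "/work/carbon/common/check.h:1:1: source file //common:check.h\n"
    "/cache/external/llvm-project/llvm/lib/Support/Path.cpp:1:1: "
    "source file @llvm-project//llvm:lib/Support/Path.cpp\n"
)


def split_sources(query_output, workspace):
    """Split query output into workspace files and LLVM files."""
    workspace = PurePosixPath(workspace)
    # The file path is everything before the first colon on each line.
    files = [
        PurePosixPath(line.split(":")[0])
        for line in query_output.splitlines()
    ]
    # Files under the workspace root become workspace-relative paths.
    carbon = [
        f.relative_to(workspace)
        for f in files
        if f.parts[: len(workspace.parts)] == workspace.parts
    ]
    # LLVM files are re-rooted under the `bazel-execroot` symlink.
    llvm = [
        PurePosixPath("bazel-execroot/external").joinpath(
            *f.parts[f.parts.index("llvm-project") :]
        )
        for f in files
        if "llvm-project" in f.parts
    ]
    return carbon, llvm


carbon, llvm = split_sources(SAMPLE_OUTPUT, "/work/carbon")
print([str(p) for p in carbon])
print([str(p) for p in llvm])
```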

Co-authored-by: Geoffrey Romer <gromer@google.com>
Chandler Carruth, 5 years ago
parent
commit e440b07eb3
7 files changed with 178 additions and 10 deletions
  1. .gitignore (+6 -0)
  2. .pre-commit-config.yaml (+1 -1)
  3. bazel-clang-toolchain (+1 -0)
  4. bazel-execroot (+1 -0)
  5. compile_flags.txt (+11 -8)
  6. scripts/create_compdb.py (+156 -0)
  7. setup.cfg (+2 -1)

+ 6 - 0
.gitignore

@@ -10,3 +10,9 @@
 
 # VSCode creates this directory in ways that are hard to prevent.
 /.vscode/
+
+# Directories created by clangd
+/.cache/clangd
+
+# Compilation database used by clangd
+compile_commands.json

+ 1 - 1
.pre-commit-config.yaml

@@ -17,7 +17,7 @@ repos:
       - id: check-executables-have-shebangs
       - id: check-merge-conflict
       - id: check-symlinks
-        exclude: '^website/jekyll/site/_includes$'
+        exclude: '^(website/jekyll/site/_includes|bazel-(clang-toolchain|execroot))$'
       - id: check-yaml
       - id: detect-private-key
       - id: end-of-file-fixer
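
A quick way to sanity-check the new `exclude` pattern is to run it against a few candidate paths. This sketch assumes the pattern is applied with Python's `re.search`, as pre-commit does for `exclude`; the non-matching paths are made-up examples.

```python
import re

# The pattern from the updated `check-symlinks` hook.
EXCLUDE = re.compile(
    r"^(website/jekyll/site/_includes|bazel-(clang-toolchain|execroot))$"
)


def is_excluded(path):
    """Return True if the hook would skip `path`."""
    return EXCLUDE.search(path) is not None


for candidate in [
    "website/jekyll/site/_includes",
    "bazel-clang-toolchain",
    "bazel-execroot",
    "bazel-bin",
    "bazel-execroot/external",
]:
    print(candidate, is_excluded(candidate))
```

Because the pattern is anchored at both ends, only the two checked-in symlinks and the existing website path are skipped; other `bazel-*` names would still be checked.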

+ 1 - 0
bazel-clang-toolchain

@@ -0,0 +1 @@
+bazel-out/../../../external/bootstrap_clang_toolchain

+ 1 - 0
bazel-execroot

@@ -0,0 +1 @@
+bazel-out/../../carbon
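
Since both symlinks dangle until Bazel has created its output tree, a small check like the following can report which ones are still broken. This is a sketch, not part of the commit; the helper name `broken_symlinks` is hypothetical, and the demonstration uses a temporary directory on a POSIX system in place of a real workspace.

```python
import os
import tempfile


def broken_symlinks(root, names=("bazel-execroot", "bazel-clang-toolchain")):
    """Return the names under `root` that are symlinks with no target."""
    broken = []
    for name in names:
        path = os.path.join(root, name)
        # `islink` inspects the link itself; `exists` follows it.
        if os.path.islink(path) and not os.path.exists(path):
            broken.append(name)
    return broken


with tempfile.TemporaryDirectory() as root:
    # Dangling: there is no `bazel-out` in this directory yet.
    os.symlink("bazel-out/../../carbon", os.path.join(root, "bazel-execroot"))
    # Resolvable: points at the directory itself, purely for the demo.
    os.symlink(".", os.path.join(root, "bazel-clang-toolchain"))
    print(broken_symlinks(root))
```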

+ 11 - 8
compile_flags.txt

@@ -43,35 +43,38 @@
 -iquote
 bazel-bin
 -iquote
-bazel-carbon-lang/external/llvm-project
+bazel-execroot/external/llvm-project
 -iquote
 bazel-bin/external/llvm-project
 -iquote
-bazel-carbon-lang/external/llvm_terminfo
+bazel-execroot/external/llvm_terminfo
 -iquote
 bazel-bin/external/llvm_terminfo
 -iquote
-bazel-carbon-lang/external/llvm_zlib
+bazel-execroot/external/llvm_zlib
 -iquote
 bazel-bin/external/llvm_zlib
 -iquote
-bazel-carbon-lang/external/bazel_tools
+bazel-execroot/external/bazel_tools
 -iquote
 bazel-bin/external/bazel_tools
--Ibazel-bin/external/llvm-project/llvm/_virtual_includes/gtest_internal_headers
+-Ibazel-execroot/external/llvm-project/llvm/utils/unittest/googletest/src
 -isystem
-bazel-carbon-lang/external/llvm-project/llvm/include
+bazel-execroot/external/llvm-project/llvm/include
 -isystem
 bazel-bin/external/llvm-project/llvm/include
 -isystem
-bazel-carbon-lang/external/llvm-project/llvm/utils/unittest/googlemock/include
+bazel-execroot/external/llvm-project/llvm/utils/unittest/googlemock/include
 -isystem
 bazel-bin/external/llvm-project/llvm/utils/unittest/googlemock/include
 -isystem
-bazel-carbon-lang/external/llvm-project/llvm/utils/unittest/googletest/include
+bazel-execroot/external/llvm-project/llvm/utils/unittest/googletest/include
 -isystem
 bazel-bin/external/llvm-project/llvm/utils/unittest/googletest/include
 -std=c++17
+-nostdinc++
+-isystem
+bazel-clang-toolchain/include/c++/v1
 -no-canonical-prefixes
 -Wno-builtin-macro-redefined
 -D__DATE__="redacted"

+ 156 - 0
scripts/create_compdb.py

@@ -0,0 +1,156 @@
+#!/usr/bin/env python3
+
+"""Create a compilation database for Clang tools like `clangd`.
+
+If you want `clangd` to be able to index this project, run this script from
+the workspace root to generate a rich compilation database. After the first
+run, you should only need to run it if you encounter `clangd` problems, or if
+you want `clangd` to build an up-to-date index of the entire project. Note
+that in the latter case you may need to manually clear and rebuild clangd's
+index after running this script.
+
+Note that this script will build generated files in the Carbon project and
+otherwise touch the Bazel build. It works to do the minimum amount necessary.
+Once setup, generally subsequent builds, even of small parts of the project,
+different configurations, or that hit errors won't disrupt things. But, if
+you do hit errors, you can get things back to a good state by fixing the
+build of generated files and re-running this script.
+"""
+
+__copyright__ = """
+Part of the Carbon Language project, under the Apache License v2.0 with LLVM
+Exceptions. See /LICENSE for license information.
+SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+"""
+
+import json
+import os
+import re
+import shutil
+import subprocess
+import sys
+from pathlib import Path
+
+# Change the working directory to the repository root so that the remaining
+# operations reliably operate relative to that root.
+os.chdir(Path(__file__).parent.parent)
+directory = Path.cwd()
+
+# We use the `BAZEL` environment variable if present. If not, then we try to
+# use `bazelisk` and then `bazel`.
+bazel = os.environ.get("BAZEL")
+if not bazel:
+    bazel = "bazelisk"
+    if not shutil.which(bazel):
+        bazel = "bazel"
+        if not shutil.which(bazel):
+            sys.exit("Unable to run Bazel")
+
+# Load compiler flags. We do this first in order to fail fast if not run from
+# the workspace root.
+print("Reading the arguments to use...")
+try:
+    with open("compile_flags.txt") as flag_file:
+        arguments = [line.strip() for line in flag_file]
+except FileNotFoundError:
+    sys.exit(Path(sys.argv[0]).name + " must be run from the project root")
+
+# Prepend the `clang` executable path to the arguments that looks into our
+# downloaded Clang toolchain.
+arguments = [str(Path("bazel-clang-toolchain/bin/clang"))] + arguments
+
+print("Building compilation database...")
+
+# Find all of the C++ source files that we expect to compile cleanly as
+# stand-alone files. This is a bit simpler than scraping the actual compile
+# actions and allows us to directly index header-only libraries easily and
+# pro-actively index the specific headers in the project.
+source_files_query = subprocess.run(
+    [
+        bazel,
+        "query",
+        "--keep_going",
+        "--output=location",
+        'filter(".*\\.(h|cpp|cc|c|cxx)$", kind("source file", deps(//...)))',
+    ],
+    check=True,
+    stdout=subprocess.PIPE,
+    stderr=subprocess.DEVNULL,
+    universal_newlines=True,
+).stdout
+source_files = [
+    Path(line.split(":")[0]) for line in source_files_query.splitlines()
+]
+
+# Filter into the Carbon source files that we'll find directly in the
+# workspace, and LLVM source files that need to be mapped through the merged
+# LLVM tree in Bazel's execution root.
+carbon_files = [
+    f.relative_to(directory)
+    for f in source_files
+    if f.parts[: len(directory.parts)] == directory.parts
+]
+llvm_files = [
+    Path("bazel-execroot/external").joinpath(
+        *f.parts[f.parts.index("llvm-project") :]
+    )
+    for f in source_files
+    if "llvm-project" in f.parts
+]
+print(
+    "Found %d Carbon source files and %d LLVM source files..."
+    % (len(carbon_files), len(llvm_files))
+)
+
+# Now collect the generated file labels.
+generated_file_labels = subprocess.run(
+    [
+        bazel,
+        "query",
+        "--keep_going",
+        "--output=label",
+        (
+            'filter(".*\\.(h|cpp|cc|c|cxx|def|inc)$",'
+            'kind("generated file", deps(//...)))'
+        ),
+    ],
+    check=True,
+    stdout=subprocess.PIPE,
+    stderr=subprocess.DEVNULL,
+    universal_newlines=True,
+).stdout.splitlines()
+print("Found %d generated files..." % (len(generated_file_labels),))
+
+# Directly build these labels so that indexing can find them. Allow this to
+# fail in case there are build errors in the client, and just warn the user
+# that they may be missing generated files.
+print("Building the generated files so that tools can find them...")
+subprocess.run([bazel, "build", "--keep_going"] + generated_file_labels)
+
+
+# Manually translate the label to a user friendly path into the Bazel output
+# symlinks.
+def _label_to_path(s):
+    # Map external repositories to their part of the output tree.
+    s = re.sub(r"^@([^/]+)//", r"bazel-bin/external/\1/", s)
+    # Map this repository to the root of the output tree.
+    s = s if not s.startswith("//") else "bazel-bin/" + s[len("//") :]
+    # Replace the colon used to mark the package name with a slash.
+    s = s.replace(":", "/")
+    # Convert to a native path.
+    return Path(s)
+
+
+generated_files = [_label_to_path(label) for label in generated_file_labels]
+
+# Generate compile_commands.json with an entry for each C++ input.
+entries = [
+    {
+        "directory": str(directory),
+        "file": str(f),
+        "arguments": arguments + [str(f)],
+    }
+    for f in carbon_files + llvm_files + generated_files
+]
+with open("compile_commands.json", "w") as json_file:
+    json.dump(entries, json_file, indent=2)

+ 2 - 1
setup.cfg

@@ -5,6 +5,7 @@
 [flake8]
 max-line-length = 80
 exclude = website/jekyll/build
+# E203: This warning is not PEP 8 compliant.
 # E402: Allow the pythonpath modifications before repo-local imports.
 # W503: flake8 v3.8.4 is inconsistent with black v20.8b1 (pre-commit run -a).
-ignore = E402,W503
+ignore = E203,E402,W503
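
The E203 exemption is presumably needed because `black` puts a space before the `:` in slices whose bounds are non-trivial expressions, as in the new script, and flake8 reports that as "whitespace before ':'". A minimal sketch of the two spellings, using a made-up label:

```python
label = "//common:check.h"
prefix = "//"

# black's preferred spelling for a slice with a complex lower bound.
spaced = label[len(prefix) :]
# The spelling that E203 would otherwise demand.
compact = label[len(prefix):]

print(spaced == compact, spaced)
```

Both spellings are the same expression, so ignoring E203 only affects style checking.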