Explorer structured fuzzer

Overview

Fuzz testing generates a large number of random inputs for a software component in order to trigger bugs and unexpected behavior. Basic fuzzing uses randomly generated byte arrays as inputs. This works well for some applications but is problematic for logic that operates on highly structured data, because most random inputs are rejected as invalid before any interesting part of the code gets a chance to run.

Structured fuzzing addresses this issue by ensuring the randomly generated data is itself structured, and therefore has a high chance of being a valid input.
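
As a toy illustration of the difference (not explorer code), the Python sketch below compares how often a tiny parser accepts unstructured random text versus inputs generated from its grammar. The parser `parse_sum`, both generators, and `acceptance_rate` are hypothetical names invented for this example.

```python
import random


def parse_sum(text: str) -> int:
    """Toy structured input: decimal numbers joined by '+', e.g. '1+2+3'."""
    if not text:
        raise ValueError("empty input")
    total = 0
    for term in text.split("+"):
        if not (term.isascii() and term.isdigit()):
            raise ValueError(f"bad term: {term!r}")
        total += int(term)
    return total


def random_text_input(rng: random.Random) -> str:
    # Unstructured fuzzing: arbitrary printable ASCII characters.
    length = rng.randint(1, 8)
    return "".join(chr(rng.randint(32, 126)) for _ in range(length))


def structured_input(rng: random.Random) -> str:
    # Structured fuzzing: randomize only within the grammar.
    terms = [str(rng.randint(0, 999)) for _ in range(rng.randint(1, 4))]
    return "+".join(terms)


def acceptance_rate(generator, trials: int = 1000, seed: int = 0) -> float:
    """Fraction of generated inputs that the parser accepts."""
    rng = random.Random(seed)
    accepted = 0
    for _ in range(trials):
        try:
            parse_sum(generator(rng))
            accepted += 1
        except ValueError:
            pass
    return accepted / trials
```

Every structured input gets past the parser, while almost all of the unstructured ones are rejected at the first invalid character.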

explorer_fuzzer is a structured fuzzer based on libprotobuf-mutator, a library that randomly mutates protocol buffers.

The input to the fuzzer is an instance of the Carbon::Fuzzing::Carbon proto, randomly generated by the libprotobuf-mutator framework. explorer_fuzzer converts the proto to a Carbon source code string, then tries to parse and execute the code using the explorer implementation.

Fuzzer data format

libprotobuf-mutator supports fuzzer inputs in either text or binary protocol buffer format. explorer_fuzzer uses the text proto format, with the Carbon proto message defined in common/fuzzing/carbon.proto.

Incorporating AST changes into the fuzzer

The fuzzer's AST representation in carbon.proto needs to be updated when changes are made to the AST, such as adding new AST node classes or changing relevant data members of existing nodes.

There are two unit tests that normally should not require direct changes, as both work off of the Carbon test files in testdata.

  • ast_to_proto_test.cpp is a 'smoke' test which verifies that each field of the Carbon proto is populated at least once after converting all of the test Carbon files and merging the results into a single protocol buffer.

  • proto_to_carbon_test.cpp uses a 'roundtrip' approach: it converts each parseable Carbon file to a proto representation, then back to Carbon source, parses this source into a second AST, and compares the second AST with the original using the AST::Dump() method. The goal of the test is to ensure that carbon.proto can represent ASTs without information loss.
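
The roundtrip idea can be sketched on a toy expression language. This is only an analogy under stated assumptions: `IntLiteral`, `Add`, `ast_to_proto`, `proto_to_source`, `parse`, and `dump` are hypothetical stand-ins written for this example. The real test works with Carbon ASTs and protobuf messages, and compares the output of AST::Dump().

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class IntLiteral:
    value: int  # non-negative in this toy language


@dataclass(frozen=True)
class Add:
    lhs: "Expr"
    rhs: "Expr"


Expr = Union[IntLiteral, Add]


def ast_to_proto(node: Expr) -> dict:
    # A nested dict stands in for the protobuf message.
    if isinstance(node, IntLiteral):
        return {"int_literal": {"value": node.value}}
    return {"add": {"lhs": ast_to_proto(node.lhs),
                    "rhs": ast_to_proto(node.rhs)}}


def proto_to_source(proto: dict) -> str:
    # Print the "proto" back as fully parenthesized source text.
    if "int_literal" in proto:
        return str(proto["int_literal"]["value"])
    add = proto["add"]
    return f"({proto_to_source(add['lhs'])} + {proto_to_source(add['rhs'])})"


def _parse_at(source: str, pos: int):
    # Recursive descent over the exact syntax emitted by proto_to_source.
    if pos < len(source) and source[pos] == "(":
        lhs, pos = _parse_at(source, pos + 1)
        if source[pos:pos + 3] != " + ":
            raise ValueError(f"expected ' + ' at offset {pos}")
        rhs, pos = _parse_at(source, pos + 3)
        if source[pos:pos + 1] != ")":
            raise ValueError(f"expected ')' at offset {pos}")
        return Add(lhs, rhs), pos + 1
    end = pos
    while end < len(source) and source[end].isdigit():
        end += 1
    if end == pos:
        raise ValueError(f"expected a number at offset {pos}")
    return IntLiteral(int(source[pos:end])), end


def parse(source: str) -> Expr:
    node, pos = _parse_at(source, 0)
    if pos != len(source):
        raise ValueError("trailing characters after expression")
    return node


def dump(node: Expr) -> str:
    # Plays the role of a textual AST dump used for comparison.
    return repr(node)


def roundtrips(node: Expr) -> bool:
    source = proto_to_source(ast_to_proto(node))
    return dump(parse(source)) == dump(node)
```

If a field were dropped in `ast_to_proto` or printed incorrectly in `proto_to_source`, the two dumps would differ and `roundtrips` would return False. That is the kind of information loss the roundtrip test is meant to catch.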

To incorporate AST changes into fuzzing logic:

  1. Add appropriate AST information to carbon.proto. Use existing similar cases as examples.

  2. In ast_to_proto.cpp, add logic to populate the new parts of the proto.

  3. Make sure ast_to_proto_test passes with the new changes.

  4. Modify proto_to_carbon.cpp, which handles printing a Carbon proto instance as a Carbon source string. For example, add code to print newly introduced proto fields.

  5. Make sure proto_to_carbon_test passes after the changes.

Running the fuzzer

The fuzzer can be run in 'unit test' mode, where it executes on each input file from the fuzzer_corpus/ folder, or in 'fuzzing' mode, where it keeps generating random inputs and executing the logic on them until a crash is triggered, or forever in a bug-free program ;).

To run in 'unit test' mode:

bazel test --config=proto-fuzzer --test_output=all //explorer/fuzzing:explorer_fuzzer

To run in 'fuzzing' mode:

bazel build --config=proto-fuzzer //explorer/fuzzing:explorer_fuzzer

bazel-bin/explorer/fuzzing/explorer_fuzzer

It's also possible to run the fuzzer on a single input:

bazel-bin/explorer/fuzzing/explorer_fuzzer /tmp/crash.textproto
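
When triaging several saved inputs at once, the single-input invocation above can be wrapped in a small driver. The following is a minimal sketch, not part of the repository. It assumes the fuzzer binary has already been built at the path shown above, and that a non-zero exit code indicates a crash or failure (an assumption, not something this README states). `fuzzer_command` and `find_failing_inputs` are hypothetical helper names.

```python
import subprocess

# Path taken from the build command above; adjust if your output tree differs.
FUZZER = "bazel-bin/explorer/fuzzing/explorer_fuzzer"


def fuzzer_command(input_file, fuzzer=FUZZER):
    """Build the argv for running the fuzzer on one text proto input."""
    return [fuzzer, str(input_file)]


def find_failing_inputs(inputs, fuzzer=FUZZER, run=subprocess.run):
    """Run the fuzzer on each input and return those that exit non-zero.

    Assumption: a non-zero return code means the input triggered a failure.
    `run` is injectable so the logic can be exercised without the binary.
    """
    failing = []
    for path in inputs:
        result = run(fuzzer_command(path, fuzzer))
        if result.returncode != 0:
            failing.append(str(path))
    return failing
```

For example, `find_failing_inputs(["/tmp/crash.textproto"])` returns a list containing that path if the run fails, and an empty list otherwise.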

Investigating a crash

To reproduce a crash, run the fuzzer on the crashing input as described above.

A separate tool, fuzzverter, can convert a crashing input to Carbon source code so that explorer can be run on the code directly.

To convert a Fuzzing::Carbon text proto to Carbon source:

bazel-bin/explorer/fuzzing/fuzzverter --mode proto_to_carbon --input /tmp/crash.textproto

Generating new fuzzer corpus entries

The ability of the fuzzing framework to generate 'interesting' inputs can be improved by providing 'seed' inputs known as the fuzzer corpus. Each input needs to be a Fuzzing::Carbon text proto.

To generate a text proto from Carbon source:

bazel-bin/explorer/fuzzing/fuzzverter --mode carbon_to_proto --input /tmp/crash.carbon --output /tmp/crash.textproto
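
To seed the corpus from many Carbon files, the same command can be applied in a loop. The sketch below is illustrative only and is not the repository's regen_corpus.py script. It uses only the fuzzverter flags shown above. The output naming scheme (`<stem>.textproto`) and the helper names are assumptions made for this example, as is treating a zero exit code as success.

```python
import subprocess
from pathlib import Path

# Path taken from the command above; assumes fuzzverter has been built.
FUZZVERTER = "bazel-bin/explorer/fuzzing/fuzzverter"


def carbon_to_proto_command(carbon_file, output_file, fuzzverter=FUZZVERTER):
    """Build the argv for converting one Carbon source file to a text proto."""
    return [
        fuzzverter,
        "--mode", "carbon_to_proto",
        "--input", str(carbon_file),
        "--output", str(output_file),
    ]


def output_path_for(carbon_file, corpus_dir):
    # Hypothetical naming scheme: reuse the source file's stem.
    return Path(corpus_dir) / (Path(carbon_file).stem + ".textproto")


def generate_corpus(carbon_files, corpus_dir, run=subprocess.run):
    """Convert each Carbon file and return the output paths that succeeded.

    Assumption: fuzzverter exits with 0 on success. Files that fail to
    convert (for example, unparseable Carbon) are skipped.
    """
    written = []
    for carbon_file in carbon_files:
        out = output_path_for(carbon_file, corpus_dir)
        result = run(carbon_to_proto_command(carbon_file, out))
        if result.returncode == 0:
            written.append(out)
    return written
```

Because two source files with the same stem would map to the same output file under this naming scheme, a real script would likely need a collision-free naming strategy.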