# Executable semantics structured fuzzer

## Overview

Fuzz testing generates a large number of random inputs for a software component in order to trigger bugs and unexpected behavior. Basic fuzzing uses randomly generated byte arrays as inputs. This works well for some applications, but it is a poor fit for testing logic that operates on highly structured data: most random inputs are rejected as invalid before any interesting part of the code gets a chance to run.

Structured fuzzing addresses this issue by ensuring that the randomly generated data is itself structured, so it has a high chance of being a valid input.

`executable_semantics_fuzzer` is a structured fuzzer based on [libprotobuf-mutator](https://github.com/google/libprotobuf-mutator), a library that randomly mutates [protocol buffers](https://github.com/protocolbuffers/protobuf).

The input to the fuzzer is an instance of the `Carbon::Fuzzing::Carbon` proto, randomly generated by the `libprotobuf-mutator` framework. `executable_semantics_fuzzer` converts the proto to a Carbon source code string, then tries to parse and execute the code using the `executable_semantics` implementation.

## Fuzzer data format

`libprotobuf-mutator` supports fuzzer inputs in either text or binary protocol buffer format. `executable_semantics_fuzzer` uses the text proto format, with the `Carbon` proto message defined in `common/fuzzing/carbon.proto`.

## Running the fuzzer

The fuzzer can be run in 'unit test' mode, where it executes on each input file from the `fuzzer_corpus/` folder, or in 'fuzzing' mode, where it keeps generating random inputs and executing the logic on them until a crash is triggered, or forever in a bug-free program ;).

To run in 'unit test' mode:

```bash
bazel test --config=proto-fuzzer --test_output=all //executable_semantics/fuzzing:executable_semantics_fuzzer
```

To run in 'fuzzing' mode:

```bash
bazel build --config=proto-fuzzer //executable_semantics/fuzzing:executable_semantics_fuzzer
bazel-bin/executable_semantics/fuzzing/executable_semantics_fuzzer
```

It's also possible to run the fuzzer on a single input:

```bash
bazel-bin/executable_semantics/fuzzing/executable_semantics_fuzzer /tmp/crash.textproto
```

## Investigating a crash

To reproduce a crash, run the fuzzer on the crashing input as described above.

A separate tool called `fuzzverter` can be used for tasks such as converting a crashing input to Carbon source code, so that `executable_semantics` can be run on the code directly.

To convert a `Fuzzing::Carbon` text proto to Carbon source:

```bash
bazel-bin/executable_semantics/fuzzing/fuzzverter --mode proto_to_carbon --input /tmp/crash.textproto
```

## Generating new fuzzer corpus entries

The ability of the fuzzing framework to generate 'interesting' inputs can be improved by providing 'seed' inputs, known as the fuzzer corpus. Each input needs to be a `Fuzzing::Carbon` text proto.

To generate a text proto from Carbon source:

```bash
bazel-bin/executable_semantics/fuzzing/fuzzverter --mode carbon_to_proto --input /tmp/crash.carbon --output /tmp/crash.textproto
```
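
When seeding the corpus from several Carbon files at once, the single-file command above can be wrapped in a loop. The following is a minimal sketch, not part of the repository: the `convert_corpus` helper and the `FUZZVERTER` variable are hypothetical names introduced here, and the sketch assumes `fuzzverter` has already been built and exits with a non-zero status when a conversion fails.

```bash
#!/usr/bin/env bash
# Sketch: batch-convert Carbon source files into text proto corpus entries.

# Path to the fuzzverter binary (assumed to be built already).
# FUZZVERTER is a hypothetical override variable, not something the tool reads.
FUZZVERTER="${FUZZVERTER:-bazel-bin/executable_semantics/fuzzing/fuzzverter}"

# convert_corpus SRC_DIR OUT_DIR  (hypothetical helper)
# Writes OUT_DIR/<name>.textproto for every SRC_DIR/<name>.carbon.
convert_corpus() {
  local src_dir="$1" out_dir="$2" src name
  mkdir -p "$out_dir"
  for src in "$src_dir"/*.carbon; do
    # Skip the unexpanded glob when the directory has no .carbon files.
    [ -e "$src" ] || continue
    name="$(basename "$src" .carbon)"
    # Same flags as the single-file command above.
    if ! "$FUZZVERTER" --mode carbon_to_proto --input "$src" \
        --output "$out_dir/$name.textproto"; then
      # Assumption: a non-zero exit status means the conversion failed.
      echo "conversion failed, skipping: $src" >&2
    fi
  done
}
```

For example, `convert_corpus /tmp/carbon_seeds /tmp/new_corpus` (both paths are placeholders) would produce one text proto per Carbon file, which can then be reviewed and copied into the `fuzzer_corpus/` folder.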