HypoFuzz: adaptive fuzzing of property-based test suites

This is my primary research project at the moment, with a side goal of commercialisation. My initial literature review is at https://hypofuzz.com/docs/literature.html, though it is not yet in paper format. The project is large enough that I’m planning to split out part of the evaluation - assessing the contribution of various techniques to fuzzing performance in this structured setting - into a follow-on paper :doc:`augumenting`.

Abstract

Fuzzing has been terrifyingly effective in many domains, and has seen significant adoption among researchers and developers of security-critical native code. However, fuzzing is almost unknown among developers using higher-level languages like Python, perhaps due to the ‘impedance mismatch’ between bytestring-oriented greybox fuzzers and the structured input required by interpreted languages.

To bridge this gap, I present HypoFuzz: a hybrid fuzzer extending the well-known Hypothesis library for property-based testing. HypoFuzz is designed to require no knowledge of or expertise in fuzzing, and integrates seamlessly into standard testing workflows for Python - including reproducing minimal, deduplicated failing examples.

I discuss the constraints and opportunities afforded by the highly structured setting of fuzzing property-based tests, and evaluate the user experience impact of XXXXXXXXX.


Note

The fragments below are not organised into a draft; they’re just the chunks of prose that were in my notes and not moved to the literature review at https://hypofuzz.com/docs/literature.html

For a detailed evaluation of how HypoFuzz performance is shaped by choice of techniques for seed selection, scheduling, mutation operators, coverage metrics, ensembling, additional feedbacks, and targeted fuzzing, see XXXXXXX the other paper.

Design notes for the best Python fuzzer

Hypothesis is a state-of-the-art tool for property-based testing, and has seen very wide adoption throughout the Python ecosystem - in open source and industry, and applied to fields from data analysis to web development to cutting-edge scientific research.

Hypothesis is downloaded over a million times every month, and for several years running has been used by at least four percent of all users surveyed by the Python Software Foundation.

Empowering our user community with a world-class fuzzing workflow is a fantastic opportunity for real-world impact, but will take more than just writing powerful tools. Performance in expert hands is an important part of user experience, but so are installation, documentation, config complexity, tutorials, operating system support, integration with other tools, and so on. Fortunately, this plays to Hypothesis’ strengths as an open source community even if it is alien to most research projects!

This document describes:

  • Hypothesis as a library for writing fuzzer harnesses

  • A selected genealogy of modern fuzzing tools and concepts

  • Some options to improve development workflows

  • Proposed design, some implementation ideas, and open problems

For literature review, see https://hypofuzz.com/docs/literature.html

Some options to improve development workflows

  • fuzzing across very many targets - scheduling, compute allocation

  • adaptive scheduling for ensemble fuzzers

  • directing fuzzing effort at recent patches (i.e. exploiting VCS history)

  • integrated replay and shrinking with standard PBT workflow (done!)

  • sharing state / adaptive params across runs, for ‘live at head’ fuzzing

  • can we track a minimal corpus to go from cold start to adapted ASAP?

  • must be zero-configuration: ask users to suggest heuristics/conventions instead of configuring their instance.

  • crash-only software, thanks. Supervisor can kill-and-restart on each commit.
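
The scheduling and compute-allocation items above can be made concrete with a small sketch. This is not HypoFuzz's actual scheduler: it assumes a simple epsilon-greedy bandit in which each property-based test is a target, the reward is the number of newly covered branches in a time slice, and the fuzzing itself is simulated by a stand-in function. The names Target, run_slice and schedule are hypothetical.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Target:
    """One property-based test treated as a fuzz target (hypothetical model)."""
    name: str
    total_branches: int                       # branches that could ever be discovered
    seen: set = field(default_factory=set)    # branches covered so far
    slices_run: int = 0
    recent_reward: float = 0.0                # decayed average of new branches per slice


def run_slice(target, rng, inputs_per_slice=50):
    """Stand-in for fuzzing one target for one time slice.

    Each simulated input hits one random branch. Returns how many branches
    were covered for the first time during this slice.
    """
    before = len(target.seen)
    for _ in range(inputs_per_slice):
        target.seen.add(rng.randrange(target.total_branches))
    return len(target.seen) - before


def schedule(targets, budget, epsilon=0.1, decay=0.5, seed=0):
    """Spend `budget` time slices across `targets` with an epsilon-greedy rule."""
    rng = random.Random(seed)
    for _ in range(budget):
        untried = [t for t in targets if t.slices_run == 0]
        if untried:
            chosen = untried[0]               # run every target at least once
        elif rng.random() < epsilon:
            chosen = rng.choice(targets)      # explore: revisit any target
        else:
            # exploit: pick the target that has recently found the most new coverage
            chosen = max(targets, key=lambda t: t.recent_reward)
        reward = run_slice(chosen, rng)
        chosen.slices_run += 1
        chosen.recent_reward = decay * chosen.recent_reward + (1 - decay) * reward
    return targets


if __name__ == "__main__":
    demo = [Target("tiny_parser", 5), Target("big_codec", 2000)]
    for t in schedule(demo, budget=40):
        print(t.name, t.slices_run, len(t.seen))
```

Epsilon-greedy is only one option. The point is that targets still yielding new coverage receive most of the compute, while occasional exploration revisits targets that look saturated. A prior based on how recently each target's code changed could be folded into the same reward, which is one way the VCS-history idea above might plug in.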