FairFuzz: AFL extension targeting rare branches

Related tags

Deep Learning afl-rb
Overview

FairFuzz

An AFL extension to increase code coverage by targeting rare branches. FairFuzz has a particular advantage on programs with highly nested structure (packet analyzers, xmllint, programs compiled with laf-inte, etc). AFL is written and maintained by Michal Zalewski [email protected]; FairFuzz extension by Caroline Lemieux [email protected].

Our paper on FairFuzz was published in ASE 2018.

Summary

This is a modified version of AFL which changes (1) the way in which AFL selects input for mutation and (2) the way in which AFL mutates inputs to target rare "branches" in the program under test. AFL keeps track of program coverage an input achieves by counting the number of times the input hits a Basic Block Transition (see AFL technical details for more detail). This basic block transition can be loosely associated with a branch in the control flow graph, thus we call refer to these transitions as branches.

On many benchmarks, this modification achieves faster branch coverage than AFL or AFLFast. The advantage seems particularly strong on programs structured with many nested conditional statements. The graphs below show the number of basic block transitions covered over 24 hours on 9 different benchmarks. Each run was repeated 20 times (one fuzzer only): the dark middle line is the average coverage achieved, and the bands represent 95% confidence intervals.

24 hour runs on benchmarks

Evaluation was conducted on on AFL 2.40b. All runs were done with no AFL dictionary: the emphasis on rare branches appears to yield better automated discovery of special sequences. On the tcpdump and xmllint benchmarks, FairFuzz was able to discover rare sequences over the 20 runs, which the other techniques did not discover. Over these 20 techniques each different technique, the number of runs finding portions of the sequence <!ATTLIST after 24 hours was:

subsequence AFL FidgetyAFL AFLFast.new FairFuzz
<!A 7 15 18 17
<!AT 1 2 3 11
<!ATT 1 0 0 1

More details in article.

Technical Summary

What is a rare branch?

We consider a branch to be rare if it is hit by fewer inputs than the other branches explored so far. We maintain a dynamic threshold for rarity: if a branch is hit by fewer inputs than (or exactly as many as) this threshold, it is rare. Precisely, the threshold is the lowest power of two bounding the number of inputs hitting the least-hit branch. For example, if the branch hit by the fewest inputs is hit by 19 inputs, the threshold will be 32, so any branch hit by ≤32 inputs will be rare. Once the rarest branch is hit by 33 inputs, the threshold will go up to 64.

To keep track of rarity information, after running every input, we increment a hit count for each branch.

How are inputs hitting rare branches selected?

When going through the queue to select inputs to mutate, FairFuzz selects inputs only if -- at the time of selection -- they hit a rare branch. The rare branch an input hits is selected as the target branch. If an input hits multiple rare branches, FairFuzz selects the rarest one (the one hit by the fewest inputs) as the target branch. If an input hits no rare branches, it is skipped (see the -q option below).

How are mutations targeted to a branch?

Once an input is selected from the queue, along with its designated target branch, FairFuzz computes a branch mask for the target branch. For every position in the input, the branch mask designates whether

  1. the byte at that position can be overwritten (overwritable),
  2. the byte at that position can be deleted (deleteable), or
  3. a byte can be inserted at that position (insertable),

while still hitting the target branch. FairFuzz computes an approximation to this branch mask dynamically, in a way similar to AFL's effector map. FairFuzz goes through the input, and at each position flips the byte, deletes the byte, or inserts a byte, then runs the mutated input to check whether it hits the target branch. If the mutated input hits the target branch, the position is marked as overwritable, deleteable, or insertable, respectively.

The branch mask is then used to influence mutations as follows:

  1. In the deterministic stages1, a mutation is only performed at a location if the branch mask designates the position as overwritable (or insertable, for dictionary element insert). For multi-byte modifications, a modification is only performed if all bytes are overwritable. The mask is used in conjunction with the effector map when the effector map is used.
  2. In the havoc stage, positions for mutations are randomly selected within the modifiable positions of the branch mask. For example, if bytes 5-10 and 13-20 in a 20-byte input can be overwritten without failing to hit the target branch, when randomly selecting a byte to randomly mutate, FairFuzz will randomly select a position in [5,6,7,8,9,10,13,...,20] to mutate. After a havoc mutation that deletes (resp. adds) a sequence of bytes in the input, FairFuzz deletes the sequence (resp. adds an "all modifiable" sequence) at the corresponding location in the branch mask.2 The branch mask becomes approximate after this point.

1 The mask is not used in the bitflipping stage since this would interfere with AFL's automatic detection of dictionary elements.

2 The mask is not used to determine where to splice inputs in the splicing stage: during splicing, the first part of the branch mask is kept, but the spliced half of the input is marked as all modifiable.

Usage summary

For basic AFL usage, see the README in docs/ (the one with no .md extension). There are four FairFuzz Rare Branches-specific options:

Running options (may be useful for functionality):

  • -r adds an additional trimming stage before mutating inputs. This trimming is more aggressive than AFL's, trimming the input down only according to the target branch -- the resulting trimmed input may have a different path than the original input, but will still hit the target branch. Recommended for use if there are very large seed files and deterministic mutations are being run.
  • -q num bootstraps the rare branch input selection from the queue with the regular AFL input selection mechanism. If after an entire cycle through the queue, no new branches are discovered, bootstrap according to num as follows:
    • -q 1 go back to regular AFL queueing + no branch mask on mutations until a new branch is found
    • -q 2 go back to regular AFL queueing + no branch mask on mutations + no deterministic mutations until a new branch is found
    • -q 3 go back to regular AFL queueing + no branch mask on mutations for a single queueing cycle

Evaluation options (mostly useful for comparing AFL versions):

  • -b disables the branch mask. (sets every position in the mask as modifiable -- will incur unnecessary slowdown compared to AFL)
  • -s runs a "shadow" mutation run before the branch-mask enabled run. Side effects are disabled in this run. This allows for direct comparison of the effect of the branch mask on new coverage discovered/number of inputs hitting the target branch. See min-branch-fuzzing.log produced in the AFL output directory for details.
Comments
  • Decrease of the own finds counter

    Decrease of the own finds counter

    I just noticed the own finds counter decreasing several times (e.g. jumping from about 34 to 20) while fuzzing with FairFuzz. I can't offhand remember whether that can ever happen in vanilla AFL (nor why it would make sense). I don't have time to investigate now, but I don't remember ever seeing it before, so it might be a FairFuzz bug.

    Update: Actually, I just remembered I was running it with the -s option, which sounds like a possible culprit.

    opened by dkasak 4
  • Is there a way to run traditional AFL with afl-rb?

    Is there a way to run traditional AFL with afl-rb?

    I'm trying to benchmark some fuzzing runs with afl-rb, and traditional AFL (with the time penalty incurred by the branching code). As such, I was wondering if there was a way to get standard AFL behavior with AFL-rb.

    It seems to me the -b option just turns off the masking... is there a feature that turns off the seed selection from the queue? If not, where might I add that code?

    Thanks!

    opened by siddk 3
  • Full blacklist

    Full blacklist

    I managed to stumble onto Too many things on the blacklist while fuzzing. I can see from the comment in the code that you're already aware of it, but I wasn't sure whether you've ever managed to hit it in practice, so just letting you know.

    opened by dkasak 1
  • Some memory allocation changes.

    Some memory allocation changes.

    • Prefer realloc to free + malloc and check the returned pointer for failure.
    • Prefer the ck_* family of memory management functions to the ones in the standard library.
    • Use syscalls for reading and writing files instead of stdio.h equivalents for consistency.
    • Fix bug due to ck_free calls being duplicated.
    opened by dkasak 0
  • Fix potential use of uninitialized variable.

    Fix potential use of uninitialized variable.

    It appears in some cases control might jump to the end of fuzz_one via the abandon_entry label, where it gets freed, prior to position_map initialization. Setting it to NULL turns this into a harmless free(NULL). Might manifest as a double free in some cases, probably due to reuse of old stack space in a subsequent call.

    I catched this by compiling afl with ASAN, which gave the following report when it crashed:

    ==8432==ERROR: AddressSanitizer: attempting double-free on 0x9951d440 in thread T0:
        #0 0x56705605 in __interceptor_free.localalias.0 (/home/dkasak/code/projects/afl-rb/afl-fuzz+0xe6605)
        #1 0x5676f5d5 in fuzz_one /home/dkasak/code/projects/afl-rb/afl-fuzz.c:7595:3
        #2 0x5674631b in main /home/dkasak/code/projects/afl-rb/afl-fuzz.c:9068:20
        #3 0xf74157c2 in __libc_start_main (/usr/lib32/libc.so.6+0x187c2)
        #4 0x5663e0a5 in _start (/home/dkasak/code/projects/afl-rb/afl-fuzz+0x1f0a5)
    
    0x9951d440 is located 0 bytes inside of 448-byte region [0x9951d440,0x9951d600)
    freed by thread T0 here:
        #0 0x56705605 in __interceptor_free.localalias.0 (/home/dkasak/code/projects/afl-rb/afl-fuzz+0xe6605)
        #1 0x5676f5d5 in fuzz_one /home/dkasak/code/projects/afl-rb/afl-fuzz.c:7595:3
        #2 0x5674631b in main /home/dkasak/code/projects/afl-rb/afl-fuzz.c:9068:20
        #3 0xf74157c2 in __libc_start_main (/usr/lib32/libc.so.6+0x187c2)
    
    previously allocated by thread T0 here:
        #0 0x567057cd in malloc (/home/dkasak/code/projects/afl-rb/afl-fuzz+0xe67cd)
        #1 0x5676dd09 in fuzz_one /home/dkasak/code/projects/afl-rb/afl-fuzz.c:7420:22
        #2 0x5674631b in main /home/dkasak/code/projects/afl-rb/afl-fuzz.c:9068:20
        #3 0xf74157c2 in __libc_start_main (/usr/lib32/libc.so.6+0x187c2)
    
    opened by dkasak 0
  • Relax exclude list condition

    Relax exclude list condition

    Currently a branch is added to the exclude list (blacklist) if the branch mask marks it as completely unmodifiable. Fuzzing continues since we allow insertion at the end of the input regardless. This is contrary to what was documented in the paper.

    In a toy program I was testing on this meant that some branches were incorrectly added to the exclude list (say none of the points of the input could be modified without changing the branch, i.e. if delta-debugging was done successfully), when they were in fact hittable by modification.

    Consider changing this to be what was described in the paper: add to exclude list only if NONE of the mutated inputs hit the branch in question.

    opened by carolemieux 0
  • Fix stalling when no inputs hit rare branch of interest

    Fix stalling when no inputs hit rare branch of interest

    One one run of xmllint, the rare branches queueing mechanism completely stalled, as if no inputs hit the rare branch of interest. This is strange since every input which hits a new branch is saved to the queue (increment_hit_bits is only called in save_if_interesting). Need to reproduce + investigate.

    opened by carolemieux 3
Owner
Caroline Lemieux
Caroline Lemieux
Syntax-Aware Action Targeting for Video Captioning

Syntax-Aware Action Targeting for Video Captioning Code for SAAT from "Syntax-Aware Action Targeting for Video Captioning" (Accepted to CVPR 2020). Th

null 59 Oct 13, 2022
SCI-AIDE : High-fidelity Few-shot Histopathology Image Synthesis for Rare Cancer Diagnosis

SCI-AIDE : High-fidelity Few-shot Histopathology Image Synthesis for Rare Cancer Diagnosis Pretrained Models In this work, we created synthetic tissue

Emirhan Kurtuluş 1 Feb 7, 2022
An AFL implementation with UnTracer (our coverage-guided tracer)

UnTracer-AFL This repository contains an implementation of our prototype coverage-guided tracing framework UnTracer in the popular coverage-guided fuz

null 113 Dec 17, 2022
FIRM-AFL is the first high-throughput greybox fuzzer for IoT firmware.

FIRM-AFL FIRM-AFL is the first high-throughput greybox fuzzer for IoT firmware. FIRM-AFL addresses two fundamental problems in IoT fuzzing. First, it

null 356 Dec 23, 2022
Fuzzing the Kernel Using Unicornafl and AFL++

Unicorefuzz Fuzzing the Kernel using UnicornAFL and AFL++. For details, skim through the WOOT paper or watch this talk at CCCamp19. Is it any good? ye

Security in Telecommunications 283 Dec 26, 2022
Driller: augmenting AFL with symbolic execution!

Driller Driller is an implementation of the driller paper. This implementation was built on top of AFL with angr being used as a symbolic tracer. Dril

Shellphish 791 Jan 6, 2023
IJON is an annotation mechanism that analysts can use to guide fuzzers such as AFL.

IJON SPACE EXPLORER IJON is an annotation mechanism that analysts can use to guide fuzzers such as AFL. Using only a small (usually one line) annotati

Chair for Sys­tems Se­cu­ri­ty 146 Dec 16, 2022
Directed Greybox Fuzzing with AFL

AFLGo: Directed Greybox Fuzzing AFLGo is an extension of American Fuzzy Lop (AFL). Given a set of target locations (e.g., folder/file.c:582), AFLGo ge

null 380 Nov 24, 2022
MOpt-AFL provided by the paper "MOPT: Optimized Mutation Scheduling for Fuzzers"

MOpt-AFL 1. Description MOpt-AFL is a AFL-based fuzzer that utilizes a customized Particle Swarm Optimization (PSO) algorithm to find the optimal sele

null 172 Dec 18, 2022
AFLFast (extends AFL with Power Schedules)

AFLFast Power schedules implemented by Marcel Böhme <[email protected]>. AFLFast is an extension of AFL which is written and maintained by Michal

Marcel Böhme 380 Jan 3, 2023
AFL binary instrumentation

E9AFL --- Binary AFL E9AFL inserts American Fuzzy Lop (AFL) instrumentation into x86_64 Linux binaries. This allows binaries to be fuzzed without the

null 242 Dec 12, 2022
A Temporal Extension Library for PyTorch Geometric

Documentation | External Resources | Datasets PyTorch Geometric Temporal is a temporal (dynamic) extension library for PyTorch Geometric. The library

Benedek Rozemberczki 1.9k Jan 7, 2023
Little Ball of Fur - A graph sampling extension library for NetworKit and NetworkX (CIKM 2020)

Little Ball of Fur is a graph sampling extension library for Python. Please look at the Documentation, relevant Paper, Promo video and External Resour

Benedek Rozemberczki 619 Dec 14, 2022
A library of extension and helper modules for Python's data analysis and machine learning libraries.

Mlxtend (machine learning extensions) is a Python library of useful tools for the day-to-day data science tasks. Sebastian Raschka 2014-2020 Links Doc

Sebastian Raschka 4.2k Jan 2, 2023
Geometric Deep Learning Extension Library for PyTorch

Documentation | Paper | Colab Notebooks | External Resources | OGB Examples PyTorch Geometric (PyG) is a geometric deep learning extension library for

Matthias Fey 16.5k Jan 8, 2023
A PyTorch Extension: Tools for easy mixed precision and distributed training in Pytorch

This repository holds NVIDIA-maintained utilities to streamline mixed precision and distributed training in Pytorch. Some of the code here will be included in upstream Pytorch eventually. The intention of Apex is to make up-to-date utilities available to users as quickly as possible.

NVIDIA Corporation 6.9k Jan 3, 2023
Pytorch Implementation of DiffSinger: Diffusion Acoustic Model for Singing Voice Synthesis (TTS Extension)

DiffSinger - PyTorch Implementation PyTorch implementation of DiffSinger: Diffusion Acoustic Model for Singing Voice Synthesis (TTS Extension). Status

Keon Lee 152 Jan 2, 2023
DeepProbLog is an extension of ProbLog that integrates Probabilistic Logic Programming with deep learning by introducing the neural predicate.

DeepProbLog DeepProbLog is an extension of ProbLog that integrates Probabilistic Logic Programming with deep learning by introducing the neural predic

KU Leuven Machine Learning Research Group 94 Dec 18, 2022