Sound and Cost-effective Fuzzing of Stripped Binaries by Incremental and Stochastic Rewriting

Zhuo Zhang

Last update: Dec 5, 2022

Overview

StochFuzz: A New Solution for Binary-only Fuzzing

StochFuzz is a (probabilistically) sound and cost-effective fuzzing technique for stripped binaries. It is facilitated by a novel incremental and stochastic rewriting technique that is particularly suitable for binary-only fuzzing. Any AFL-based fuzzer that takes edge coverage (as defined by AFL) as runtime feedback can use StochFuzz to fuzz stripped binaries directly.

More data and the results of the experiments can be found here. Example cases of leveraging StochFuzz to improve advanced AFL-based fuzzers (AFL++ and Polyglot) can be found in system.md.

Clarifications

We adopt a different system design from the one described in the paper. Details can be found in system.md.
In the paper, when we refer to e9patch, we actually mean the binary-only fuzzing tool built upon e9patch, namely e9tool. Please refer to its website for more details.
StochFuzz provides sound rewriting for binaries without inlined data, and probabilistically sound rewriting for the rest.

Building StochFuzz

StochFuzz is built upon Keystone, Capstone, GLib, and libunwind.

These dependencies can be built with build.sh. If you are building StochFuzz in a clean container, make sure standard tools such as autoreconf and libtool are installed.

$ git clone https://github.com/ZhangZhuoSJTU/StochFuzz.git
$ cd StochFuzz
$ ./build.sh
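In a clean container it can help to check for the build tools before running build.sh. The following is a minimal sketch, not part of StochFuzz: the helper name missing_tools is hypothetical, and checking libtoolize as a stand-in for libtool assumes Debian/Ubuntu-style packaging, where the libtool package installs libtoolize.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with StochFuzz).
# Prints the name of every given tool that is not found on PATH,
# one per line; prints nothing if all of them are present.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Assumption: libtoolize is used as a stand-in for libtool.
# Adjust the list for your distribution.
missing=$(missing_tools autoreconf libtoolize)
if [ -n "$missing" ]; then
    echo "missing build tools:" $missing >&2
fi
```

If the script prints nothing, both tools were found on PATH.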

StochFuzz itself can be built by GNU Make.

$ cd src
$ make release

We have tested StochFuzz on Ubuntu 18.04. If you run into any issues when running StochFuzz on other systems, please let us know.

How to Use

StochFuzz provides multiple rewriting options, which follow AFL's style of passing arguments.

$ ./stoch-fuzz -h
stoch-fuzz 1.0.0 by <[email protected]>

./stoch-fuzz [ options ] -- target_binary [ ... ]

Mode settings:

  -S            - start a background daemon and wait for a fuzzer to attach (defualt mode)
  -R            - dry run target_binary with given arguments without an attached fuzzer
  -P            - patch target_binary without incremental rewriting
  -D            - probabilistic disassembly without rewriting
  -V            - show currently observed breakpoints

Rewriting settings:

  -g            - trace previous PC
  -c            - count the number of basic blocks with conflicting hash values
  -d            - disable instrumentation optimization
  -r            - assume the return addresses are only used by RET instructions
  -e            - install the fork server at the entrypoint instead of the main function
  -f            - forcedly assume there is data interleaving with code
  -i            - ignore the call-fallthrough edges to defense RET-misusing obfuscation

Other stuff:

  -h            - print this help
  -x execs      - set the number of executions after which a checking run will be triggered
                  set it as zero to disable checking runs (default: 200000)
  -t msec       - set the timeout for each daemon-triggering execution
                  set it as zero to ignore the timeout (default: 2000 ms)
  -l level      - set the log level, including INFO, WARN, ERROR, and FATAL (default: INFO)

Basic Usage

- It is worth first trying the advanced strategy (see below) because that is much more cost-effective.

To fuzz a stripped binary, say example.out, we need to cd to the directory that contains the target binary. For example, if the full path of example.out is /root/example.out, we need to first cd /root/. Furthermore, it is dangerous to run two StochFuzz instances in the same directory. These restrictions stem from design limitations, and we will try to relax them in the future.

Assuming StochFuzz is located at /root/StochFuzz/src/stoch-fuzz, execute the following command to start rewriting the target binary.

$ cd /root/
$ /root/StochFuzz/src/stoch-fuzz -- example.out # do not use ./example.out here
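
The directory restriction can be wrapped in a small helper that splits a full path into its directory and bare file name. This is a sketch only: rewrite_cmd and rewrite_target are hypothetical names, and the STOCHFUZZ path is the one assumed in this section.

```shell
#!/bin/sh
# Path of the stoch-fuzz binary, as assumed in this README.
STOCHFUZZ=/root/StochFuzz/src/stoch-fuzz

# Hypothetical helper: print the command that would be run for a target
# given by its full path, without executing anything.
rewrite_cmd() {
    target_dir=$(dirname "$1")
    target_name=$(basename "$1")
    printf 'cd %s && %s -- %s\n' "$target_dir" "$STOCHFUZZ" "$target_name"
}

# Hypothetical helper: change into the target's directory (in a subshell)
# and start StochFuzz on the bare file name, never on ./name or a full path.
rewrite_target() {
    target_dir=$(dirname "$1")
    target_name=$(basename "$1")
    ( cd "$target_dir" && exec "$STOCHFUZZ" -- "$target_name" )
}

rewrite_cmd /root/example.out
```

Printing the command first makes it easy to confirm that the target is passed as a bare file name before starting the daemon with rewrite_target.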

After the initial rewriting, we will get a phantom file named example.out.phantom. This phantom file can be directly fuzzed by AFL or any AFL-based fuzzer. Note that the StochFuzz process keeps running during fuzzing, so please make sure it stays alive for the whole fuzzing session.
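
One way to script this workflow is to start StochFuzz in the background, wait for the phantom file to appear, and stop the daemon when the shell exits. The sketch below rests on several assumptions: wait_for_file and fuzz_session are hypothetical helpers, the 60-second timeout is arbitrary, the afl-fuzz arguments mirror those shown under Advanced Usage, and a stale phantom file from an earlier run would be detected immediately.

```shell
#!/bin/sh
# Wait until a file exists, polling once per second.
# Returns 0 once the file is present, 1 if the timeout (seconds) expires first.
wait_for_file() {
    path=$1
    remaining=$2
    while [ ! -e "$path" ]; do
        [ "$remaining" -le 0 ] && return 1
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}

# Hypothetical session; defined here but not invoked automatically.
fuzz_session() {
    cd /root/ || return 1
    /root/StochFuzz/src/stoch-fuzz -- example.out &
    daemon_pid=$!
    # Stop the daemon when this shell exits.
    trap 'kill "$daemon_pid" 2>/dev/null' EXIT
    # Assumption: 60 seconds is enough for the initial rewriting.
    wait_for_file example.out.phantom 60 || return 1
    afl-fuzz -i seeds -o output -t 2000 -- example.out.phantom @@
}
```

Calling fuzz_session from an interactive shell keeps the daemon alive for as long as afl-fuzz runs.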

Advanced Usage

Compared with the compiler-based instrumentation (e.g., afl-clang-fast), StochFuzz has additional runtime overhead because it needs to emulate each CALL instruction to support stack unwinding.

Inspired by recent work, we provide an advanced rewriting strategy that does not emulate CALL instructions but instead wraps the _ULx86_64_step function from libunwind to support stack unwinding. This strategy works for most binaries but may fail in some cases, such as fuzzing statically linked binaries.

To enable this strategy, pass the -r option to StochFuzz.

$ cd /root/
$ /root/StochFuzz/src/stoch-fuzz -r -- example.out # do not use ./example.out here

Additionally, before fuzzing, we need to prepare the AFL_PRELOAD environment variable for AFL.

$ export STOCHFUZZ_PRELOAD=$(/root/StochFuzz/scripts/stochfuzz_env.sh)
$ AFL_PRELOAD=$STOCHFUZZ_PRELOAD afl-fuzz -i seeds -o output -t 2000 -- example.out.phantom @@

Troubleshooting

Common issues are covered in trouble.md. If it does not help solve your problem, please open a GitHub issue.

We also provide some tips on using StochFuzz in tips.md.

Development

We currently have many to-do items, which are listed in todo.md.

todo.md also lists pending decisions that we have not yet settled. If you have any thoughts or suggestions, do not hesitate to let us know. Any help improving StochFuzz would be greatly appreciated.

StochFuzz should be considered alpha-quality software and is likely to contain bugs.

I will do my best to maintain StochFuzz in a timely manner, but it may sometimes take me a while to respond. Thanks in advance for your understanding.

Cite

Zhang, Zhuo, et al. "STOCHFUZZ: Sound and Cost-effective Fuzzing of Stripped Binaries by Incremental and Stochastic Rewriting." 2021 IEEE Symposium on Security and Privacy (SP). IEEE, 2021.

References

Duck, Gregory J., Xiang Gao, and Abhik Roychoudhury. "Binary rewriting without control flow recovery." Proceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation. 2020.
Meng, Xiaozhu, and Weijie Liu. "Incremental CFG patching for binary rewriting." Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems. 2021.
Aschermann, Cornelius, et al. "Ijon: Exploring deep state spaces via fuzzing." 2020 IEEE Symposium on Security and Privacy (SP). IEEE, 2020.
Google. "Google/AFL." GitHub, github.com/google/AFL.

Comments

Issue in `glib` build

While trying to build the dependencies using build.sh, it turns out the glib build fails. I am using the latest commit on master, and the OS is Ubuntu 18 LTS.

opened by Silipwn 4

hello, when I ran AFL fuzz a demo program, the stoch-fuzz didn't respond.

Hello, when I ran AFL to fuzz a demo program, stoch-fuzz didn't respond. It keeps stopping at “phantom file is create, please execute afldemo.phantom to communicate with the daemon”.

opened by bufferflyfly 2

bug in build.sh

I tried to install StochFuzz with build.sh. The clang version in my environment is 10.0.0, but the script failed with the message "clang-6.0 or a newer version is required". I checked the script and found that the regular expression used to match the clang version output is not correct if the major version number is greater than 9:

$ clang --version | head -n 1 | grep -o -E "[[:digit:]].[[:digit:]].[[:digit:]]" | uniq | sort

Maybe this is better:

$ clang --version | head -n 1 | grep -o -E "[0-9]{1,2}.[0-9].[0-9]" | uniq | sort

opened by Nova-xiao 1
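
The version check discussed in this issue can be sketched with a pattern that allows multi-digit components and escapes the dots. This is an illustration only, not the code in build.sh: extract_version and major_at_least are hypothetical helpers, and the sample version string is an assumed example of the first line of clang --version.

```shell
#!/bin/sh
# Extract the first dotted version number (e.g. 10.0.0) from a line of text.
extract_version() {
    printf '%s\n' "$1" | grep -o -E '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1
}

# Succeed if the major component of version $1 is at least $2.
major_at_least() {
    major=${1%%.*}
    [ "$major" -ge "$2" ]
}

# Assumed sample of the first line printed by `clang --version`.
line='clang version 10.0.0-4ubuntu1'
ver=$(extract_version "$line")
echo "$ver"
major_at_least "$ver" 6 && echo "clang is new enough"
```

Comparing the major component numerically avoids both the single-digit assumption of the original pattern and lexicographic comparison of version strings.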

undefined reference to `sysconf'

OS: Ubuntu 22

Reproduce:

./build.sh
sudo apt install libunwind-dev libglib-dev
replace all libasan.so.4 with libasan.so.6 in src/Makefile
cd src && make release

error:

clang -Wall -fno-stack-protector -fno-jump-tables -fpie -O3 -D_GNU_SOURCE -DNDEBUG -c loader.c
clang -nostdlib -o loader.out loader.o -Wl,--entry=_entry
/usr/bin/ld: loader.o: in function `loader_load':
loader.c:(.text+0xb6): undefined reference to `sysconf'
/usr/bin/ld: loader.c:(.text+0xd8): undefined reference to `sysconf'
/usr/bin/ld: loader.c:(.text+0x11a): undefined reference to `sysconf'
/usr/bin/ld: loader.c:(.text+0x74e): undefined reference to `sysconf'
/usr/bin/ld: loader.c:(.text+0x75b): undefined reference to `sysconf'
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make: *** [Makefile:97: loader] Error 1

opened by syheliel 1
