Fuzzing the Kernel Using UnicornAFL and AFL++

Overview

Unicorefuzz

Fuzzing the Kernel using UnicornAFL and AFL++. For details, skim through the WOOT paper or watch this talk at CCCamp19.

Is it any good?

yes.

AFL Screenshot

Unicorefuzz Setup

  • Install python2 & python3 (ucf uses python3, but qemu/unicorn needs python2 to build)
  • Run ./setup.sh, preferably inside a virtualenv (otherwise Python deps will be installed using --user). During install, afl++ and uDdbg as well as the Python deps will be pulled and installed.
  • Enjoy ucf

Upgrading

When upgrading from an early version of ucf:

  • Unicorefuzz will notify you of config changes and new options automatically.
  • Alternatively, run ucf spec to output a commented config.py spec-like element.
  • probe_wrapper.py is now ucf attach.
  • harness.py is now named ucf emu.
  • The song remains the same.

Debug Kernel Setup (Skip this if you know how this works)

  • Create a qemu-img and install your preferred OS on there through qemu
  • An easy way to get a working userspace up and running in QEMU is to follow the steps described by syzkaller, namely create-image.sh
  • For kernel customization you might want to clone your preferred kernel version and compile it on the host. This way you can also compile your own kernel modules (e.g. example_module).
  • To find the address of a loaded module in the guest OS, run cat /proc/modules; it lists the base address of each module. Add the offset of the function where you want to break to this base address. If you specify MODULE and BREAK_OFFSET in config.py, ucf should use ./get_mod_addr.sh to look the address up automatically.
  • You can compile the kernel with debug info. Once the Linux kernel is compiled, start gdb from the kernel folder with gdb vmlinux. After other modules have been loaded, use the lx-symbols command in gdb to load their symbols (make sure the modules' .ko files are in your kernel folder). This way you can simply use something like break function_to_break to set breakpoints on the required functions.
  • To compile a custom kernel for Arch, download the current Arch kernel and set the .config to the Arch default. Then set DEBUG_KERNEL=y, DEBUG_INFO=y, GDB_SCRIPTS=y (for convenience), KASAN=y, KASAN_EXTRA=y. For convenience, we added a working example_config that can be placed in the linux dir.
  • To build only the necessary kernel modules, boot the current system, run lsmod > mylsmod, and copy the mylsmod file to the linux kernel folder on your host. Then run make LSMOD=mylsmod localmodconfig to configure only the kernel modules that the guest system actually needs, and compile the kernel as usual with make. Next, mount the guest file system to /mnt and run make modules_install INSTALL_MOD_PATH=/mnt. Finally, create a new initramfs, which apparently has to be done on the guest system, using mkinitcpio -k <folder in /lib/modules/...> -g <where to put initramfs>. Copy the result back to the host and let qemu know where your kernel and the initramfs are located.
  • Setting breakpoints anywhere else is possible. For this, set BREAKADDR in the config.py instead.
  • For fancy debugging, ucf uses uDdbg
  • Before fuzzing, run sudo ./setaflops.sh to initialize your system for fuzzing.
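
The MODULE and BREAK_OFFSET approach from the list above boils down to adding an offset to the module's load address. Here is a minimal sketch of that arithmetic, assuming the usual /proc/modules line layout (name, size, refcount, dependencies, state, address). module_base and break_address are hypothetical helper names, not part of ucf, and the sample line is made up:

```python
from typing import Optional


def module_base(proc_modules: str, module: str) -> int:
    """Return the load address of `module` from the text of /proc/modules."""
    for line in proc_modules.splitlines():
        fields = line.split()
        # Assumed layout: name size refcount deps state address [taint flags]
        if len(fields) >= 6 and fields[0] == module:
            return int(fields[5], 16)
    raise LookupError(f"module {module!r} is not loaded")


def break_address(
    proc_modules: str,
    module: Optional[str] = None,
    break_offset: Optional[int] = None,
    breakaddr: Optional[int] = None,
) -> int:
    """Resolve the breakpoint from BREAKADDR or from MODULE + BREAK_OFFSET."""
    if breakaddr is not None:
        return breakaddr
    if module is None or break_offset is None:
        raise ValueError("set either BREAKADDR or both MODULE and BREAK_OFFSET")
    return module_base(proc_modules, module) + break_offset


# Made-up sample line. Read the real file as root inside the guest:
# unprivileged users may only see zeroed addresses.
sample = "example_module 16384 0 - Live 0xffffffffc0000000 (O)\n"
print(hex(break_address(sample, module="example_module", break_offset=0x40)))
```

The module base usually changes on every load, so the address has to be resolved again after each reboot or module reload.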

Run

  • ensure a target gdbserver is reachable, for example via ./startvm.sh
  • adapt config.py:
    • provide the target's gdbserver network address in the config to the probe wrapper
    • provide the address of the target function to the probe wrapper and harness
    • make the harness put AFL's input at the desired memory location by adapting the place_input function in config.py
    • add all EXITs
  • start ucf attach, it will (try to) connect to gdb.
  • make the target execute the target function (by using it inside the vm)
  • after the breakpoint was hit, run ucf fuzz. Make sure afl++ is in the PATH. (Use ./resumeafl.sh to resume using the same input folder)

Putting AFL's input at the correct location must be coded individually for most targets. However, with modern binary analysis frameworks like IDA or Ghidra, it's possible to find the address of the desired location.

The following place_input method places the input at the data section of sk_buff in key_extract:

    # read input into param xyz here:
    rdx = uc.reg_read(UC_X86_REG_RDX)
    utils.map_page(uc, rdx) # ensure sk_buf is mapped
    bufferPtr = struct.unpack("<Q",uc.mem_read(rdx + 0xd8, 8))[0]
    utils.map_page(uc, bufferPtr) # ensure the buffer is mapped
    uc.mem_write(rdx, input) # insert afl input
    uc.mem_write(rdx + 0xc4, b"\xdc\x05") # fix tail
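
The pointer handling in the snippet above can be tried out without a VM. The sketch below swaps Unicorn for a tiny in-memory stand-in (FakeUc, hypothetical, exposing only mem_read and mem_write with the same argument order as Unicorn's Uc). The 0xd8 offset is taken from the snippet, while the addresses and the 1500-byte cap are illustrative. It writes the input into the buffer the struct points to. Whether the struct itself or the pointed-to buffer should receive the input depends on your target.

```python
import struct


class FakeUc:
    """Hypothetical stand-in for unicorn.Uc backed by one flat bytearray."""

    def __init__(self, base: int, size: int) -> None:
        self.base = base
        self.mem = bytearray(size)

    def mem_read(self, addr: int, size: int) -> bytearray:
        off = addr - self.base
        return self.mem[off : off + size]

    def mem_write(self, addr: int, data: bytes) -> None:
        off = addr - self.base
        self.mem[off : off + len(data)] = data


DATA_PTR_OFFSET = 0xD8  # offset of the data pointer, as in the snippet above
MAX_INPUT = 1500  # illustrative cap on the input size


def place_input(uc, struct_addr: int, data: bytes) -> bool:
    """Copy `data` into the buffer referenced by the struct at `struct_addr`."""
    if len(data) > MAX_INPUT:
        return False  # oversized input: skip this test case
    # The pointer is stored as a little-endian 64-bit value.
    raw = uc.mem_read(struct_addr + DATA_PTR_OFFSET, 8)
    (buffer_ptr,) = struct.unpack("<Q", raw)
    uc.mem_write(buffer_ptr, data)
    return True


uc = FakeUc(base=0x1000, size=0x2000)
# Make the fake struct at 0x1000 point to a buffer at 0x2000.
uc.mem_write(0x1000 + DATA_PTR_OFFSET, struct.pack("<Q", 0x2000))
placed = place_input(uc, 0x1000, b"AFL!")
print(placed, bytes(uc.mem_read(0x2000, 4)))
```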

QEMUing the Kernel

A few general pointers. When using ./startvm.sh, the VM can be debugged via gdb. Use

$gdb
>file ./linux/vmlinux
>target remote :1234

This dynamic method makes it rather easy to find breakpoint addresses, which can then be fed to config.py. In addition, startvm.sh forwards port 22 (ssh) to 8022, so you can ssh into the VM, which makes it easier to interact with.
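
Before attaching, it can help to check that both forwarded ports answer at all. The following is a minimal sketch using only the standard library. port_open is a hypothetical helper, and localhost plus the two port numbers are assumptions taken from the text above. Note that the probe opens and closes a real TCP connection, which a gdb stub may treat as a debugger attaching, so run it before a debugging session rather than during one.

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# 1234: gdb stub (see `target remote :1234`), 8022: forwarded ssh port.
for name, port in (("gdbserver", 1234), ("ssh", 8022)):
    print(f"{name} on port {port}: reachable={port_open('localhost', port)}")
```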

Debugging

You can step through the code, starting at the breakpoint, with any given input. The fancy debugging makes use of uDdbg. To do so, run ucf emu -d $inputfile. Possible inputs to the harness (the thing wrapping afl-unicorn) that help debugging:

  • -d flag loads the target inside the unicorn debugger (uDdbg).
  • -t flag enables the afl-unicorn tracer. It prints every emulated instruction and displays memory accesses.

Gotchas

A few things to consider.

FS_BASE and GS_BASE

Unicorn did not offer a way to set model-specific registers directly. The forked unicornafl version of AFL++ finally supports it, so most of the ugly code from earlier versions was scrapped.

Improve Fuzzing Speed

Right now, the Unicorefuzz ucf attach harness might need to be restarted manually after a certain number of pages have been allocated. Allocated pages should propagate back to the forkserver parent automatically but might still get reloaded from disk for each iteration.

IO/Printthings

It's generally a good idea to nop out kprintf or kernel printing functionality if possible, when the program is loaded into the emulator.
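
One way to neutralize such a function, sketched under assumptions: on x86-64, overwriting its first byte with ret (0xC3) makes every call return immediately. This only works if callers ignore the return value. stub_out is a hypothetical helper, the address is made up, and RecordingUc merely stands in for any object offering Unicorn's mem_write(address, data) method.

```python
X86_RET = b"\xc3"  # x86/x86-64 near return


def stub_out(uc, func_addr: int) -> None:
    """Make the function at func_addr return immediately."""
    uc.mem_write(func_addr, X86_RET)


class RecordingUc:
    """Hypothetical stand-in that records writes instead of emulating."""

    def __init__(self) -> None:
        self.writes = {}

    def mem_write(self, addr: int, data: bytes) -> None:
        self.writes[addr] = bytes(data)


uc = RecordingUc()
PRINTK_ADDR = 0xFFFFFFFF81000000  # made-up address; look the real one up in vmlinux
stub_out(uc, PRINTK_ADDR)
print({hex(addr): data.hex() for addr, data in uc.writes.items()})
```

With a real emulator, the page containing the function has to be mapped before the patch is written, and the patch has to be applied again whenever that page is reloaded.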

Troubleshooting

If you have trouble running unicorefuzz, follow these rules. In the worst case, feel free to reach out to us, for example to @domenuk on Twitter. For some notes on debugging and developing ucf and afl-unicorn further, read DEVELOPMENT.md.

Just won't start

Run the harness without afl (ucf emu -t ./sometestcase). Make sure you are not in a virtualenv or in the correct one. If this works but it still crashes in AFL, set AFL_DEBUG_CHILD_OUTPUT=1 to see some harness output while fuzzing.

All testcases time out

Make sure ucf attach is running in the same folder and that the breakpoint has been triggered.

Comments
  • Research impact of tb_flush after init_forkserver

    The current fork-server implementation on X64 might be negatively impacted by tb-flush in cpu-exec of unicorn: https://github.com/unicorn-engine/unicorn/blob/0551b56633f658ec760eac54c14712d712b746d7/qemu/cpu-exec.c#

    Problem: To start the fork-server, a single insn is executed. Afterwards, we exit the cpu loop and translated blocks are flushed. Since the parent will wait for translation requests at this first insn, all pre-jitted blocks on future children may simply be flushed and re-jitted after input from AFL is read.

    opened by domenukk 3
  • ucf fuzz: AFL forkserver error

    [*] Spinning up the fork server...
    
    [-] Hmm, looks like the target binary terminated before we could complete a
        handshake with the injected code. Perhaps there is a horrible bug in the
        fuzzer. Poke <[email protected]> for troubleshooting tips.
    
    [-] PROGRAM ABORT : Fork server handshake failed
             Location : afl_fsrv_start(), src/afl-forkserver.c:726
    

    I created an image like the one from syzkaller and was able to attach ucf to the breakpoint. However, I run into the issue above when I run ucf fuzz. This is the end of my config.py:

        if len(input) > 1500:
            import os
    
            os._exit(0)  # too big!
    
        # read input to the correct position at param rdx here:
        rdx = uc.reg_read(UC_X86_REG_RDX)
        rdi = uc.reg_read(UC_X86_REG_RDI)
        ucf.map_page(uc, rdx)  # ensure sk_buf is mapped
        bufferPtr = struct.unpack("<Q", uc.mem_read(rdx + 0xD8, 8))[0]
        ucf.map_page(uc, bufferPtr)  # ensure the buffer is mapped
        uc.mem_write(rdi, input)  # insert afl input
        uc.mem_write(rdx + 0xC4, b"\xdc\x05")  # fix tail
    
    def place_input(ucf: Unicorefuzz, uc: Uc, input: bytes) -> None:
        rax = uc.reg_read(UC_X86_REG_RAX)
        # make sure the parameter memory is mapped
        ucf.map_page(uc, rax)
        uc.mem_write(rax, input)  # insert afl input
    
    #init_func(Uc)
    #place_input(Unicorefuzz, Uc, AFL_INPUTS)
    
    
    opened by docfate111 2
  • Add Multiple Exit Support for More Architectures

    Right now, multiple exits are only supported on X64 (X86_64). The way it works on X64 is as follows:

    • When mapping a page, a syscall insn is patched in for each exit in this page
    • When a syscall hook is triggered (cheap), the hook checks if it belongs to an exit
    • Exit.

    The other archs will need an alternative to syscall which is cheap in unicorn. For X86, UD2 might be an option. Any illegal instruction could maybe be used.
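
The X64 mechanism described in this issue can be sketched on a plain bytearray. This is an illustration of the idea only, not ucf's actual code: patch_exits and is_exit are hypothetical names, the addresses are made up, and the sketch simply skips an exit whose two patch bytes would cross the end of the page, a case a real implementation has to handle.

```python
SYSCALL = b"\x0f\x05"  # x86-64 `syscall` opcode
PAGE_SIZE = 0x1000


def patch_exits(page_addr: int, page: bytearray, exits: set) -> None:
    """Overwrite every exit address inside this page with a syscall insn."""
    for exit_addr in exits:
        off = exit_addr - page_addr
        # Skip exits outside the page or too close to its end for two bytes.
        if 0 <= off <= len(page) - len(SYSCALL):
            page[off : off + len(SYSCALL)] = SYSCALL


def is_exit(syscall_addr: int, exits: set) -> bool:
    """What a syscall hook would ask: was this syscall patched in as an exit?"""
    return syscall_addr in exits


exits = {0x401010, 0x402000}
page = bytearray(b"\x90" * PAGE_SIZE)  # a page of NOPs mapped at 0x401000
patch_exits(0x401000, page, exits)
print(page[0x10:0x12].hex(), is_exit(0x401010, exits), is_exit(0x401020, exits))
```

The set lookup is what keeps the hook cheap: a genuine syscall in the target falls through, while a patched one ends the run.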

    opened by domenukk 2
  • Fixes for Dockerfile

    Fixes an issue where docker build is waiting for input for tzdata. Also fixes an issue with keystone (see https://github.com/keystone-engine/keystone/issues/386)

    opened by ruudlinssen 1
  • self.exits is never initialized

    When I execute ucf in debug mode, I get the following error:

      File "/mnt/data/unicorefuzz/unicorefuzz/harness.py", line 247, in uc_debug
        exit_point = self.exits[0]
    TypeError: 'NoneType' object is not subscriptable
    
    opened by cube0x8 0
  • no UC_AFL_NO_RETURN attribute in Uc object

    When running in debug mode, I get the following error:

    File "/mnt/data/unicorefuzz/unicorefuzz/harness.py", line 232, in uc_debug
        if uc.afl_forkserver_start(exits) != uc.UC_AFL_RET_NO_AFL:
    AttributeError: 'Uc' object has no attribute 'UC_AFL_RET_NO_AFL'
    python-BaseException
    opened by cube0x8 0
  • Added init gdb function to configspec

    This additional initialization function makes it possible to perform basic preliminary steps before running the target under the debugger.

    For example, it is useful for disabling watchdogs or patching specific memory regions before running.

    opened by cube0x8 0
  • Build issue

    Hello,

    I'm experimenting with unicorefuzz. This issue occurred while running setup.sh script:

    ....
    [+] Building unicorn_mode
    =================================================
    UnicornAFL build script
    =================================================
    
    [*] Performing basic sanity checks...
    [-] Error: Python setup-tools not found. Run 'sudo apt-get install python-setuptools'.
    

    I've already installed Python 2 and Python 3, as well as 'python-setuptools' via apt-get and pip install. What am I doing incorrectly?

    opened by behouba 2
  • Stop Avatar²/Python GDB from Busy Waiting

    Right now, the avatar² thread used for ucf attach waits for gdb in a busy loop, at least on linux. This is a bug in the gdb.py (or whatsitsname) dependency and needs to be fixed all the wayyyy upstream. I have a local yolo patch, I'll have to look how to get it upstreamed.

    opened by domenukk 0
  • Drop-In Allocator

    Right now, there is no way to spot out-of-bounds reads or writes easily (unless the kernel has been compiled with KASAN or similar). A custom allocator similar to libdislocator.so would help a lot. One idea might be to, at the entry of kmalloc, patch in a jump to a similar emulated library and list the function (plus parameter mappings?) in the config. Another idea might be to leave mapping of unallocated mem completely to the python layer.

    opened by domenukk 0
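
To make the drop-in allocator idea from the last issue more concrete, here is a rough sketch of the bookkeeping such an allocator could do on the Python side. It follows the guard-page approach known from libdislocator: every chunk is right-aligned against a page that is never handed out. GuardAllocator is hypothetical and nothing in ucf implements it. Hooking kmalloc and actually mapping the pages in the emulator are left out.

```python
PAGE = 0x1000


class GuardAllocator:
    """Hypothetical emulator-side allocator with one guard page per chunk."""

    def __init__(self, heap_base: int) -> None:
        self.next_page = heap_base
        self.live = {}  # chunk start address -> size in bytes

    def alloc(self, size: int) -> int:
        pages = max(1, -(-size // PAGE))  # round up, at least one page
        end = self.next_page + pages * PAGE
        addr = end - size  # right-align the chunk against the guard page
        self.live[addr] = size
        self.next_page = end + PAGE  # leave one page out as the guard
        return addr

    def free(self, addr: int) -> None:
        del self.live[addr]  # KeyError flags an invalid or double free

    def in_bounds(self, addr: int, length: int = 1) -> bool:
        """True if [addr, addr + length) lies inside one live chunk."""
        return any(
            start <= addr and addr + length <= start + size
            for start, size in self.live.items()
        )


heap = GuardAllocator(0x10000000)
p = heap.alloc(24)
print(hex(p), heap.in_bounds(p + 23), heap.in_bounds(p + 24))
```

Right-aligning catches overflows on the first byte past the chunk but gives up the alignment guarantees of a real kmalloc for odd sizes. Underflows are only caught by the check in in_bounds, not by a guard page.
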
Owner
Security in Telecommunications
The Computer Security Group at Berlin University of Technology