Intro to Rust Fuzzing

2022-03-04

fuzzing, rust, reverse engineering

Background

It’s been a while since the last adventure, so hopefully we return with a good one! I’ve been learning Rust, and although I have done some fuzzing work in the past with other languages, it never occurred to me until recently to ask myself…

“How does fuzzing work in Rust?”

So that’s exactly what I set off to learn, and hopefully sharing some of the things I learned along the way can help others who are also curious about how to fuzz in the Rust ecosystem.


Fuzzing Introduction

As soon as you bring up fuzzing you need to specify what kind of fuzzing you intend to do: black-box (no harness, no source), grey-box (harness, coverage-guided), or white-box (source code, symbolic execution). In this example we are going to be doing grey-box fuzzing, where we create a harness to fuzz a specifically targeted function and feed crafted inputs into it.

When it comes to fuzzing there are a couple of really well-known tools that you can leverage. AFL (https://lcamtuf.coredump.cx/afl/) and libFuzzer (https://llvm.org/docs/LibFuzzer.html) are two of the biggest and best known. I’m not going to pretend to know how each of them operates; that’s usually where the PhDs come in. As users of these tools we only need to know how to leverage them, not how their internals work.

When setting up a fuzzing harness to fuzz a specific function we want to make sure we are only using the bare minimum of code required for our target function to run. This is for two reasons. The first is that we want to fuzz only the target function and not accidentally hit other code paths that aren’t interesting to us. The other is performance: during a single run of a function, a few simple extra calls might seem like no big deal, but with fuzzing we will be running this harness potentially millions of times, so every call matters and all those extra calls add up over time.

Choosing a Target

The first step when beginning a fuzzing endeavour is to choose a target. Having done this in the past with other languages, I’ve found that some of the easiest targets to start with are ones where the nature of the program is handling input or writing output. The reason for this is twofold. First, the input/output mechanism of the software makes writing your harness much easier, since there will be nicely defined calls to do exactly what you want. Second, input/output programs deal directly with user-submitted input, which means that any bugs we find can immediately have impact.

So to start my search for a target I went to https://crates.io/, the Rust package registry, and looked for crates that had the keyword archive (file archive formats).

[Screenshot: Archive Keyword]

These are all great targets! I ended up choosing rust-ar (https://crates.io/crates/ar) from that list above, and we were off to get fuzzing!

Setting Up the Environment

Earlier in the post I specifically mentioned libFuzzer and AFL. The reason is that both of them are actively supported for fuzzing Rust programs.

AFL - https://github.com/rust-fuzz/afl.rs
libFuzzer - https://github.com/rust-fuzz/cargo-fuzz

For this post I decided to go with libFuzzer because I was curious about the cargo-fuzz integration.

The first thing we want to do is clone the Rust project that we want to fuzz.
    git clone git@github.com:mdsteele/rust-ar.git
Next we need to switch over to the nightly Rust toolchain. The reason for this is that when we run cargo fuzz later it will use command line flags that are not enabled in the stable Rust builds.
    rustup default nightly
Then we need to install cargo-fuzz.
    cargo install cargo-fuzz
The last step before our environment is set up is to run the initialization. This command should create a folder called fuzz within the repository. Inside of fuzz there will be a subfolder called fuzz/fuzz_targets, which is where the harnesses live. We will be creating one of those next!
    cargo fuzz init

├── fuzz_targets
│   └── fuzz_target_1.rs

Creating our Harness

After initialization, a skeleton of a harness will be created for us in our empty target file. Note: You can also view your fuzzing targets with cargo fuzz list.

#![no_main]
use libfuzzer_sys::fuzz_target;

fuzz_target!(|data: &[u8]| {
    // fuzzed code goes here
});

The body of the fuzz_target!() macro is going to be called repeatedly with a slice of pseudo-random bytes (data) until the harness hits an error condition (segfault, panic, etc). Our objective is to write some code within fuzz_target that will interact with a function within our target of interest. So in this case we want to continuously call some piece of code within rust-ar.
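To make the "call it repeatedly until something breaks" idea concrete, here is a minimal standard-library-only sketch of such a loop. It is not how libFuzzer or fuzz_target!() is implemented (there is no coverage feedback here), and the names toy_target, fuzz_until_panic and SplitMix64 are made up for illustration.

```rust
use std::panic;

/// Tiny deterministic pseudo-random generator (SplitMix64), used only so
/// this sketch needs no external crates.
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Hypothetical target: panics when the input starts with b'!'.
fn toy_target(data: &[u8]) {
    if data.first() == Some(&b'!') {
        panic!("toy_target: unexpected '!' at offset 0");
    }
}

/// Calls `target` with fresh pseudo-random inputs until it panics or
/// `max_iters` is reached. Returns the crashing input, if any.
fn fuzz_until_panic(target: fn(&[u8]), max_iters: u64) -> Option<Vec<u8>> {
    let mut rng = SplitMix64(0x1234_5678);
    for _ in 0..max_iters {
        let len = (rng.next_u64() % 16) as usize;
        let input: Vec<u8> = (0..len).map(|_| (rng.next_u64() >> 56) as u8).collect();
        // A panic inside the target is the "error condition" we are hunting.
        if panic::catch_unwind(|| target(&input)).is_err() {
            return Some(input);
        }
    }
    None
}

fn main() {
    match fuzz_until_panic(toy_target, 100_000) {
        Some(input) => println!("crashing input: {:02x?}", input),
        None => println!("no crash found"),
    }
}
```

The real harness gets the same behaviour for free: the fuzzer generates the bytes, calls the closure, and saves the input that triggered the crash.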

After looking at rust-ar (https://github.com/mdsteele/rust-ar), it looks like there is a nicely separated example program for how to do a simple archive extraction.

[Screenshot: Rust AR Extract Example]

This is the perfect starting point for creating a fuzzing harness. We will need to strip away the sections that deal with user input and the error handling surrounding input/output files.

Instead of just showing the final product, let’s go into why we need to make changes and how we make them. Originally this example program was written to be a command line tool, but for our fuzzing use case we only want to rapidly pass in bytes that would act as a file, with no user interaction required.

That means we can remove this user argument handling section right away.

let num_args = env::args().count();
if num_args != 2 {
    println!("Usage: extract <path/to/archive.a>");
    return;
}

let input_path = env::args().nth(1).unwrap();
let input_path = Path::new(&input_path);

The next two statements are important, and we need to replicate them within our harness. The first grabs the file handle of the archive that will be extracted, and the second hands that handle to the library. When migrating to a fuzzing structure, we no longer want to use a file on disk; instead we pass in the bytes generated by our fuzzer and use those as the file contents.

let input_file =
    File::open(input_path).expect("failed to open input file"); // grab file handle to archive on disk
let mut archive = ar::Archive::new(input_file); // attempt to open archive file with library

So we can replace that with the code below, using the Cursor (https://doc.rust-lang.org/std/io/struct.Cursor.html) struct to replicate having a file handle in memory only, with no disk interaction required.

let reader = std::io::Cursor::new(data); // wrap our bytes in an in-memory cursor
let mut archive = ar::Archive::new(reader); // use our reader instead of a file handle
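Why this swap works: a Cursor wrapped around a byte slice implements the same Read and Seek traits as a File, so code written against those traits cannot tell the difference. The sketch below shows this with the standard library only; peek_magic is a hypothetical helper, not part of rust-ar.

```rust
use std::io::{Cursor, Read, Seek, SeekFrom};

/// Reads the first 8 bytes from any seekable reader, then rewinds it.
/// A `File` and a `Cursor<&[u8]>` both satisfy `Read + Seek`.
fn peek_magic<R: Read + Seek>(reader: &mut R) -> std::io::Result<[u8; 8]> {
    let mut magic = [0u8; 8];
    reader.read_exact(&mut magic)?;
    reader.seek(SeekFrom::Start(0))?;
    Ok(magic)
}

fn main() -> std::io::Result<()> {
    // Stand-in for the `data` slice that the fuzzer hands to the harness.
    let data: &[u8] = b"!<arch>\nrest of the archive";
    let mut reader = Cursor::new(data);

    let magic = peek_magic(&mut reader)?;
    assert_eq!(&magic, b"!<arch>\n");
    assert_eq!(reader.position(), 0); // rewound by peek_magic
    Ok(())
}
```

Because nothing touches the disk, each fuzzing iteration only pays for the parsing itself.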

The rest of this example program writes the extracted files out to disk, which we don’t need to do, since we only care about exercising the archive parsing. So we can remove these lines.

while let Some(entry) = archive.next_entry() {
    let mut entry = entry.expect("failed to parse archive entry");
    let output_path = Path::new(
        str::from_utf8(entry.header().identifier())
            .expect("Non UTF-8 filename"),
    )
    .to_path_buf();
    let mut output_file = File::create(&output_path)
        .expect(&format!("unable to create file {:?}", output_path));
    io::copy(&mut entry, &mut output_file)
        .expect(&format!("failed to extract file {:?}", output_path));
}

The last change is simply some interaction with the archive object to see if we can cause a panic. Depending on your goal you might not need this.

let _num_entries = archive.count_entries(); // result is unused; we only care whether this call panics

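The completed harness is just the fuzz_target!() skeleton with the two Cursor lines and the count_entries call as its body. It might not be large and complicated, but it is very powerful!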

Live and Let Fuzz

Now that our harness is ready, the last step is to run it!
    cargo fuzz run fuzz_target_1

After running for a few seconds I was greeted with a panic.

[Screenshot: Harness Panic]

We can see from the logs exactly what the panic was, 'range start index 200000000000000 out of range for slice of length 0', and which line caused it, rust-ar/src/lib.rs:269.
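A panic message of that shape is what Rust produces when a slice is indexed from a start position past its end. The sketch below reproduces that kind of panic and shows a checked alternative. It is not the actual rust-ar code or its fix; tail_unchecked and tail_checked are hypothetical names, and the example assumes a 64-bit target so the number fits in usize.

```rust
/// Unchecked: panics if `start` is larger than `buf.len()`.
fn tail_unchecked(buf: &[u8], start: usize) -> &[u8] {
    &buf[start..]
}

/// Checked: returns `None` instead of panicking.
fn tail_checked(buf: &[u8], start: usize) -> Option<&[u8]> {
    buf.get(start..)
}

fn main() {
    let buf: &[u8] = &[];
    // Hypothetical length field parsed from attacker-controlled text
    // (assumes a 64-bit `usize`).
    let start: usize = "200000000000000".parse().expect("decimal digits");

    // The checked version reports the problem as a value.
    assert_eq!(tail_checked(buf, start), None);
    assert_eq!(tail_checked(b"abc", 1), Some(&b"bc"[..]));

    // The unchecked version panics, which is what a fuzzer detects.
    let result = std::panic::catch_unwind(|| tail_unchecked(buf, start).len());
    assert!(result.is_err());
}
```

In safe Rust this class of bug is a crash rather than memory corruption, but for a library that parses untrusted files it is still a denial of service worth reporting.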

More importantly, if we scroll further down in the crash output we can see the exact input that caused the crash, which lets us reproduce it!

[Screenshot: Harness Input]

Additionally, cargo fuzz nicely tells us that the problem file can be viewed at fuzz/artifacts/fuzz_target_1/crash-c443d284c03bbbf650cb6f89fe26676d62a1054d and can be rerun through the harness with:
    cargo fuzz run fuzz_target_1 fuzz/artifacts/fuzz_target_1/crash-c443d284c03bbbf650cb6f89fe26676d62a1054d

If we take a look at the file cargo fuzz generated, we can see that with NO starting input given it was able to explore the parser and generate a problematic archive file that this library accepts as valid!

00000000: 213c 6172 6368 3e0a 213c 9f95 6368 3e0a  !<arch>.!<..ch>.
00000010: db57 dbdb db00 3030 3030 3030 3030 3030  .W....0000000000
00000020: 3030 3030 3030 3030 3030 3030 3030 3030  0000000000000000
00000030: 3030 3030 3030 3030 3030 3030 3030 3030  0000000000000000
00000040: 3030 2331 2f32 3030 3030 3030 3030 3030  00#1/20000000000
00000050: 3030 3030 3030 3030 3030 3030 3030 3030  0000000000000000
00000060: 3030 3030 3030 3030 3030 3030 3030 3030  0000000000000000
00000070: 3030 3030 3030 3030 3030 3032 3931 3200  000000000002912.
00000080: 00                                       .
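How can a fuzzer arrive at the !<arch> magic bytes from nothing? The rough idea behind coverage guidance is to keep any mutated input that reaches new code and mutate it further. The toy model below is only a sketch of that idea under a big simplification: "coverage" is the number of leading bytes that match the magic, whereas a real fuzzer such as libFuzzer observes instrumented code paths. The names coverage and guided_search are made up.

```rust
const MAGIC: &[u8; 8] = b"!<arch>\n";

/// Tiny deterministic pseudo-random generator (SplitMix64).
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Toy "coverage" signal: how many leading bytes match the magic.
fn coverage(input: &[u8; 8]) -> usize {
    input.iter().zip(MAGIC.iter()).take_while(|(a, b)| a == b).count()
}

/// Mutates one byte at a time and keeps a mutation only if coverage grows.
/// Returns the number of iterations needed to reach the full magic.
fn guided_search(max_iters: u64) -> Option<u64> {
    let mut rng = SplitMix64(42);
    let mut best = [0u8; 8];
    let mut best_cov = coverage(&best);
    for iter in 1..=max_iters {
        let mut candidate = best; // arrays are Copy, so this is a fresh copy
        let pos = (rng.next_u64() % 8) as usize;
        candidate[pos] = (rng.next_u64() >> 56) as u8;
        let cov = coverage(&candidate);
        if cov > best_cov {
            best = candidate;
            best_cov = cov;
            if best_cov == MAGIC.len() {
                return Some(iter);
            }
        }
    }
    None
}

fn main() {
    match guided_search(1_000_000) {
        Some(iters) => println!("found the magic after {} iterations", iters),
        None => println!("magic not found"),
    }
}
```

In this model each correct byte takes about 8 * 256 attempts on average, so the whole magic is found in the order of tens of thousands of iterations. Guessing all eight bytes blindly at once would need on the order of 256^8 attempts.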

And that’s the end of our fuzzing journey for this particular issue. I reported it to the author, who was awesome and got it fixed in a few days: https://github.com/mdsteele/rust-ar/issues/22