Fuzzing - Automated Vulnerability Testing
Fuzzing (fuzz testing) is an automated testing method in which a program is bombarded with vast amounts of random or systematically mutated input to trigger crashes, assertion failures, or unexpected behavior that indicates security vulnerabilities. Coverage-guided fuzzers such as AFL++ and LibFuzzer use runtime feedback to maximize code coverage. Key areas of application: network protocols, file parsers, browser engines, cryptography libraries.
Fuzzing is one of the most effective methods for finding security vulnerabilities in software that manual code reviews and static analysis overlook. A fuzzer automatically generates massive amounts of test cases, sends them to the target program, and monitors whether it crashes, hangs, or produces unexpected errors—indicators of buffer overflows, use-after-free, or other memory-safety vulnerabilities. Google OSS-Fuzz has found over 10,000 vulnerabilities in open-source software using fuzzing.
Types of Fuzzing
Fuzzing Taxonomy:
1. Black-Box Fuzzing (no source code access):
→ Generates inputs without knowledge of the program structure
→ Easy to start, but low code coverage
→ Example: random HTTP requests to a web API
→ Tools: Peach Fuzzer, Boofuzz (network protocols)
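The black-box idea can be sketched in a few lines of Python. `parse` here is a hypothetical stand-in for any target function; the fuzzer knows nothing about its internals and simply throws random bytes at it:

```python
import random

def parse(data: bytes) -> None:
    """Hypothetical target: crashes on inputs starting with byte 0xFF."""
    if len(data) > 2 and data[0] == 0xFF:
        raise RuntimeError("parser crash")

def blackbox_fuzz(target, iterations=100_000, max_len=64, seed=1):
    """Throw random byte strings at the target; save inputs that crash it."""
    rng = random.Random(seed)
    crashes = []
    for _ in range(iterations):
        data = bytes(rng.randrange(256) for _ in range(rng.randrange(1, max_len)))
        try:
            target(data)
        except Exception:
            crashes.append(data)  # keep as crash reproducer
    return crashes

crashes = blackbox_fuzz(parse)
print(f"{len(crashes)} crashing inputs found")
```

Because inputs are drawn blindly, only shallow bugs with a high random hit probability are found—exactly the low-coverage limitation noted above.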
2. Grey-Box Fuzzing / Coverage-Guided (state of the art):
→ Instruments the binary to measure code coverage at runtime
→ Retains inputs that discover new code paths (genetic algorithm!)
→ Exponentially more efficient than black-box
→ Tools: AFL++ (American Fuzzy Lop), LibFuzzer, HonggFuzz
Process:
a. Fuzzer starts with seed corpus (real test files)
b. Mutates inputs: bit flips, byte swaps, interesting values
c. Runs program, measures coverage
d. New code path reached? → Keep input in corpus!
e. No new path? → Discard input
f. Crash? → Bug found! Save input as crash reproducer
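The a–f loop above can be sketched in Python. Real fuzzers measure edge coverage through compile-time instrumentation; here a hypothetical `run_target` returns the set of code paths it executed as a stand-in:

```python
import random

INTERESTING = [0x00, 0x01, 0x7F, 0x80, 0xFF]  # boundary values mutators favor

def mutate(data: bytes, rng: random.Random) -> bytes:
    """Apply one random mutation: bit flip, byte swap, or interesting value."""
    buf = bytearray(data)
    i = rng.randrange(len(buf))
    strategy = rng.randrange(3)
    if strategy == 0:                 # bit flip
        buf[i] ^= 1 << rng.randrange(8)
    elif strategy == 1:               # byte swap
        j = rng.randrange(len(buf))
        buf[i], buf[j] = buf[j], buf[i]
    else:                             # interesting value
        buf[i] = rng.choice(INTERESTING)
    return bytes(buf)

def run_target(data: bytes) -> set:
    """Hypothetical instrumented target: returns the code paths executed."""
    paths = {"entry"}
    if data and data[0] & 0x80:
        paths.add("high_bit")
        if data[0] == 0xFF:
            paths.add("magic_byte")
    return paths

def fuzz(seeds, iterations=2000, seed=0):
    rng = random.Random(seed)
    corpus = list(seeds)
    seen = set().union(*(run_target(s) for s in corpus))
    for _ in range(iterations):
        candidate = mutate(rng.choice(corpus), rng)
        coverage = run_target(candidate)
        if not coverage <= seen:      # new code path reached? keep the input
            seen |= coverage
            corpus.append(candidate)  # else: discard
    return corpus, seen

corpus, seen = fuzz([b"\x00\x00"])
print(f"corpus size: {len(corpus)}, paths: {sorted(seen)}")
```

The "genetic" part is visible in the loop: inputs that discover new paths survive into the corpus and become parents for further mutations.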
3. White-Box Fuzzing / Symbolic Execution:
→ Symbolically analyzes source code + path conditions
→ Generates inputs that cover exact code paths
→ Tools: KLEE (LLVM), angr, S2E
→ Very computationally intensive, but precise for complex logic
Fuzzing categories by target:
□ File parsers: PDF, PNG, MP4, Office documents → LibFuzzer
□ Network protocols: HTTP, TLS, DNS, SMB → Boofuzz, Peach
□ Web APIs: REST, SOAP, GraphQL → RESTler, CATS
□ Browser engines: V8, SpiderMonkey → Domino, Dharma
□ Cryptographic libraries: OpenSSL, BoringSSL → Google OSS-Fuzz
□ Operating system kernels: syscall fuzzing → syzkaller
□ Embedded/IoT: Firmware fuzzing → FIRM-AFL, Firmfuzz
Coverage-Guided Fuzzing with AFL++
AFL++ (American Fuzzy Lop++) - Practical Guide:
Installation:
apt install afl++ # Ubuntu
brew install afl++ # macOS
Step 1: Instrument the binary (source available):
# C/C++ with AFL compiler wrapper:
CC=afl-clang-fast ./configure
make
→ Generates instrumented binary with coverage tracking
Step 2: Create seed corpus:
mkdir seeds/
echo '{"user":"test"}' > seeds/basic.json
echo '{"user":"a"}' > seeds/short.json
# Real samples → better start!
# Tools: afl-cmin (minimizes corpus)
Step 3: Start fuzzing:
afl-fuzz -i seeds/ -o findings/ ./target_binary @@
# @@ = placeholder for input file
# -i: Input corpus directory
# -o: Output directory (crashes, hangs, queue)
Step 4: Evaluate results:
ls findings/crashes/ → Inputs that reproduce crashes!
ls findings/hangs/ → Inputs that lead to timeouts!
ls findings/queue/ → Inputs that increase coverage
Interpreting AFL++ output:
execs/sec: > 500 → good performance
stability: should be > 90%
coverage: how many paths discovered
crashes: number of crashes → analyze immediately!
Parallel fuzzing (multiple CPU cores):
# Main instance:
afl-fuzz -M fuzzer01 -i seeds/ -o findings/ ./target @@
# Secondary instances (number = CPU cores - 1):
afl-fuzz -S fuzzer02 -i seeds/ -o findings/ ./target @@
afl-fuzz -S fuzzer03 -i seeds/ -o findings/ ./target @@
→ Instances periodically sync their corpora → better coverage!
Crash analysis:
# Reproduce:
./target findings/crashes/id:000001
# Rebuild with AddressSanitizer for detailed crash reports:
CC=afl-clang-fast AFL_USE_ASAN=1 ./configure
make
ASAN_OPTIONS=detect_leaks=0 ./target findings/crashes/id:000001
LibFuzzer (Google/LLVM)
LibFuzzer - For libraries and code units:
LibFuzzer vs. AFL++:
→ LibFuzzer is in-process: no new process per input
→ Faster (no fork() overhead)
→ Ideal for libraries/parsers (libpng, zlib, JSON parsers)
→ Coverage via LLVM SanitizerCoverage
Writing a fuzz target (C/C++):
// fuzz_target.cpp
#include <stdint.h>
#include <stddef.h>
#include "my_parser.h"

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
    // This function is called once per fuzzer-generated input
    parse_json(Data, Size);
    return 0;  // must return 0; non-zero values are reserved
}
Compile and run:
clang++ -g -O1 -fsanitize=fuzzer,address \
fuzz_target.cpp my_parser.cpp -o fuzzer
./fuzzer corpus_dir/ -max_len=4096 -jobs=4
With AddressSanitizer (recommended!):
-fsanitize=fuzzer,address,undefined
→ Finds memory bugs + undefined behavior
Go fuzzing (since Go 1.18, built-in):
func FuzzReverse(f *testing.F) {
    f.Add("hello") // seed input
    f.Fuzz(func(t *testing.T, s string) {
        rev := Reverse(s)
        // Invariant: reversing twice yields the original
        if Reverse(rev) != s {
            t.Errorf("Reverse(%q) = %q, want %q", s, rev, s)
        }
    })
}
go test -fuzz=FuzzReverse -fuzztime=60s
Python Fuzzing (Atheris):
pip install atheris
import atheris
import sys

@atheris.instrument_func
def TestOneInput(data):
    fdp = atheris.FuzzedDataProvider(data)
    input_str = fdp.ConsumeUnicodeNoSurrogates(100)
    your_function(input_str)

atheris.Setup(sys.argv, TestOneInput)
atheris.Fuzz()
Web API Fuzzing
REST API Fuzzing:
Tools:
RESTler (Microsoft): automated REST API fuzzing
→ Reads OpenAPI/Swagger spec
→ Generates test sequences based on API dependencies
→ Finds business logic errors, crashes, auth bypasses
CATS (REST API Fuzzer):
→ OpenAPI-based
→ 97+ built-in fuzz tests per endpoint
→ Finds: XSS, SQLi, SSRF, access control flaws
ffuf (for HTTP endpoints):
# Parameter fuzzing:
ffuf -u 'https://api.example.com/user?FUZZ=value' \
-w /usr/share/wordlists/common.txt
# Body fuzzing:
ffuf -u 'https://api.target.com/api/create' -X POST \
-H 'Content-Type: application/json' \
-d '{"name": "FUZZ", "type": "user"}' \
-w mutations.txt
Manual fuzzing approaches (Burp Suite):
Intruder → Cluster Bomb for multiple parameters:
→ Payload 1: Numbers 0-65535 (integer overflows)
→ Payload 2: Special characters (<, >, ', ", null bytes)
→ Combined: all parameter combinations
Interesting fuzz values:
Numbers: 0, -1, 2147483647 (MAX_INT), 2147483648, 9999999999
Strings: "", null, "null", "\x00", "A"*4096 (buffer overflow probe)
Arrays: [], [null], very large arrays
Types: String instead of Int, Boolean instead of String
Unicode: "\u0000", Emoji, RTL characters, very long strings
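The value classes above can be collected into a small payload generator for manual API fuzzing. This is a sketch; `fuzz_values` and `fuzz_bodies` are hypothetical helper names, and the categories mirror the list above:

```python
import json

def fuzz_values():
    """Yield boundary and type-confusion payloads for JSON API parameters."""
    yield from [0, -1, 2**31 - 1, 2**31, 9999999999]   # integer boundaries
    yield from ["", None, "null", "\x00", "A" * 4096]  # string edge cases
    yield from [[], [None], [0] * 10_000]              # array edge cases
    yield from [True, "1", 1, 1.0]                     # type confusion
    yield from ["\u0000", "\U0001F600", "\u202Eevil"]  # unicode / RTL

def fuzz_bodies(param="name"):
    """Wrap each value in a JSON request body for the given parameter."""
    for value in fuzz_values():
        yield json.dumps({param: value})

bodies = list(fuzz_bodies())
print(len(bodies), bodies[0])
```

Each generated body can then be sent via Burp Intruder, ffuf, or a simple HTTP client loop.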
Continuous Fuzzing (CI/CD Integration)
OSS-Fuzz (Google) - for open-source projects:
→ Google runs continuous fuzzing for open-source
→ 1000+ projects: OpenSSL, libjpeg, curl, SQLite, etc.
→ Fuzzing results: 10,000+ vulnerabilities found
→ Free for eligible open-source projects
Google ClusterFuzz / FuzzBench:
→ Enterprise-grade distributed fuzzing
→ Scales to thousands of CPU cores
→ Automatic crash deduplication + triage
CI/CD Integration (GitHub Actions example):
# .github/workflows/fuzzing.yml
- uses: google/oss-fuzz/infra/cifuzz@master
  with:
    oss-fuzz-project-name: 'myproject'
    fuzz-seconds: 600  # 10 minutes of fuzzing per PR
→ Regression fuzzing: new code → automatically fuzzed
→ If a crash occurs: PR blocked, bug report created
Best Practices:
□ Seed corpus from real production data
□ AddressSanitizer (ASAN) + UndefinedBehaviorSanitizer (UBSAN) always enabled
□ Save crash reproducer + include in regression tests
□ Dictionary of format/protocol keywords (improves coverage)
□ AFL_PRELOAD + afl-cov for coverage reports
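A dictionary is a plain text file of tokens the mutator splices into inputs; the AFL++ format is one `name="value"` entry per line, where names are arbitrary labels and special bytes are escaped as `\xNN`. A hypothetical dictionary for a JSON parser, passed via `afl-fuzz -x json.dict`:

```
# json.dict
object_open="{"
object_close="}"
key_user="\"user\""
kw_null="null"
kw_true="true"
escape_unicode="\\u0000"
```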