Fuzzing: The art of confusing code until it confesses its vulnerabilities!
Introduction
Ensuring code robustness and security is paramount in the dynamic landscape of software development. Fuzz testing helps developers identify and fix critical bugs early in the software development lifecycle (SDLC). It involves feeding unexpected inputs to a program to discover vulnerabilities, crashes, or other unexpected behaviors.
This post shows how to apply fuzz testing with libFuzzer to the dlib project, a popular C++ machine learning library. Dlib is used in both industry and academia in a wide range of domains, including robotics, embedded devices, mobile phones, and large high-performance computing environments. Dlib's open source licensing allows you to use it in any application, free of charge.
What is Fuzzing
Fuzzing, or fuzz testing, is a software testing technique that automatically feeds a program a wide range of unexpected and invalid inputs. The primary goal is to observe how the program responds to inputs it was not designed to handle, which often reveals crashes, memory leaks, security vulnerabilities, and other unexpected behaviors.
Fuzzing involves generating many test inputs, often by mutating existing data (a "seed") or by generating random data, and feeding these inputs into a target program. The idea is to explore various paths of execution within the program, including edge cases and rare scenarios, to identify weaknesses that attackers might exploit or that could lead to system failures. Fuzzing can be applied to various types of software, including applications, libraries, operating systems, network protocols, and more. It is particularly effective at uncovering memory corruption issues such as buffer overflows, use-after-free vulnerabilities, format string vulnerabilities, and other security-related problems.
There are multiple approaches to fuzzing, including:
Random Fuzzing: This approach generates random inputs and feeds them to the program. While simple, random fuzzing can uncover fundamental issues such as missing input validation. However, it may miss more complex vulnerabilities that are only reachable with well-formed inputs.
Mutation-Based Fuzzing: This technique starts with an initial set of valid inputs and then mutates them in various ways (flipping bits, inserting bytes, deleting bytes) to create new test cases. Because the mutated inputs stay close to valid ones, they tend to get past early validation and reach deeper code.
Grammar-Based Fuzzing: The inputs are generated from a specified grammar or syntax. This approach is more effective at creating structured, meaningful test cases for formats such as XML or JSON.
Coverage-Guided Fuzzing: This is a more advanced approach where the fuzzer tracks code coverage during testing. Inputs that reach new, unexplored code paths are kept and prioritized for further mutation, allowing the fuzzer to reach deeper parts of the program.
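To make the mutation-based approach above more concrete, here is a minimal sketch of three common byte-level mutations. The function name mutate_once and the max_len parameter are hypothetical and chosen for illustration; real fuzzers such as libFuzzer use a much larger set of mutators.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Apply one randomly chosen mutation to a copy of `seed`.
// `max_len` caps the size of the result (hypothetical parameter).
Bytes mutate_once(const Bytes& seed, std::mt19937& rng, std::size_t max_len) {
    Bytes out = seed;
    std::uniform_int_distribution<int> pick_op(0, 2);
    std::uniform_int_distribution<int> pick_byte(0, 255);

    int op = pick_op(rng);
    if (out.empty()) {
        op = 1;  // only insertion is possible on an empty input
    }

    switch (op) {
    case 0: {  // flip a single bit
        std::uniform_int_distribution<std::size_t> pos(0, out.size() - 1);
        std::uniform_int_distribution<int> bit(0, 7);
        out[pos(rng)] ^= static_cast<std::uint8_t>(1u << bit(rng));
        break;
    }
    case 1: {  // insert one random byte, unless the input is already at max_len
        if (out.size() < max_len) {
            std::uniform_int_distribution<std::size_t> pos(0, out.size());
            const auto value = static_cast<std::uint8_t>(pick_byte(rng));
            out.insert(out.begin() + static_cast<std::ptrdiff_t>(pos(rng)), value);
        }
        break;
    }
    default: {  // erase one byte
        std::uniform_int_distribution<std::size_t> pos(0, out.size() - 1);
        out.erase(out.begin() + static_cast<std::ptrdiff_t>(pos(rng)));
        break;
    }
    }
    return out;
}
```

Calling mutate_once repeatedly on a valid seed yields inputs that differ from it by one small change each, which is the basic idea behind mutation-based fuzzing.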
Fuzzing is a critical component of modern software development and security practices. By subjecting software to a diverse set of inputs, it helps identify and fix bugs early in the development lifecycle, reducing the potential attack surface and enhancing the overall security of the software.
What is libFuzzer
LibFuzzer is a library that assists in fuzzing applications and libraries. It is an in-process, coverage-guided, evolutionary fuzzing engine that is part of the LLVM project and is widely used for automated software testing.
Its approach is feedback-driven: libFuzzer continuously mutates the input data based on coverage feedback from the program's execution, keeping inputs that reach new code paths. This helps uncover a wide range of bugs and crashes, including memory corruption issues and security vulnerabilities in C/C++ code.
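The feedback loop described above can be illustrated with a toy example. This is a simplified sketch, not how libFuzzer is implemented: libFuzzer obtains coverage from compiler instrumentation, whereas the hypothetical toy_target_depth function below simply reports how far an input got, and that number stands in for coverage. The sketch assumes a non-empty, fixed-length seed.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <set>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Toy target: counts how many leading bytes match "FUZZ".
// The count is a stand-in for "which branches were reached".
int toy_target_depth(const Bytes& in) {
    static const std::uint8_t magic[4] = {'F', 'U', 'Z', 'Z'};
    std::size_t depth = 0;
    while (depth < 4 && depth < in.size() && in[depth] == magic[depth]) {
        ++depth;
    }
    return static_cast<int>(depth);
}

// Toy coverage-guided loop: mutate a corpus entry, run the target,
// and keep the mutant only if it reaches a depth not seen before.
std::vector<Bytes> toy_fuzz(const Bytes& seed, unsigned rng_seed, int iterations) {
    std::mt19937 rng(rng_seed);
    std::uniform_int_distribution<int> any_byte(0, 255);
    std::vector<Bytes> corpus{seed};
    std::set<int> seen{toy_target_depth(seed)};

    for (int i = 0; i < iterations; ++i) {
        std::uniform_int_distribution<std::size_t> pick(0, corpus.size() - 1);
        Bytes candidate = corpus[pick(rng)];
        if (candidate.empty()) {
            continue;  // nothing to mutate in this simplified sketch
        }
        // Mutation: overwrite one random position with a random byte.
        std::uniform_int_distribution<std::size_t> pos(0, candidate.size() - 1);
        candidate[pos(rng)] = static_cast<std::uint8_t>(any_byte(rng));

        // Feedback: only inputs that reach new "coverage" join the corpus.
        if (seen.insert(toy_target_depth(candidate)).second) {
            corpus.push_back(candidate);
        }
    }
    return corpus;
}
```

Because each kept input becomes a starting point for later mutations, the loop can find the four-byte magic value one byte at a time instead of having to guess all four bytes at once. That incremental progress is what makes coverage guidance more effective than purely random input generation.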
Prerequisites to understanding fuzzing with libFuzzer
Familiarity with C++ programming.
Installing libFuzzer
To get started with libFuzzer on Kali Linux, open a terminal and run the commands below to install LLVM and Clang. If you want to understand how to set up a virtual machine with Kali Linux, please refer here and here.
sudo apt-get update
sudo apt-get install llvm clang
Fuzzing dlib with libFuzzer
Once libFuzzer is installed, we can begin fuzzing the dlib project's imglab tool.
Clone the dlib Repository
Clone the dlib repository to your home directory using the following command:
git clone https://github.com/davisking/dlib.git
Build Imglab Tool
Navigate to the dlib/tools/imglab directory and build the imglab tool using the following commands.
cd ~/dlib/tools/imglab
mkdir -p build
cd build
cmake .. -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=OFF
cmake --build .
Create a Fuzzing Target
Create a new C++ file, fuzzer.cpp, in ~/dlib/tools/imglab and implement the LLVMFuzzerTestOneInput function. This is the entry point that libFuzzer calls once for every input it generates. In this case, the function needs to invoke the imglab tool.
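#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <string>
#include <unistd.h>

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
    // Write the fuzzer-generated bytes to a named temporary file so that
    // imglab can open it by path (a std::tmpfile() stream has no usable name).
    char path[] = "/tmp/imglab_fuzz_XXXXXX";
    int fd = mkstemp(path);
    if (fd == -1) {
        return 0;
    }
    FILE* temp_input = fdopen(fd, "wb");
    if (temp_input == nullptr) {
        close(fd);
        unlink(path);
        return 0;
    }
    const bool written = std::fwrite(Data, 1, Size, temp_input) == Size;
    std::fclose(temp_input);  // flushes the data and closes the descriptor

    if (written) {
        // Run imglab on the fuzzed file rather than on a fixed dataset.
        // imglab was built in the build/ subdirectory in the previous step.
        const std::string full_command = std::string("./build/imglab --stats ") + path;
        std::system(full_command.c_str());
    }
    unlink(path);
    // libFuzzer expects the target to return 0; other values are reserved.
    return 0;
}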
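One limitation of calling imglab through std::system is that imglab runs in a separate process. A crash there does not terminate the fuzzer, so libFuzzer will not record it on its own, and coverage feedback comes only from code compiled into the fuzzer binary. A possible workaround for the first point is sketched below: inspect the status returned by std::system and call std::abort() when the command appears to have died from a signal. The helper name child_crashed is hypothetical, and the "exit code above 128" check is a shell convention used here as a heuristic, so it can misfire on programs that deliberately exit with such codes.

```cpp
#include <cstdlib>
#include <sys/wait.h>

// Returns true when a status value from std::system() suggests that the
// command was terminated by a signal (for example SIGSEGV or SIGABRT).
bool child_crashed(int status) {
    if (status == -1) {
        return false;  // std::system() itself failed to run the command
    }
    if (WIFSIGNALED(status)) {
        return true;   // the shell process was killed by a signal
    }
    // Heuristic: shells report a signal-killed child as 128 + signal number.
    return WIFEXITED(status) && WEXITSTATUS(status) > 128;
}
```

Inside the fuzz target, this could be used as: if (child_crashed(std::system(full_command.c_str()))) std::abort(); so that libFuzzer treats the current input as a crash and saves it.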
Compile the Fuzzing Target
Compile fuzzer.cpp using the appropriate compilation flags. In a terminal, run the commands below. The -fsanitize=fuzzer flag tells the compiler to add libFuzzer's coverage instrumentation to the compiled code and to link in the libFuzzer runtime, which provides the main function.
cd ~/dlib/tools/imglab
clang++ -std=c++14 -O2 -g -fsanitize=fuzzer fuzzer.cpp -o fuzzer -lpthread
Creating an Initial Corpus
Here "corpus" refers to a collection of input data that is used to feed into a fuzzing tool. This input data can include various files, inputs, or commands that trigger different paths and behaviors within a target application or software. The goal of using a corpus in fuzzing is to thoroughly test the software by providing a wide range of inputs that could uncover vulnerabilities or crashes. A corpus can include both valid and invalid inputs, as well as edge cases and random data.
Create a directory where you'll store the initial corpus. Copy test files, valid inputs, or other data that exercise different aspects of the imglab tool into the corpus folder.
mkdir corpus
cp ~/dlib/examples/faces/testing.xml corpus
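Besides copying an existing dataset file, it can help to add a few very small seeds, both well-formed and malformed. The sketch below writes three such files. The function name write_seed_corpus, the file names, and the XML contents are illustrative assumptions; the element names are modelled on the layout of dlib's example dataset files and are not taken verbatim from the repository.

```cpp
#include <cerrno>
#include <cstddef>
#include <fstream>
#include <string>
#include <sys/stat.h>
#include <vector>

// Write a few small seed files into `dir` and return how many were written.
// Contents are illustrative and assumed to resemble imglab's XML layout.
int write_seed_corpus(const std::string& dir) {
    if (mkdir(dir.c_str(), 0755) != 0 && errno != EEXIST) {
        return 0;  // could not create the directory
    }
    const std::vector<std::string> seeds = {
        // A dataset with no images.
        "<dataset><images></images></dataset>",
        // One image with one bounding box.
        "<dataset><images><image file='a.jpg'>"
        "<box top='1' left='2' width='3' height='4'/>"
        "</image></images></dataset>",
        // Deliberately malformed: tags are never closed.
        "<dataset><images><image file='a.jpg'>",
    };
    int written = 0;
    for (std::size_t i = 0; i < seeds.size(); ++i) {
        std::ofstream out(dir + "/seed_" + std::to_string(i) + ".xml",
                          std::ios::binary);
        out << seeds[i];
        out.close();
        if (out) {
            ++written;  // count only files that were written without error
        }
    }
    return written;
}
```

Small seeds keep each fuzzing iteration fast, and mixing valid with invalid inputs gives the fuzzer starting points on both sides of the parser's error handling.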
Running the Fuzzer
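Execute the compiled fuzz target, passing the corpus directory as a positional argument so that libFuzzer starts from the seed inputs and saves new ones there. The -max_len parameter caps the size of the inputs that libFuzzer generates, and -artifact_prefix sets the path prefix under which crash reproducers are written.
./fuzzer -max_len=8192 -artifact_prefix=corpus/ corpus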
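Monitor and Analyze the libFuzzer Instance
While running, libFuzzer repeatedly generates inputs, feeds them to the fuzz target, and prints status lines showing coverage, corpus size, and executions per second. When a corpus directory is passed on the command line, inputs that reach new coverage are saved into it. If the fuzz target crashes, hangs, or runs out of memory, libFuzzer stops and writes the offending input to a file named crash-<hash>, timeout-<hash>, or oom-<hash> under the path given by -artifact_prefix (here, the corpus directory). These files act as reproducers: passing one to the fuzzer binary as an argument reruns that single input.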
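When a reproducer needs to be run outside libFuzzer, for example under a debugger in a build without the fuzzing runtime, a small replay helper is enough. The sketch below reads a file and passes its bytes to a fuzz target once. The names replay_file and FuzzTarget are hypothetical; in a real driver the second argument would be LLVMFuzzerTestOneInput.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Signature shared by libFuzzer targets such as LLVMFuzzerTestOneInput.
using FuzzTarget = int (*)(const std::uint8_t*, std::size_t);

// Read `path` as raw bytes and run `target` on them once.
// Returns false if the file cannot be opened.
bool replay_file(const std::string& path, FuzzTarget target) {
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        return false;
    }
    const std::vector<char> bytes((std::istreambuf_iterator<char>(in)),
                                  std::istreambuf_iterator<char>());
    target(reinterpret_cast<const std::uint8_t*>(bytes.data()), bytes.size());
    return true;
}
```

A main function could call replay_file(argv[1], LLVMFuzzerTestOneInput) to rerun a saved crash file deterministically.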
As software development increasingly prioritizes security and reliability, fuzzing will remain a critical tool in the arsenal of professional penetration testers. Applying libFuzzer to the dlib library shows how readily fuzz testing fits into modern software development.