Below are some ideas for projects in DSLAB, but we always welcome (and often prefer) students’ own ideas. Please contact George Candea for further details.

(Last updated in September 2017)

PhD Semester Projects

Below are some example projects, but we usually define PhD semester projects together with the student, to make sure they work on something that is exciting to them. So take the examples below as hints as to what your project could be like.

CTF Assistant

In computer security, Capture-the-Flag (CTF) competitions require participants to conduct and react to the types of attacks that occur in the real world. This requires reverse engineering, network sniffing, protocol analysis, sysadmin skills, programming, and sometimes cryptanalysis. Our S2E can be a very powerful assistant for such competitions. This project is about providing an API on top of S2E that is accessible in a REPL and that enables CTF competitors to quickly answer typical questions about programs and OS kernels: Can there be a buffer overflow? What happens in a use-after-free scenario? Etc.

Prerequisites: fluent C/C++, low-level systems programming, CTF experience

Secure Tor Browser

The Tor anonimity project enables people to communicate in a way that preserves their anonimity. One of the primary ways of using Tor is via the Tor browser, and (like any piece of software) it has numerous security vulnerabilites. This makes it easy for a determined attacker (e.g., foreign government, cyber mafia) to observe victims’ communications by merely subverting the browser. This project is about using our CPI technique to build a Tor browser that is completely immune to control-flow hijack attacks and make it available to all Tor users.

Prerequisites: fluent C/C++, systems programming

Lightweight Security Isolation Domains

Develop operating system primitives for creating “application enclaves”: ligthweight isolation domains for securely executing applications together with their data in a way that is transparent to end users. Potential to leverage new Intel and ARM hardware-enforced security extensions.

Prerequisites: fluent C/C++, OS kernel programming

High-Performance Certified Network Devices

A variety of projects related to the Vigor project.

Prerequisites: fluent C/C++, networking, systems programming

Embedded/IoT Software-Hardware Co-Verification

Although hardware devices (wireless routers, webcams, automobiles, etc.) typically work in tandem with their own device drivers and/or user-space software, the two components are often developed and tested in isolation. This allows for incompatibilities to creep in that lead to security vulnerabilities, crashes, and other misbehaviors. Even though automated testing and verification methods for both hardware and software have progressed significantly in recent years, integration testing is a mostly manual effort. The goal of this project is to extend S2E to symbolically execute hardware models described in a hardware description language such as SystemVerilog or SystemC. The resulting system will be able to systematically test a device driver in combination with a model of its hardware (hardware-software co-verification). Compared to the state of the art, such a system will have higher chances of correctly identifying unwanted behavior, while at the same time reducing the cost of integration testing.

Prerequisites: fluent C/C++, low-level systems design, hardware and logic design

Direct Analysis of LLVM Binaries in S2E

S2E analyzes x86 binaries by translating them to LLVM bitcode on-the-fly and then running the resulting LLVM bitcode in the KLEE symbolic execution engine. The translation process is heavyweight and generates LLVM bitcode that is much larger than what a native LLVM compiler would generate from source code (e.g., llvm-gcc); this slows down symbolic execution. This project is about enabling S2E to run applications compiled natively to LLVM bitcode. The challenge is to create the illusion of executing a native application for the guest OS, and of communicating with a library (that you would write) to the LLVM application. This project will make you fluent in LLVM and x86 assembly, and you will acquire deep knowledge of binary translators, system emulators, and symbolic execution engines.

Prerequisites: fluent C/C++, low-level systems programming, assembly and/or LLVM

Projects for MS/BS-level Semester Projects/Theses/Internships

The projects below can be adapted to the scope of a Bachelor/Master semester project or a Master Thesis project. If you would like to do your semester project as a team, we can discuss the scope and distribution of the work.

Translating x86 Binaries to LLVM

LLVM is an open source compiler infrastructure designed to perform efficient optimizations throughout the lifetime of the program (e.g., compile time, link time, run time). We use LLVM for a number of tasks such as debugging, reverse engineering, and symbolic execution. For all these tools, we currently use QEMU to translate x86 instructions to LLVM, on the fly, as they execute on the CPU. In this project, you will port this translator to an offline conversion tool that takes an x86 binary and produces LLVM bitcode. You will then work on optimizing the resulting LLVM bitcode, so that it can run efficiently in automated testing tools that work with LLVM, such as S2E and KLEE.

Prerequisites: fluent C/C++, low-level systems programming, (de)compilers, computer architecture

Multi-Path Fault Injection

Programs contain error recovery code to deal with unexpected situations. For example, Java programs have try/catch blocks to handle errors that might occur when connecting to a server, C++ programs check the return value of system calls to deal with out-of-memory errors, and so on. Testing such error recovery code is challenging because errors usually happen rarely, and many of them are hard to trigger in a test environment. To address this, we developed LFI , a tool suite for library-level fault injection. LFI intercepts library calls in binary programs and injects errors according to user-specified fault scenarios, which allowed it to find many bugs in programs such as MySQL, Bind, Git, Pidgin, etc. The problem with LFI is that it is still a single-path analysis tool, and so cannot efficiently find corner case situations in target code, and it does not generate a concrete set of inputs to easily reproduce some types of bugs. The aim of this project is to implement LFI as an S2E plugin in order to solve the challenges faced by the single-path LFI.

Prerequisites: fluent C/C++, low-level systems programming

Certified Least Prefix Match Table

IP routing, a classic network function is based on LPM table. In order to enable Vigor approach, we need a certified implementation of this datastructure.

In this project, you will develop a formal specification of an LPM table, implement it in C, and prove that it correctly implements the specification. You can use a proof-assistant (such as CoQ) and have your code transformed with VST; or use an automated verifier (such as VeriFast); or propose your own tool/method.

Prerequisites: familiarity with a C-compatible program verifier, such as VeriFast, Frama-C/{Jessie,WP}, VST