Chef is a platform for obtaining symbolic execution engines for interpreted languages. It works by reusing the interpreter itself as an executable language specification, thus reducing the effort of obtaining an engine to a matter of days. The resulting engines can be used like any other engine for finding bugs, generating high-coverage test suites, assisting in debugging, and more.
Obtain a correct engine in days
Chef requires only lightweight modifications to the interpreter implementation, which can be performed in a few days. In turn, the resulting interpreter binary becomes an executable specification of the language that can be directly plugged into Chef. Chef relies on the S2E x86 symbolic analysis platform to faithfully execute symbolically the interpreter binary and its entire environment.
Python and Lua engines included
We obtained with Chef symbolic execution engines for Python and for Lua. We have already used them to generate test suites for popular library packages and uncover new bugs. Moreover, such engines can serve as a reference implementation when implementing an efficient engine from scratch. Both engines are available together with Chef.
Class-uniform path analysis
Chef’s “secret sauce” is a novel technique called Class-Uniform Path Analysis (or CUPA). During symbolic execution, CUPA prioritizes those executions that avoid stalling the global exploration progress. This happens when the execution encounters a program region with disproportionately more branching points than the others (e.g., a complex interpreter instruction).
Prototyping Symbolic Execution Engines for Interpreted Languages Stefan Bucur, Johannes Kinder, and George Candea. Intl. Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), Salt Lake City, UT, March 2014
The core paper behind Chef. It introduces Class-Uniform Path Analysis (CUPA), the execution path selection technique that makes it practical to use the interpreter implementation during symbolic execution.
Making Automated Testing of Cloud Applications an Integral Component of PaaS Stefan Bucur, Johannes Kinder, and George Candea. Asia-Pacific Workshop on Systems (APSYS), Singapore, July 2013
This paper is an example of the applications enabled by Chef. It proposes a way of testing cloud applications written for PaaS environments such as Google App Engine, and relies on the availability of an in-vivo symbolic execution engine for web languages like Python and Ruby.
Getting the Code
Frequently Asked Questions
Why do we need symbolic execution engines for interpreted languages?
Symbolic execution automates testing tasks such as bug finding, generating high-coverage test suites, or debugging. Unfortunately, the existing symbolic execution engines target compiled and low-level code, while leaving out dynamic (interpreted) languages that are increasingly popular in web applications and system integration. We want to bring the benefits of symbolic execution to this class of programs, too.
Why is it hard to build a symbolic execution engine?
A symbolic execution engine is essentially a more sophisticated interpreter for a language, because it needs to track all the program operations symbolically, instead of computing their concrete results. When doing this for interpreted languages, the challenges are amplified by the complexity of the language semantics, the ambiguous specifications, language evolution, and so on. Case in point: There are no symbolic execution engines today that completely target Python or Ruby, Lua, etc.
For popular languages like Python, why not just invest resources to build an efficient symbolic execution engine from scratch? That would be amortized by the large number of target applications.
Absolutely. As prototypes, Chef-obtained symbolic execution engines are still useful in the process of building another engine from scratch. They can serve as a correct reference implementation during development, and can be used to preliminarily assess the benefits of a symbolic execution engine for that language (e.g., by determining what kind of bugs could be found).
Does Chef support very large language implementations, such as Java’s?
Large languages such as Java are by choice not a target for Chef at the moment.
What symbolic data types does a Chef-obtained engine support?
Symbolic integers and strings. Chef is bounded by the expressivity of the constraint solver logic in the underlying symbolic execution engine.
The interpreter “optimizations” actually slow down execution. How come?
Optimizations such as degenerate hashes and fast path avoidance do slow down a linear (per-path) execution, but the effect is not significant against the savings in symbolic execution.
How Can I Contribute?
There are several ways you can help the project:
- Use Chef to enable symbolic execution in other languages, such as Ruby or Bash. We’d love to hear how Chef performs for other languages.
- Discover new optimization opportunities. Hash neutralization, memory size concretization, and the other optimizations mentioned in the Chef paper help reduce the path explosion, but there is always room for improvement.
Chef builds on top of our S2E in-vivo symbolic analysis platform.
You can ask Chef-specific questions on the S2E developer mailing list.