Thorough testing of dependable software systems requires productive ways to employ fault injection. Yet fault injection is rarely used in software development. We believe this is because meaningful fault injection requires too much manual labor and, if not done carefully, can produce many false positives.
In response, we developed LFI, a tool suite that automatically identifies the errors exposed by shared libraries, finds potentially buggy error recovery code in program binaries, and produces corresponding injection scenarios. LFI injects the desired faults, in the form of error return codes and corresponding side effects, at the boundary between shared libraries and applications. Because LFI automatically profiles the fault behavior of libraries via static analysis of their binaries, it reduces the dependence on human labor and on correct documentation. LFI also allows developers to write precise custom triggers for controlled fault injection experiments.
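To make the idea of a triggered injection at a call boundary concrete, the following is a minimal sketch in Python. It is not LFI's implementation or interface: LFI operates on native binaries and shared libraries, whereas this sketch wraps an ordinary Python function. The names `CallCountTrigger`, `inject_fault`, `fake_read`, and `load_with_fallback` are hypothetical, and the error return code plus side effect is modeled here as a raised `OSError` carrying an `errno` value.

```python
import errno
import functools
import os


class CallCountTrigger:
    """Hypothetical trigger: fires on exactly the n-th intercepted call."""

    def __init__(self, n):
        self.n = n
        self.calls = 0

    def should_fire(self):
        self.calls += 1
        return self.calls == self.n


def inject_fault(trigger, error_code):
    """Wrap a function so that it fails with error_code when the trigger fires.

    This stands in for interception at the library/application boundary.
    """

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if trigger.should_fire():
                # Simulated fault: the real function is never called.
                raise OSError(error_code, os.strerror(error_code))
            return func(*args, **kwargs)

        return wrapper

    return decorator


def fake_read(path):
    """Stand-in for a library call; always succeeds."""
    return "data:" + path


def load_with_fallback(read_fn, path):
    """Application code under test: its recovery path returns a default."""
    try:
        return read_fn(path)
    except OSError as exc:
        if exc.errno == errno.EIO:
            return "<default>"
        raise


if __name__ == "__main__":
    # Inject an I/O error on the second call only.
    trigger = CallCountTrigger(2)
    faulty_read = inject_fault(trigger, errno.EIO)(fake_read)
    for name in ("a", "b", "c"):
        print(load_with_fallback(faulty_read, name))
```

Separating the trigger from the injected error mirrors the split described above between deciding when to inject and deciding what to inject; a different trigger class could fire based on arguments or program state without changing the wrapper.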
LFI does not require access to libraries' source code and works on Linux, Windows, and Solaris on x86 and SPARC platforms. In our use, LFI has automatically found many serious bugs in systems such as the BIND name server, the Git version control system, the MySQL database server, the PBFT replication system, and the Pidgin IM client. LFI also achieved a 35%-60% improvement in recovery-code coverage without requiring any new tests.