Automatic Failure-Path Inference: A Generic Introspection Technique for Internet Applications

George Candea, Mauricio Delgado, Michael Chen, Armando Fox

Proc. 3rd IEEE Workshop on Internet Applications (WIAPP), San Jose, CA, June 2003



Automatic Failure-Path Inference (AFPI) is an application-generic, automatic technique for dynamically discovering the failure dependency graphs of componentized Internet applications. AFPI's first phase is invasive and relies on controlled fault injection to determine how failures propagate; this phase requires no a priori knowledge of the application and takes on the order of hours to run. Once the system is deployed in production, the second, non-invasive phase of AFPI passively monitors the system and updates the dependency graph as new failures are observed. This process is a good match for the perpetually evolving software found in Internet systems; since no performance overhead is introduced, AFPI is feasible for live systems. We applied AFPI to J2EE and tested it by injecting Java exceptions into an e-commerce application and an online auction service. The resulting graphs of exception propagation are more detailed and accurate than what could be derived by time-consuming manual inspection or by analysis of readily available static application descriptions.
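
The two phases described above can be illustrated with a toy model. The following is a minimal sketch in Python, not the authors' J2EE implementation: the `ToyApp` class, the `InjectedFault` exception, the component names, and the `active_phase` and `passive_update` helpers are all hypothetical and chosen only for illustration. It assumes each component calls a fixed list of other components and that some callers catch (mask) a callee's failure.

```python
from collections import defaultdict


class InjectedFault(Exception):
    """Stands in for a Java exception injected into one component."""


class ToyApp:
    """Hypothetical componentized application (an assumption, not from the paper)."""

    def __init__(self, calls, masks=()):
        self.calls = calls        # component -> components it calls, in order
        self.masks = set(masks)   # (caller, callee): caller catches callee's failure

    def components(self):
        names = set(self.calls)
        for callees in self.calls.values():
            names.update(callees)
        return sorted(names)

    def invoke(self, name, faulty, edges):
        """Run `name`; record (failed, affected) pairs as the fault propagates."""
        if name == faulty:
            raise InjectedFault(name)
        for callee in self.calls.get(name, ()):
            try:
                self.invoke(callee, faulty, edges)
            except InjectedFault:
                if (name, callee) in self.masks:
                    continue      # failure is absorbed here and goes no further
                edges.add((callee, name))
                raise


def active_phase(app, entry_points):
    """Phase 1 (invasive): inject a fault into each component in turn and
    record which components the failure reaches."""
    graph = defaultdict(set)
    for target in app.components():
        for entry in entry_points:
            edges = set()
            try:
                app.invoke(entry, target, edges)
            except InjectedFault:
                pass              # the request failed; the edges are already recorded
            for failed, affected in edges:
                graph[failed].add(affected)
    return graph


def passive_update(graph, observed_path):
    """Phase 2 (non-invasive): fold a failure path seen in production into the graph."""
    for failed, affected in zip(observed_path, observed_path[1:]):
        graph[failed].add(affected)


# Hypothetical application: "web" calls "catalog" and "cart", both of which call "db".
# "web" catches failures of "catalog", so they never reach the user.
app = ToyApp(
    calls={"web": ["catalog", "cart"], "catalog": ["db"], "cart": ["db"]},
    masks={("web", "catalog")},
)
graph = active_phase(app, entry_points=["web"])

# Later, a production failure shows "catalog" taking down "web" after all.
passive_update(graph, ["catalog", "web"])
```

In this toy model the active phase finds no edge from `catalog` to `web`, even though the call structure alone would suggest one, because `web` masks that failure. This mirrors the abstract's point that observed propagation can be more accurate than static descriptions. The passive update then adds the edge once such a failure is actually observed.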