Systems Seminar

EPFL IC Systems Seminar

A Sample of Software Engineering Research Opportunities at YouTube


The implementation and evolution of a system with YouTube’s scale and complexity generate many interesting technical challenges. In this talk, I will survey a number of topics that I have encountered in my three years working as a software engineer in the YouTube data processing organization. I will focus on highlighting research opportunities rather than communicating research results.

For example, Google uses a single integrated code repository with over 2 billion lines of code and a distributed build system, encouraging developers to re-use code from anywhere in Google. In such an environment, how do we need to change traditional software build tools, such as compilers and linkers, to keep the edit-build-test cycle short?

As another example, at YouTube all changes are monitored to prevent unexpected regressions in the user experience. For this purpose, we construct a data structure that describes the user journey, and we compute a large number of metrics by traversing this data structure. If we code these metrics in a general-purpose language, we have plenty of expressiveness, but we sacrifice analyzability and debuggability. For example, we want to make it easy to explain the value of a metric computation for one journey, and even to explain what causes the aggregate change in a metric over a group of journeys. Many of these metrics involve similar computations, and we want to be able to optimize their execution by re-using sub-computations. Could these benefits, and others, be achieved with a domain-specific language for metrics?


George Necula is an engineer at Google, formerly a professor at U.C. Berkeley. His research interests are in programming languages and software engineering, with a particular focus on software verification and formal methods. His work on proof-carrying code earned the SIGPLAN Most Influential POPL Paper Award. He has developed open-source analysis, verification, and transformation tools for C, including the C Intermediate Language (CIL), CCured, and Deputy. He developed the Extensible Shape Analysis (XISA) and Random Interpretation techniques. George is a Fellow of the Alfred P. Sloan Foundation and a recipient of the Grace Murray Hopper Award and the ACM SIGOPS Hall of Fame Award.