EPFL IC Systems Seminar

The Infrastructure to Build Large Language Models



Abstract

Training Large Language Models (LLMs) on traditional distributed clusters often wastes compute and demands significant software engineering effort due to inherent communication bottlenecks. Cerebras' hardware is designed to tackle these hurdles. Its Wafer Scale Cluster architecture delivers near-linear scaling, drastically reducing computational waste and simplifying the complex engineering typical of distributed training. Its Weight Streaming architecture further enables full-attention training with sequence lengths of up to 50k tokens for multi-billion-parameter models. This session will explore these advancements, highlight their practical application through case studies, and conclude with an interactive Q&A.

Bio

Dr. Vinay Pondenkandath holds an M.Sc. in Computer Science from TU Kaiserslautern, Germany, and a Ph.D. in Computer Science from the University of Fribourg. His research focuses on applying deep learning methods to computer vision and document image analysis, with an emphasis on creating reproducible research workflows. Vinay currently serves as a Machine Learning Solutions Engineer at Cerebras Systems, where he collaborates with academic and industrial partners on research into and practical applications of large language models.