EPFL IC Systems Seminar

Deterministic Low Latency Supercomputing at Scale



Abstract

In this talk I will describe Groq dataflow computing and deterministic program execution, and how they lead to matrix computations and neural networks at scale. The Groq architecture is a direct consequence of the end of Moore’s law, combined with the end of multicore performance scaling. To address the compute challenges of the future, the hardware is structured to maximize the number of arithmetic units and the amount of SRAM per chip, and to allow sufficient data-shuffling bandwidth between them. Groq’s game-changing performance leads to new opportunities in the rapidly converging worlds of HPC and AI. I will show applications ranging from basic matrix computations to simulating quantum computers and executing neural networks. As with any new technology, emerging conflicts create a cycle that needs to be addressed faster than it arises. Groq software (the compiler and programming tools) resolves the core conflicts of new computing technology: reduced precision, massive parallelism, and the bandwidth bottleneck in getting data on and off the chip. Ultimately, Groq technology makes raw performance easy to use, desirable, and affordable, while enabling applications at a scale that would otherwise not be achievable.

Bio

Oskar Mencer received a B.Sc. from the Technion in Israel and a PhD in Computer Engineering from Stanford University, where he developed the basis for reduced-precision dataflow computing. In 2003, while at the Computing Sciences Center at Bell Labs, Oskar spun out his research and founded Maxeler Technologies, which delivered production dataflow machines to JP Morgan, Chevron, ENI, Citibank, UK Daresbury Labs, and the German supercomputing center in Jülich. Oskar is a member of Academia Europaea. In 2022, Maxeler was acquired by Groq and continues to build maximum-performance solutions for HPC and AI.