ICCD 2023 - Azalia Mirhoseini



Important Dates:

Abstract Submission: June 25, 2023, 11:59pm AOE (extended from June 23; firm deadline)

Paper Submission: June 27, 2023, 11:59pm AOE (extended from June 23; firm deadline)


The 41st IEEE International Conference on Computer Design 

November 6 - 8, 2023

 
 

Azalia Mirhoseini

Stanford University

Title: Pushing the Limits of Scaling Laws in the Age of Large Language Models

Abstract

The recent success of large language models has been characterized by scaling laws: power-law relationships between model performance and training dataset size, parameter count, and training compute. In this talk, we will discuss ways to push these scaling laws even further by innovating across data, models, software, and hardware. This includes reinforcement learning from human and AI feedback to improve learning efficiency, sparse and dynamic mixture-of-experts neural architectures for better performance, an automated framework for co-designing custom AI accelerators, and a deep RL method for chip floorplanning used in multiple generations of Google's AI accelerator chips (TPUs). Through these cutting-edge examples, we will outline a full-stack approach that leverages AI to overcome the next set of scaling challenges.
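The power-law relationship mentioned in the abstract is often written in the Chinchilla-style form L(N, D) = E + A/N^α + B/D^β, where N is parameter count and D is token count. A minimal sketch of that functional form follows; the coefficient values are the published Chinchilla fits and are included purely for illustration, not as results from the talk:

```python
def scaling_loss(n_params: float, n_tokens: float,
                 E: float = 1.69, A: float = 406.4, B: float = 410.7,
                 alpha: float = 0.34, beta: float = 0.28) -> float:
    """Predicted training loss under a Chinchilla-style scaling law.

    L(N, D) = E + A / N**alpha + B / D**beta
    E is the irreducible loss; the two power-law terms shrink as the
    model (N parameters) and dataset (D tokens) are scaled up.
    """
    return E + A / n_params ** alpha + B / n_tokens ** beta


# Loss improves with scale, but with power-law diminishing returns:
small = scaling_loss(1e9, 2e10)     # ~1B params trained on ~20B tokens
large = scaling_loss(7e10, 1.4e12)  # ~70B params trained on ~1.4T tokens
assert large < small
```

The diminishing-returns shape of these curves is what motivates the talk's theme: once further gains from raw scale become expensive, improvements must come from data quality, architecture (e.g. mixture-of-experts), and hardware/software co-design.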


Bio

Azalia Mirhoseini is an Assistant Professor in the Computer Science Department at Stanford University and a senior staff research scientist at Google DeepMind. Her research focuses on developing capable, reliable, and efficient AI systems for solving high-impact, real-world problems. Her work includes generalized learning-based methods for decision-making problems in systems and chip design, self-improving AI models that learn through interactions with the world, and scalable deep learning optimization. Prior to Stanford, she spent several years in industry AI labs, including Anthropic and Google Brain. At Anthropic, she worked on advancing the capabilities and reliability of large language models. At Google Brain, she co-founded the ML for Systems team, which focuses on automating and optimizing computer systems and chip design. She received her BSc in Electrical Engineering from Sharif University of Technology and her PhD in Electrical and Computer Engineering from Rice University. Her work has been recognized with the MIT Technology Review 35 Under 35 Award, the Best ECE Thesis Award at Rice University, publications in flagship venues such as Nature, and coverage by media outlets including MIT Technology Review, IEEE Spectrum, The Verge, The Times, ZDNet, VentureBeat, and WIRED.