Understanding Innovations leading up to DeepSeek R1

Hosted by Amir Feizpour and Suhas Pai

What you'll learn

What data and training innovations underlie R1.

What architecture choices enable R1's performance.

What these innovations mean for the community.

Why this topic matters

We will look into some of the technical choices behind the impressive performance of R1's base model, including Multi-head Latent Attention, load balancing for MoE models, the fill-in-the-middle learning objective, FP8 training, and multi-token prediction.
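As a taste of one of these techniques, here is a minimal sketch of the fill-in-the-middle (FIM) transformation: a training document is reordered as prefix / suffix / middle so that a left-to-right model learns to infill a span conditioned on both sides. The sentinel token names below are illustrative placeholders, not the model's actual vocabulary.

```python
def to_fim(text: str, prefix_end: int, middle_end: int) -> str:
    """Reorder a document into prefix-suffix-middle (PSM) format.

    A causal language model trained on this ordering learns to
    generate the middle span given both the prefix and the suffix.
    The sentinel tokens here are illustrative, not a real vocabulary.
    """
    prefix = text[:prefix_end]
    middle = text[prefix_end:middle_end]
    suffix = text[middle_end:]
    return f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>{middle}"


# Example: hide the body of a function and ask the model to infill it.
sample = to_fim("def add(a, b):\n    return a + b\n", 15, 31)
```

At inference time, a completion request is framed the same way: the model sees the prefix and suffix after their sentinels and generates the middle after `<fim_middle>`.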

You'll learn from

Amir Feizpour

Founder @ Aggregate Intellect

Amir Feizpour is the founder, CEO, and Chief Scientist at Aggregate Intellect, where he is building a generative business brain for service- and science-based companies. He has built and grown a global community of 5,000+ AI practitioners and researchers gathered around topics in AI research, engineering, product development, and responsibility. Prior to this, Amir was an NLP Product Lead at the Royal Bank of Canada, and before that he held a research position at the University of Oxford, where his experiments on quantum computing resulted in high-profile publications and patents. He holds a PhD in Physics from the University of Toronto. Amir also serves the AI ecosystem as an advisor at MaRS Discovery District, works with several startups as a fractional chief AI officer, and engages with a wide range of community audiences, from business executives to hands-on developers, through training and educational programs. He leads Aggregate Intellect's R&D via several academic collaborations.

Suhas Pai

CTO @ Hudson Labs

Suhas is the CTO and co-founder of Hudson Labs, an NLP startup operating in the financial domain, where he conducts research on LLMs, domain adaptation, text ranking, and more. He was co-chair of the Privacy working group at BigScience, chair of the TMLS 2022 and TMLS NLP 2022 conferences, and is currently writing a book on large language models.

© 2025 Maven Learning, Inc.