Latency First: How to Actually Make RAG & Agents Fast
Hosted by Jason Liu and Aarush Sah
Go deeper with a course
475 students
What you'll learn
Measure RAG & Agent Performance Effectively
Students will learn to distinguish between time to first token (TTFT), tokens per second (TPS), and step latency when benchmarking AI systems.
Identify Latency Bottlenecks in AI Pipelines
Students will learn to diagnose where RAG workflows and multi-step agents slow down.
Apply Practical Optimization Techniques
Students will master stack-agnostic strategies to reduce response times while maintaining high-quality AI outputs.
Why this topic matters
Latency is the silent killer of AI adoption. Users abandon systems that make them wait, regardless of accuracy. By mastering performance optimization, you'll deliver solutions people actually use, overcome the primary barrier to production AI success, and develop a professional edge that distinguishes you in a market fixated on capability rather than usability.
You'll learn from
Jason Liu
Consultant at the intersection of Information Retrieval and AI
Jason has built search and recommendation systems for the past 6 years. In the last year he has consulted for and advised dozens of startups on improving their RAG systems. He is the creator of the Instructor Python library.
Aarush Sah
Head of Evals, Groq