
Evaluate & Improve AI Agents with Confidence

Hosted by Mahesh Yadav

372 students


Go deeper with a course

Master Agentic AI PM with MAANG product leader
Mahesh Yadav

What you'll learn

Assess LLM Suitability for Your Agentic AI

Benchmark your AI’s performance, adaptability, and decision-making quality.

Design a Manual Evaluation Framework for AI Agents

Implement a structured review process for agentic AI performance.

Automate AI Evaluation with Observability & LLM Judges

Use LLMs as autonomous “judges” to scale AI performance assessments.
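For a flavor of what the LLM-as-judge pattern looks like in practice, here is a minimal sketch in Python. It is not the course's material: the OpenAI Python client, the "gpt-4o-mini" model name, and the two-line rubric are illustrative assumptions standing in for whatever the session actually uses.

```python
# Minimal LLM-as-judge sketch. Assumptions (not from this page): the
# OpenAI client, model name, and rubric below are illustrative only.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

RUBRIC = """Score the agent's answer from 1 (poor) to 5 (excellent) on:
- Correctness: is the answer factually right for the task?
- Helpfulness: does it actually resolve the user's request?
Reply with a single integer, nothing else."""

def judge(task: str, answer: str) -> int:
    """Ask a separate LLM to grade an agent's answer against the rubric."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        temperature=0,  # make grading as repeatable as possible
        messages=[
            {"role": "system", "content": RUBRIC},
            {"role": "user", "content": f"Task: {task}\nAgent answer: {answer}"},
        ],
    )
    return int(response.choices[0].message.content.strip())

# Score one agent transcript; in practice you would loop over an eval set.
print(judge("Summarize our refund policy.",
            "Refunds are accepted within 30 days of purchase."))
```

In a real pipeline you would run a judge like this over a labeled evaluation set, log the scores to your observability stack, and spot-check a sample of judgments by hand, which is the combination of manual and automated evaluation this session covers.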

Why this topic matters

AI agents are only as good as their decision-making, and without proper evaluation they often fail in real-world applications. Large language models can behave unpredictably, so a structured evaluation framework is essential for ensuring reliability, adaptability, and performance. This session shows you how to build one.

You'll learn from

Mahesh Yadav

Ex-GenAI Product Lead at MAANG-Level Firms | AI PM Coach

Mahesh has 20 years of experience building products on AI teams at Meta, Microsoft, and AWS. He has worked across every layer of the AI stack, from AI chips to LLMs, and has a deep understanding of how companies use AI agents to ship value to customers. His work on AI has been featured at the Nvidia GTC conference, at Microsoft Build, and on Meta's blogs.

His mentorship has helped many students build real-world products and careers in the agentic AI PM space.

Previously at

Meta
Amazon Web Services
Microsoft

Watch the recording for free
