AI Evals for PMs Certification

Marily Nika, Ph.D AI/ML

GenAI Product Builder @ Google, ex-Meta

George Zoto

Senior Data, ML and GenAI Scientist

+ Diego Granados and Mark Cramer

Eliminate uncertainty in shipping AI features

“Does it work?”... “Is it good enough?”... “Can we ship it?”...

How do you answer these questions for AI products? You’re responsible for “running evals,” but what does that actually mean?

How do you choose the right metrics, interpret fuzzy results, and make a confident decision?

This course gives you a framework to do just that.

  • Map user value to evaluation (eval) objectives so your metrics aren’t abstract. Define success, then translate it into measurable criteria.

  • Choose metrics you can actually maintain: capability, safety, UX friction, latency, cost, and “does this reduce support tickets or increase activation?”

  • Set ship/no-ship thresholds you can defend to leadership (see the sketch after this list).

  • Build lightweight workflows that work in real teams: human review where it matters, automation where it can be sustained, and documentation that drives decisions.

  • Consider domain constraints (e.g., healthcare safety) and know what to avoid: silent failures, misleading proxy metrics, and tests that don’t reflect production.

  • Tie everything to ROI: impact vs unit cost, eval coverage vs reliability, and the minimum viable monitoring you need post-launch.
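
To give a taste of the threshold-setting bullet above, here is a minimal ship/no-ship gate sketched in Python. The metric names and cutoffs are illustrative placeholders we chose for the sketch, not the course’s framework:

```python
# Illustrative ship/no-ship gate. Metric names and thresholds are
# placeholders, not the course's framework.
THRESHOLDS = {
    "task_success_rate": 0.85,     # capability: min fraction of eval cases passed
    "unsafe_output_rate": 0.01,    # safety: max tolerated rate
    "p95_latency_s": 3.0,          # UX: max 95th-percentile response time
    "cost_per_request_usd": 0.02,  # unit economics: max cost per request
}

# Whether a higher value is better (True) or worse (False) per metric.
HIGHER_IS_BETTER = {
    "task_success_rate": True,
    "unsafe_output_rate": False,
    "p95_latency_s": False,
    "cost_per_request_usd": False,
}

def ship_decision(results: dict) -> tuple[bool, list[str]]:
    """Return (ship?, list of failed criteria) for one eval run."""
    failures = []
    for metric, threshold in THRESHOLDS.items():
        value = results[metric]
        ok = value >= threshold if HIGHER_IS_BETTER[metric] else value <= threshold
        if not ok:
            failures.append(f"{metric}={value} misses threshold {threshold}")
    return (not failures, failures)

ok, reasons = ship_decision({
    "task_success_rate": 0.88,
    "unsafe_output_rate": 0.004,
    "p95_latency_s": 2.1,
    "cost_per_request_usd": 0.015,
})
print("SHIP" if ok else f"HOLD: {reasons}")
```

The point of writing each threshold down before the eval runs is that the launch call becomes mechanical and defensible: the gate either passes or it doesn’t.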

Experience AI evals through a case-based approach with a real AI product that we evaluate together.

What you’ll learn

Develop a critical skill for product managers who lead or contribute to AI products.

  • Learn a repeatable framework for deciding when an AI feature is ready to launch.

  • Tie decisions to user value, business goals, and measurable evaluation criteria.

  • Turn fuzzy product goals into concrete eval objectives and measurable success criteria.

  • Define “good enough” in plain language before choosing metrics or tools.

  • Use a PM-friendly menu of metrics to avoid misleading proxies and anchor on business value.

  • Balance capability, latency, UX friction, and cost without being an ML engineer.

  • Create ship/no-ship thresholds tied to KPIs, risk, and user impact.

  • Know when to stop tweaking prompts and when to pause a launch.

  • Learn what to automate, what to review manually, and how to design sustainable processes.

  • Produce datasets, golden examples, and error taxonomies your team can reuse (sketched in code after this list).

  • Understand risks in sensitive domains like healthcare and finance.

  • Avoid silent failures, weak proxies, and tests that don’t reflect production.
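
To make the reusable-artifacts bullet concrete, here is a small sketch of what a golden dataset and an error taxonomy can look like in code. All field names and error categories are assumptions made for the sketch, not course material:

```python
# Illustrative shapes for reusable eval artifacts: a golden dataset
# plus an error taxonomy. All names here are assumptions.
from collections import Counter
from dataclasses import dataclass, field
from enum import Enum

class ErrorType(Enum):
    HALLUCINATION = "hallucination"        # fabricated facts
    UNNECESSARY_REFUSAL = "refusal"        # declined a valid request
    FORMAT_VIOLATION = "format_violation"  # broke the output contract
    OFF_TOPIC = "off_topic"

@dataclass
class GoldenExample:
    prompt: str
    expected: str  # reference answer a reviewer signed off on
    tags: list[str] = field(default_factory=list)

@dataclass
class EvalResult:
    example: GoldenExample
    output: str
    passed: bool
    errors: list[ErrorType] = field(default_factory=list)

golden_set = [
    GoldenExample(
        prompt="Summarize this refund policy in two sentences.",
        expected="Refunds are available within 30 days of purchase...",
        tags=["summarization", "policy"],
    ),
]

# After a run, the taxonomy lets you report failures by category
# instead of a single opaque pass rate.
results = [
    EvalResult(golden_set[0], output="(model output)", passed=False,
               errors=[ErrorType.HALLUCINATION]),
]
print(Counter(err for r in results for err in r.errors))
```

A taxonomy like this is what turns “the model fails sometimes” into “12% of failures are format violations, which we can fix with a stricter output contract.”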

Learn directly from expert instructors

Marily Nika, Ph.D AI/ML

GenAI Product Builder @ Google, ex-Meta

George Zoto

Senior NLP/AI Scientist

SHI International, National Association of REALTORS

Diego Granados

Product Manager AI&ML @ Google

Mark Cramer

Sr. ML Product Manager @ Meta

Meta, Stanford University, MIT

Who this course is for

  • PMs leading AI features, growth, or platform initiatives

  • PMs who partner with ML teams and want to set evaluation standards

  • PMs who need to make clear “ship or hold” calls without doing the engineering

What's included

Live sessions

Learn directly from your instructors in a real-time, interactive format.

1:1 time with instructors

Book time with one of the instructors to ask questions, gain clarity, and level up your learning.

Lifetime access

Go back to course content and recordings whenever you need to.

Community of peers

Stay accountable and share insights with like-minded professionals.

Certificate of completion

Share your new skills with your employer or on LinkedIn.

Maven Guarantee

This course is backed by the Maven Guarantee. Students are eligible for a full refund up until the halfway point of the course.

Course syllabus

5 live sessions • 23 lessons

Week 1

Dec 4—Dec 7

    Dec 4

    AI Evals for PMs - AI Evals and You

    Thu 12/4, 5:00 PM—6:30 PM (UTC)

    Intro Workshop: AI Evals and You

    1 item

    Survey

    1 item

    The role of Evaluations in AI Product Development

    4 items

    The AI Model Evals Playbook

    1 item

    Instructions: Get Opik API Key (see the setup sketch after the syllabus)

    1 item

Week 2

Dec 8—Dec 14

    Dec 9

    AI Evals for PMs - Foundations, Scoping and Datasets

    Tue 12/9, 5:00 PM—6:30 PM (UTC)

    Session 1: Foundations, Scoping and Datasets

    3 items

    Exercise 1: Foundations, Scoping and Datasets

    1 item

    Dec 11

    AI Evals for PMs - Metrics, Thresholds and Ship-Readiness

    Thu 12/11, 5:00 PM—6:30 PM (UTC)

    Session 2: Metrics, Thresholds and Ship-Readiness

    3 items

    Exercise 2: Metrics, Thresholds and Ship-Readiness

    1 item

    Metrics Mastery

    2 items
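
Week 1’s setup item asks you to get an Opik API key. As a rough preview, a minimal sketch of configuring the Opik Python SDK and tracing a call might look like the following. opik.configure() and the opik.track decorator are part of the public SDK, but treat the exact parameters and the stubbed model call as assumptions to verify against the course’s own instructions:

```python
# Rough preview of Opik setup (pip install opik). Verify the current
# API against the course instructions before relying on this sketch.
import os
import opik

# The SDK can also read OPIK_API_KEY from the environment; depending
# on your account you may need to pass workspace=... as well.
opik.configure(api_key=os.environ["OPIK_API_KEY"])

@opik.track  # logs inputs and outputs as a trace in the Opik dashboard
def answer(question: str) -> str:
    # Stub for the sketch: replace with your real model call.
    return "stubbed answer to: " + question

answer("Does this feature reduce support tickets?")
```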

Schedule

Live sessions

3 hrs / week

    • Thu, Dec 4

      5:00 PM—6:30 PM (UTC)

    • Tue, Dec 9

      5:00 PM—6:30 PM (UTC)

    • Thu, Dec 11

      5:00 PM—6:30 PM (UTC)

Projects

2 hrs / week

Async content

1 hr / week

Testimonials

  • Marily does an excellent job of striking a balance between providing easy-to-understand explanations and diving into technical depth. The hands-on projects allow students to apply what they learn and deepen their understanding.

    Dave

    Group Product Manager
  • Diego is a rare combination in a PM: deep technical understanding and a passion for ensuring customers have an incredible experience with the projects he’s involved in.

    Zach Cook

    Principal Product Manager

$2,000 USD

Feb 10—Feb 25
Enroll