A Field Guide to Rapidly Improving AI Products

A Field Guide to Rapidly Improving AI Products – Hamel’s Blog


I’ve noticed that many GenAI application projects put in automated evaluations (evals) of the system’s output probably later — and rely on humans to judge outputs longer — than they should. This is because building evals is viewed as a massive investment (say, creating 100 or 1,000 examples, and designing and validating metrics) and there’s never a...

Andrew Ng • x.com

What exactly are evals?

Evals are how you measure the quality and effectiveness of your AI system. They act like regression tests or benchmarks, clearly defining what “good” actually looks like for your AI product beyond the kind of simple latency or pass/fail checks you’d usually use for software.

Evaluating AI systems is less like traditional soft...

Aman Khan • Beyond Vibe Checks: A PM’s Complete Guide to Evals
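To make the regression-test analogy concrete, here is a minimal sketch of what a very small eval might look like. Everything in it is an assumption for illustration: `fake_model` stands in for a real model call, the three cases are invented, and the substring check is only one simple way to define "good". It is not taken from either quoted source.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    prompt: str
    must_contain: str  # hypothetical pass criterion: a phrase the answer should include


def fake_model(prompt: str) -> str:
    """Stand-in for a real model call (assumption for illustration only)."""
    canned = {
        "What is the refund window?": "Refunds are accepted within 30 days.",
        "Do you ship overseas?": "Yes, we ship to most countries.",
    }
    return canned.get(prompt, "I'm not sure.")


def run_evals(model: Callable[[str], str], cases: list[EvalCase]) -> float:
    """Return the fraction of cases whose output contains the expected phrase."""
    passed = sum(
        case.must_contain.lower() in model(case.prompt).lower() for case in cases
    )
    return passed / len(cases)


# Invented examples; a real set would come from actual user queries.
CASES = [
    EvalCase("What is the refund window?", "30 days"),
    EvalCase("Do you ship overseas?", "ship"),
    EvalCase("Can I pay by invoice?", "invoice"),
]

score = run_evals(fake_model, CASES)
print(f"pass rate: {score:.2f}")
```

Re-running a script like this after each prompt or model change gives a rough regression signal. A handful of cases with a crude check is a starting point that can grow into a larger example set and better metrics over time.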

Building in generative AI is like running on a treadmill while traditional tech moves at walking speed. This speed impacts everything from the technical problems you tackle to your timeline for reaching scale. While this acceleration should change your strategy, it doesn’t change the fundamentals of building a good product. You need to build someth...

Aman Khan • Beyond Vibe Checks: A PM’s Complete Guide to Evals

How to Build a Truly Useful AI Product