Yaqin Hei

Progressive Rollout

1 post

A Pretty Accuracy Number Hid Dozens of Money-Moving Errors — How to Read the Eval to Ship

On a money-moving project I ran, the overall accuracy looked great. But pull the money-moving intents out on their own and the wrong-action rate was alarming: dozens of money-touching errors had been sitting there the whole time, hidden by one blended number. In 5 minutes you'll see why a single accuracy figure is not enough to request launch; in 10 you'll put a separate wrong-action gate on money-moving errors; in 20 you'll have a launch-decision flow: CI lower bound + per-scenario version cut + per-channel ramp.
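To make the idea concrete, here is a minimal sketch of a two-part gate: a confidence-interval lower bound on blended accuracy, plus a separate upper bound on the wrong-action rate for money-moving intents only. Everything in it is an assumption for illustration, not the post's actual method: the Wilson score interval as the CI, the record layout `(intent, correct, wrong_action)`, the function names, the thresholds, and the synthetic demo data.

```python
import math

def wilson_interval(hits: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if n == 0:
        return (0.0, 1.0)  # no data: the interval is maximally uncertain
    p = hits / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return (max(0.0, center - half), min(1.0, center + half))

def launch_gate(records, money_intents, min_accuracy_lb, max_wrong_action_ub):
    """Check blended accuracy and the money-moving wrong-action rate separately.

    records: iterable of (intent, correct, wrong_action) tuples (hypothetical layout).
    """
    records = list(records)
    n_all = len(records)
    n_correct = sum(1 for _, correct, _ in records if correct)
    acc_lb, _ = wilson_interval(n_correct, n_all)

    # Only the money-moving slice feeds the wrong-action gate.
    money = [r for r in records if r[0] in money_intents]
    n_wrong = sum(1 for _, _, wrong_action in money if wrong_action)
    # With no money-moving samples the upper bound is 1.0, so the gate fails closed.
    _, wrong_ub = wilson_interval(n_wrong, len(money))

    accuracy_ok = acc_lb >= min_accuracy_lb
    wrong_action_ok = wrong_ub <= max_wrong_action_ub
    return {
        "accuracy_lb": acc_lb,
        "wrong_action_ub": wrong_ub,
        "accuracy_ok": accuracy_ok,
        "wrong_action_ok": wrong_action_ok,
        "ship": accuracy_ok and wrong_action_ok,
    }

# Synthetic data, invented for this sketch: 950 correct non-money cases,
# plus 50 money-moving cases of which 20 took a wrong action.
demo = [("faq", True, False)] * 950
demo += [("transfer", True, False)] * 30
demo += [("transfer", False, True)] * 20

result = launch_gate(
    demo,
    money_intents={"transfer"},
    min_accuracy_lb=0.95,      # illustrative threshold
    max_wrong_action_ub=0.05,  # illustrative threshold
)
```

On this synthetic set the blended accuracy clears its threshold while the money-moving slice does not, so `result["ship"]` is `False`. The gate uses the upper bound for the wrong-action rate because a small money-moving sample should count against launch, not for it.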

Jul 5, 2026·13 min read

WeChat Official Account: 京墨AI研习社 @HeiLabAI · WeChat Channels: Yaqin.AI

X @yaqinhei · GitHub @AmyHei · amyheiny@gmail.com

© 2026 Yaqin Hei · About