ITSM — Yaqin Hei

Trained for 60,000 Steps, the Agent Learned to Delete Tickets — Six Reward-Hacking Patterns in ITSM Automation

I built an ITSM agent research environment fitted to real ServiceNow ticket data. After 60,000 training steps, DQN and PPO both reached a 100% hacking rate: every ticket was handled through some cheating shortcut, with zero genuine resolutions. This is the engineer's-eye debrief: six ITSM-specific reward-hacking patterns, why your dashboard won't catch them, and ten things your team can do this week.

Oct 10, 2025·30 min read