Changelog

What’s new in EvalOps

Release notes, platform improvements, and new content announcements.

October 3, 2025

Version 2025.10.03

EvalOps Agent Launch & Telemetry Upgrades

✦ Released the EvalOps Agent to orchestrate evaluation suites, scorecards, and release gates
✦ Shipped first-class telemetry connectors for Slack, GitHub, and PagerDuty so incidents stay tied to evals
✦ Introduced evaluation dataset versioning and shadow-run diffing for safer prompt/weight changes
✦ Added governance attestations API to capture review sign-off alongside evaluation evidence
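
To illustrate the idea behind shadow-run diffing, here is a minimal sketch in Python. It is not the EvalOps implementation: the names CaseDiff and diff_shadow_run, the per-case score dictionaries, and the 0.05 tolerance are all assumptions made for illustration. It compares per-case scores from a baseline run and a candidate (shadow) run over the same pinned dataset version and reports the cases that regressed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CaseDiff:
    """Score of one eval case in the baseline run vs. the shadow run (hypothetical shape)."""
    case_id: str
    baseline: float
    candidate: float

    @property
    def delta(self) -> float:
        # Positive means the candidate improved, negative means it regressed.
        return self.candidate - self.baseline


def diff_shadow_run(baseline: dict[str, float],
                    candidate: dict[str, float],
                    tolerance: float = 0.05) -> list[CaseDiff]:
    """Return cases present in both runs whose score dropped by more than `tolerance`.

    Cases that appear in only one run are ignored here; a real tool would
    probably report them separately (assumption).
    """
    regressions = []
    for case_id in sorted(baseline.keys() & candidate.keys()):
        diff = CaseDiff(case_id, baseline[case_id], candidate[case_id])
        if diff.delta < -tolerance:
            regressions.append(diff)
    return regressions


if __name__ == "__main__":
    before = {"refusal-01": 0.90, "summary-07": 0.80, "math-12": 0.70}
    after = {"refusal-01": 0.91, "summary-07": 0.60, "tool-03": 1.00}
    for d in diff_shadow_run(before, after):
        print(f"{d.case_id}: {d.baseline:.2f} -> {d.candidate:.2f}")
```

Comparing only cases shared by both runs keeps the diff meaningful when the dataset version is pinned; if the two runs used different dataset versions, the intersection would shrink and the diff would under-report.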

EvalOps Agent

• Agent now schedules capture → evaluate → decide loops automatically for every workspace
• Launch-ready gate policies block deploys when safety or quality scores fall below their thresholds
• Slack digests summarize failing evals with direct links back to scorecards
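
The gate behavior described above can be pictured with a small Python sketch. This is an illustration under stated assumptions, not the EvalOps policy format: Threshold, evaluate_gate, and the metric names "safety" and "quality" are hypothetical. A gate passes only when every configured metric is present and meets its minimum.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    """Minimum acceptable score for one metric (hypothetical policy shape)."""
    metric: str
    minimum: float


def evaluate_gate(scores: dict[str, float],
                  thresholds: list[Threshold]) -> tuple[bool, list[str]]:
    """Return (passed, reasons). The gate blocks if any metric is missing or too low."""
    failures = []
    for t in thresholds:
        value = scores.get(t.metric)
        if value is None:
            # A missing metric is treated as a failure so a gate cannot pass silently.
            failures.append(f"{t.metric}: missing")
        elif value < t.minimum:
            failures.append(f"{t.metric}: {value:.2f} < {t.minimum:.2f}")
    return (not failures, failures)


if __name__ == "__main__":
    policy = [Threshold("safety", 0.99), Threshold("quality", 0.85)]
    passed, reasons = evaluate_gate({"safety": 0.995, "quality": 0.80}, policy)
    print("deploy allowed" if passed else f"deploy blocked: {reasons}")
```

Treating a missing metric as a failure is a design choice in this sketch: it keeps the gate closed by default when an evaluation did not run.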

Telemetry

• New GitHub App ingests pull-request context so evaluation results map to diffs
• PagerDuty integration posts paging events into eval timelines for root-cause correlation
• Expanded Spellbook recipes with dataset version pinning and replay controls

Governance

• Attestations API stores approver, policy, and evidence payload for each release gate
• Audit log now captures evaluator overrides with before/after snapshots
• Trust Center widgets display live control coverage, driven directly from eval outcomes
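
As a rough picture of what an attestation record might hold, the Python sketch below models the three fields named above (approver, policy, evidence payload) plus a gate identifier. It does not reflect the actual Attestations API: the Attestation class, its field names, and the SHA-256 digest are assumptions added for illustration. The digest shows one way to make a stored record tamper-evident.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Attestation:
    """Sign-off for one release gate (hypothetical shape, not the EvalOps schema)."""
    gate_id: str
    approver: str
    policy: str
    evidence: dict

    def digest(self) -> str:
        # Serialize with sorted keys so the same content always hashes the same way,
        # regardless of the order in which evidence keys were inserted.
        canonical = json.dumps(asdict(self), sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    record = Attestation(
        gate_id="release-gate-1",           # illustrative identifier
        approver="reviewer@example.com",    # illustrative approver
        policy="safety-baseline",           # illustrative policy name
        evidence={"suite": "regression", "pass_rate": 0.97},
    )
    print(record.digest())
```

Storing the digest next to the record would let an auditor detect later edits to the approver, policy, or evidence, since any change produces a different hash.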

Developer Experience

• CLI gained eval-agent commands to kick off suites locally and stream results