Execution plan
Roadmap and claims discipline
Ship the mature guarantees first. Keep research-grade oversight ideas in R&D until they outperform simpler verifier-router baselines.
Implementation phases
0-3 months
Ship certified tool-call guardrails
Policy-as-code at the gateway with fail-closed decisions, signed logs, tenant isolation, and replay.
2-6 months
Add conformal escalation
Calibrate action-risk scores on reviewed actions. Expose target error and review-rate controls.
4-8 months
Route to the right expert
Use learning-to-defer with workload constraints for legal, finance, security, and business owners.
ongoing
Compress review time
Generate dry runs, failed-invariant summaries, side-effect estimates, rollback plans, and labels for active learning.
12+ months
Keep debate and recursive reward modeling in R&D
Productize only when training-on-protocol shows robust gains over verifier-plus-router baselines.
Claims discipline
- Soundness applies only to formalized properties; unmodeled failure modes require monitoring and review.
- Conformal guarantees depend on exchangeability; drift monitoring is a product requirement.
- Verification is cheaper than generation for many policy checks, but not every hidden side-effect question.
- Legibility can carry a performance tax; measure task success and review time together.
- Acquisition values and market estimates should be labeled as reported estimates in sales collateral.