Sentinel.AI Overview
Real-time health across all agents, workflows, and reliability systems
Success Rate
—
Agent runs (24h)
—
Open Incidents
—
Unresolved
—
Active Workflows
—
Running pipelines
—
Circuit Breakers
—
Open / Total
—
P95 Latency
—
ms
Total Tokens
—
All agents
Total Cost
—
USD (estimated)
DLQ Size
—
Failed tasks pending retry
Agent Breakdown
| Agent | Runs | Success | Avg Latency | Avg Cost |
|---|---|---|---|---|
Incident Types (Open)
Workflow Failure Replay
Run-level execution traces, step failures, and one-click replay from any failure point
Select a run to view details
Agent Traces
Every agent run with Gantt timeline, checkpoints, and replay
Incidents
Agent loops, cascading failures, silent errors, latency spikes
Blast Radius Containment
If an agent fails, which downstream agents, users, and workflows are affected?
Select Agent to Analyze
Impact Summary
Select an agent and compute
Dependency Graph & Blast Radius
Compute blast radius to see the dependency graph
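One way to answer the blast radius question is a breadth-first walk over the dependency graph, collecting everything reachable downstream of the failed agent. The sketch below is only an illustration: the `downstream` mapping, the `blast_radius` function, and the example edges are assumptions, not Sentinel.AI's actual data model or API.

```python
from collections import deque
from typing import Dict, Iterable, Set


def blast_radius(downstream: Dict[str, Iterable[str]], failed: str) -> Set[str]:
    """Return every node reachable downstream of `failed` (excluding itself).

    `downstream` maps a node to the nodes that directly depend on it.
    This mapping is a hypothetical shape chosen for the example.
    """
    affected: Set[str] = set()
    queue = deque([failed])
    while queue:
        node = queue.popleft()
        for dependent in downstream.get(node, ()):
            # Skip nodes already seen so cycles in the graph terminate.
            if dependent != failed and dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)
    return affected


# Hypothetical edges between the agent names used elsewhere on this page.
example_graph = {
    "orchestrator": ["data-analyst", "support-agent"],
    "data-analyst": ["code-assistant"],
    "code-assistant": ["data-analyst"],  # a cycle, handled by the seen-check
}
impacted = blast_radius(example_graph, "orchestrator")
```

The same traversal works if users and workflows are added as nodes, since it only relies on "depends on" edges.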
Reliability Guarantees
Circuit breakers, error budgets, dead letter queue, and retry policies
Circuit Breakers
Error Budgets (24h window)
Dead Letter Queue — Failed Tasks Awaiting Retry
Rollback & Replay
Every agent step is checkpointed. Replay from any point with modified inputs.
Select Trace to Replay
Replay Result
Select a trace and click Replay on any checkpoint
Service Level Objectives
Agent reliability targets with error budgets and burn rate alerts
Per-Agent SLO Targets
| Agent | Success Rate Target | P95 Latency Target | Tool Failure Target |
|---|---|---|---|
| support-agent | 99.5% | 3,000ms | <0.5% |
| code-assistant | 98.0% | 10,000ms | <2% |
| data-analyst | 97.0% | 15,000ms | <3% |
| orchestrator | 99.9% | 30,000ms | <0.1% |
| default | 99.0% | 5,000ms | <1% |
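To make the relationship between a success rate target, its error budget, and the burn rate concrete, here is a minimal sketch using the common SRE definitions: the budget is the allowed failure fraction (1 minus the target), and the burn rate is the observed failure rate divided by that budget. Whether Sentinel.AI computes these values exactly this way is an assumption, and the function names are hypothetical.

```python
def error_budget(target: float) -> float:
    """Allowed failure fraction for a success rate target such as 0.995."""
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")
    return 1.0 - target


def burn_rate(runs: int, failures: int, target: float) -> float:
    """Observed failure rate divided by the budget; above 1.0 means the
    budget is being consumed faster than the target allows."""
    if runs == 0:
        return 0.0
    return (failures / runs) / error_budget(target)


def budget_remaining(runs: int, failures: int, target: float) -> float:
    """Fraction of the window's budget still unspent (negative if overspent)."""
    allowed_failures = runs * error_budget(target)
    if allowed_failures == 0:
        return 1.0  # no runs yet, so nothing has been consumed
    return 1.0 - failures / allowed_failures


# Example: support-agent target of 99.5% with 1,000 runs and 10 failures
# in the window. The run and failure counts are made up for illustration.
rate = burn_rate(1000, 10, 0.995)
remaining = budget_remaining(1000, 10, 0.995)
```

With these made-up counts the agent fails at roughly twice the rate its target permits, so a burn rate alert thresholded at 1.0 would fire.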
Agent Contracts & Handoff Validation
Registered input/output schemas — every handoff is validated before delivery
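As a rough illustration of validating a handoff before delivery, the sketch below checks a payload against a contract that lists required fields and their types. The contract format (a plain dict of field name to Python type) and the `validate_handoff` function are assumptions made for this example; the real contract schema format is not shown on this page.

```python
from typing import Any, Dict, List


def validate_handoff(payload: Dict[str, Any], contract: Dict[str, type]) -> List[str]:
    """Return a list of violations; an empty list means the handoff is accepted."""
    errors: List[str] = []
    for field, expected_type in contract.items():
        if field not in payload:
            errors.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected_type):
            actual = type(payload[field]).__name__
            errors.append(f"{field}: expected {expected_type.__name__}, got {actual}")
    return errors


# Hypothetical input contract for a receiving agent.
ticket_contract = {"ticket_id": str, "priority": int}

accepted = validate_handoff({"ticket_id": "T-1", "priority": 2}, ticket_contract)
rejected = validate_handoff({"ticket_id": 7}, ticket_contract)
```

A rejected handoff would be recorded with its violations rather than delivered, which is the behavior the audit log below suggests.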
Registered Contracts
Handoff Audit Log
Every agent-to-agent transition — accepted and rejected
Atomic State Coordination
Versioned shared state — concurrent agents can't silently overwrite each other
How it works
Every write is a proposal: it commits only if the version hasn't changed since you read it.
If another agent wrote first, you get a 409 Conflict and must re-read and retry.
This prevents the silent overwrite bug where two agents process the same state and one destroys the other's work.
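The read, propose, and retry cycle described above is a form of optimistic concurrency control. The following is a minimal in-memory sketch of that idea, not the service's implementation: `VersionedStore`, `VersionConflict`, and `update_with_retry` are hypothetical names, and the exception stands in for the 409 Conflict response.

```python
import threading
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple


class VersionConflict(Exception):
    """Stale version on propose; plays the role of a 409 Conflict here."""


@dataclass
class Entry:
    value: Any
    version: int


class VersionedStore:
    def __init__(self) -> None:
        self._entries: Dict[str, Entry] = {}
        self._lock = threading.Lock()  # makes check-and-commit atomic

    def read(self, key: str) -> Tuple[Any, int]:
        """Return (value, version); a missing key reads as (None, 0)."""
        with self._lock:
            entry = self._entries.get(key)
            return (None, 0) if entry is None else (entry.value, entry.version)

    def propose(self, key: str, value: Any, expected_version: int) -> int:
        """Commit only if the stored version still equals expected_version."""
        with self._lock:
            entry = self._entries.get(key)
            current = 0 if entry is None else entry.version
            if current != expected_version:
                raise VersionConflict(
                    f"{key}: expected v{expected_version}, found v{current}"
                )
            self._entries[key] = Entry(value, current + 1)
            return current + 1


def update_with_retry(
    store: VersionedStore,
    key: str,
    transform: Callable[[Any], Any],
    max_attempts: int = 5,
) -> int:
    """Re-read and retry on conflict, as a client would after a 409."""
    for _ in range(max_attempts):
        value, version = store.read(key)
        try:
            return store.propose(key, transform(value), version)
        except VersionConflict:
            continue  # another writer committed first; read the fresh value
    raise VersionConflict(f"gave up on {key} after {max_attempts} attempts")
```

The important design point is that `transform` is re-applied to the freshly read value on every attempt, so a retry never commits a result computed from stale state.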
Active State Keys
No state keys yet. Use POST /api/state/:namespace/:key/propose to write state.
Settings
Manage your API keys
Active API Key
Your API key is stored in your browser only. Required to use ingest endpoints, state, contracts, and handoffs.
Generate a New Key
Give your key a name to identify it later (e.g. "my-pipeline", "demo").
Your Keys
Danger Zone
Reset clears all runs, traces, incidents, state, and handoffs associated with your API key. This cannot be undone.