Health Endpoint

GET /health (Public)

Unauthenticated liveness and readiness probe. Must respond within 3 seconds.

The /health endpoint is the most critical part of the service agent contract. The platform calls it:

  • On registration — to validate your agent before approving it.
  • On connect — before every session starts, to gate unhealthy agents.
  • Every 60 seconds — background health sweep across all active agents.

Response schema

Healthy response

{
  "status": "ok",
  "ready": true,
  "version": "1.0.0",
  "active_sessions": 3,
  "dependencies": { "llm": "ok", "database": "ok" }
}

Degraded response

{
  "status": "degraded",
  "ready": true,
  "reason": "LLM provider responding slowly",
  "version": "1.0.0"
}

Not ready

{
  "status": "ok",
  "ready": false,
  "reason": "Warming up model"
}

Field reference

| Field | Type | Required | Description |
| ----------------- | -------------------- | -------- | ----------------------------------------------------- |
| status | "ok" \| "degraded" | ✅ | Agent operational status |
| ready | boolean | ✅ | Whether the agent can accept new requests |
| version | string | — | Semantic version of your agent |
| reason | string | — | Human-readable explanation when degraded or not ready |
| active_sessions | integer | — | Current concurrent sessions |
| dependencies | object | — | Per-dependency status map |
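One way to keep responses consistent with the field reference is to build them through a small typed helper. The sketch below is illustrative only: `HealthPayload` and `to_json_dict` are hypothetical names, not part of any platform SDK, and it assumes that optional fields are best omitted when they are unset.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class HealthPayload:
    """Hypothetical helper mirroring the field reference table."""

    status: str                            # required: "ok" or "degraded"
    ready: bool                            # required
    version: Optional[str] = None          # optional
    reason: Optional[str] = None           # optional
    active_sessions: Optional[int] = None  # optional
    dependencies: Dict[str, str] = field(default_factory=dict)  # optional

    def to_json_dict(self) -> dict:
        """Return a JSON-serializable dict containing only the fields that are set."""
        if self.status not in ("ok", "degraded"):
            raise ValueError(f"invalid status: {self.status!r}")
        body: dict = {"status": self.status, "ready": self.ready}
        if self.version is not None:
            body["version"] = self.version
        if self.reason is not None:
            body["reason"] = self.reason
        if self.active_sessions is not None:
            body["active_sessions"] = self.active_sessions
        if self.dependencies:
            body["dependencies"] = dict(self.dependencies)
        return body
```

Rejecting an unknown status at construction time surfaces a typo such as "healthy" in your own tests rather than in the platform's health sweep.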

Strict validation rules

  • status must be exactly "ok" or "degraded". Any other value (e.g. "healthy", "running") is treated as unhealthy.
  • HTTP status code must be 200. Any non-200 response is treated as unreachable.
  • Response must be valid JSON. HTML error pages will fail validation.
  • Must respond within 3 seconds.
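The rules above can be turned into a local pre-flight check to run before registering. This is a minimal sketch, not the platform's actual validator: `check_health_response` is a hypothetical name, and the check that `ready` is a boolean is an assumption drawn from the field reference rather than from the strict rules.

```python
import json

ALLOWED_STATUSES = ("ok", "degraded")
TIMEOUT_SECONDS = 3.0


def check_health_response(status_code: int, body: bytes, elapsed_seconds: float) -> list:
    """Return a list of rule violations; an empty list means the response passes."""
    problems = []
    if elapsed_seconds > TIMEOUT_SECONDS:
        problems.append("response took longer than 3 seconds")
    if status_code != 200:
        problems.append(f"HTTP status {status_code} is not 200")
    try:
        payload = json.loads(body)
    except ValueError:
        # JSONDecodeError and UnicodeDecodeError are both ValueError subclasses.
        problems.append("body is not valid JSON")
        return problems
    if not isinstance(payload, dict):
        problems.append("body is not a JSON object")
        return problems
    if payload.get("status") not in ALLOWED_STATUSES:
        problems.append(f"status must be 'ok' or 'degraded', got {payload.get('status')!r}")
    # Assumption: ready is required by the field reference, so check its type too.
    if not isinstance(payload.get("ready"), bool):
        problems.append("ready must be a boolean")
    return problems
```

Feed it the status code, raw body, and measured latency from whatever HTTP client you already use; an HTML error page from a proxy will show up as both a non-200 status and a JSON failure.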

Auto-suspension

The platform tracks consecutive failures per agent:

| Failures | Action |
| -------- | ------------------------------------------------------------------- |
| 1 | Owner receives AGENT_HEALTH_DEGRADED notification |
| 2–4 | Failures logged, warning emails sent |
| 5 | Agent auto-suspended, owner receives AGENT_SUSPENDED notification |

After suspension, fix the underlying issue and contact support to reactivate.
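To reason about how close an agent is to suspension, the escalation table can be modeled as a small counter. The sketch below is a hypothetical model, not the platform's implementation: `FailureTracker` and the returned action strings are invented for illustration, and it assumes a passing probe resets the streak because the table counts consecutive failures.

```python
from collections import defaultdict

SUSPEND_AFTER = 5  # consecutive failures, per the escalation table


class FailureTracker:
    """Hypothetical model of per-agent consecutive-failure counting."""

    def __init__(self) -> None:
        self._failures = defaultdict(int)

    def record(self, agent_id: str, healthy: bool) -> str:
        """Record one probe result and return the action the table implies."""
        if healthy:
            # Assumption: any passing probe resets the consecutive-failure streak.
            self._failures[agent_id] = 0
            return "none"
        self._failures[agent_id] += 1
        count = self._failures[agent_id]
        if count == 1:
            return "notify:AGENT_HEALTH_DEGRADED"
        if count < SUSPEND_AFTER:
            return "log_and_warn"
        # Fifth consecutive failure (or any later one, if probing continued).
        return "suspend:AGENT_SUSPENDED"
```

With the 60-second background sweep, five consecutive failed sweeps span roughly four to five minutes, so alerting on the first AGENT_HEALTH_DEGRADED notification leaves little time to react.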

Implementation tips

import os
import time

from fastapi import FastAPI

app = FastAPI()
_startup_time = time.monotonic()


@app.get("/health")
async def health():
    # Local check only: no network calls, so the probe stays well under 3 seconds.
    api_key_set = bool(os.getenv("ANTHROPIC_API_KEY"))
    body = {
        "status": "ok" if api_key_set else "degraded",
        "ready": api_key_set,
        "version": "1.0.0",
        "uptime_seconds": round(time.monotonic() - _startup_time),
    }
    if not api_key_set:
        # reason is an optional string, so omit it when healthy instead of sending null.
        body["reason"] = "ANTHROPIC_API_KEY not set"
    return body

A "degraded" status with "ready": true is valid: the platform will still route requests to your agent. Use it to signal reduced capacity or slower responses without blocking traffic.
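One possible way to decide between degraded and not ready is to derive both fields from dependency states. The sketch below is only an illustration under stated assumptions: `derive_health` is a hypothetical helper, and the "slow" and "down" dependency values are an invented convention (the examples in this document only show "ok").

```python
def derive_health(dependencies: dict, warming_up: bool = False) -> dict:
    """Map dependency states to status/ready.

    Hypothetical convention: each dependency value is "ok", "slow", or "down".
    """
    body = {"status": "ok", "ready": True, "dependencies": dict(dependencies)}
    down = sorted(name for name, state in dependencies.items() if state == "down")
    slow = sorted(name for name, state in dependencies.items() if state == "slow")
    if down:
        # A hard failure: stop accepting new requests.
        body.update(status="degraded", ready=False, reason="Unavailable: " + ", ".join(down))
    elif warming_up:
        # Nothing is broken, but the agent cannot take traffic yet.
        body.update(ready=False, reason="Warming up")
    elif slow:
        # Reduced capacity: stay routable but flag the slowdown.
        body.update(status="degraded", reason="Responding slowly: " + ", ".join(slow))
    return body
```

For example, `derive_health({"llm": "slow", "database": "ok"})` keeps `ready` true while reporting `"status": "degraded"`, so traffic continues to flow during a provider slowdown.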