Common
The Gateway listens on https://localhost:8006 by default. All endpoints below are relative to the Gateway.
Service Management APIs
These endpoints manage system status, queuing, configuration, and resources. They are independent of interaction mode.
Health & Status
GET /health
Health check.
Response: {"status": "ok"}
GET /status
Global service status summary.
Response:
{
"total_workers": 4,
"idle_workers": 2,
"busy_workers": 2,
"queue_length": 3,
"workers": [...]
}
GET /workers
Detailed Worker list.
Response:
{
"workers": [
{
"url": "http://localhost:22400",
"index": 0,
"status": "idle",
"current_task": null,
"current_session_id": null,
"cached_hash": "abc123",
"busy_since": null
}
]
}
Queue Management
The Gateway uses a FIFO queue to manage concurrent requests. All interaction modes share a single queue.
GET /api/queue
Get a snapshot of the queue status.
Response:
{
"queue_length": 3,
"entries": [
{
"ticket_id": "tk_001",
"position": 1,
"task_type": "half_duplex",
"eta_seconds": 15.0
}
],
"running": [
{
"worker_url": "http://localhost:22400",
"task_type": "half_duplex",
"session_id": "stream_xyz",
"started_at": "2026-02-24T10:30:00Z",
"elapsed_s": 5.2
}
]
}
GET /api/queue/{ticket_id}
Query the status of a specific queue ticket.
DELETE /api/queue/{ticket_id}
Cancel a queued request.
ETA Configuration
ETA (Estimated Time of Arrival) baselines for each request type, refined at runtime via Exponential Moving Average.
GET /api/config/eta
Response:
{
"eta_chat_s": 15.0,
"eta_half_duplex_s": 180.0,
"eta_audio_duplex_s": 120.0,
"eta_omni_duplex_s": 90.0,
"eta_ema_alpha": 0.3,
"eta_ema_min_samples": 3
}
PUT /api/config/eta
Update ETA configuration. Request body uses the same fields as the GET response.
KV Cache
GET /api/cache
Query the KV Cache status of all Workers.
Configuration & Presets
GET /api/frontend_defaults
Get frontend default configuration values.
Response:
{
"playback_delay_ms": 200,
"chat_vocoder": "token2wav"
}
GET /api/presets
Get the list of System Prompt presets.
Response:
[
{
"id": "default_en",
"name": "English Assistant",
"system_prompt": "You are a helpful assistant."
}
]
Session Management
Sessions are automatically recorded for playback and debugging.
GET /api/sessions/{session_id}
Get session metadata.
Response:
{
"session_id": "omni_abc123",
"type": "omni_duplex",
"created_at": "2026-02-24T10:00:00Z",
"config": {}
}
GET /api/sessions/{session_id}/recording
Get session recording timeline data.
GET /api/sessions/{session_id}/assets/{relative_path}
Get session asset files (audio/video chunks, etc.).
GET /api/sessions/{session_id}/download
Download the entire session as a package.
POST /api/sessions/{session_id}/upload-recording
Upload frontend-recorded audio/video files. Size limit: 200 MB.
App Management
Control which interaction modes are available in the frontend.
GET /api/apps
Get the list of enabled apps (for frontend use).
GET /api/admin/apps
Get the list of all apps including enabled status (for Admin use).
PUT /api/admin/apps
Toggle app enabled status.
Request Body:
{
"app_id": "omni",
"enabled": false
}