Observability

Observability built into every surface

Every agent, every channel, and every customer interaction emits the metrics you need to improve it. No collector agents to install, no integrations to wire up: it's part of the platform.

Three reasons it's actually different

We dogfood this for every Smoo AI agent. Same SDK, same catalog, same dashboards. One observability stack, two products.

Auto-context

Org, user, session, conversation, agent, channel, trace — all auto-attached from `@smooai/auth` + `@smooai/context`. You never re-pass identity.
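As a rough illustration of what auto-attachment means for a call site, the sketch below merges an ambient context object into the labels passed with a metric. The `AmbientContext` type and the `withContext` helper are hypothetical names made up for this example, not the API of `@smooai/auth` or `@smooai/context`; only the field names come from the list above.

```typescript
// Hypothetical ambient context; field names mirror the list above.
type AmbientContext = {
  org?: string;
  user?: string;
  session?: string;
  conversation?: string;
  agent?: string;
  channel?: string;
  trace?: string;
};

type Labels = Record<string, string>;

// Merge ambient context into call-site labels. In this sketch the ambient
// identity wins on key collisions, so a caller cannot overwrite org or user
// by accident. Fields that are not set are skipped.
function withContext(ctx: AmbientContext, labels: Labels = {}): Labels {
  const merged: Labels = { ...labels };
  for (const [key, value] of Object.entries(ctx)) {
    if (value !== undefined) merged[key] = value;
  }
  return merged;
}

// The call site only passes what is specific to the measurement.
const ctx: AmbientContext = { org: 'org_123', user: 'user_456', channel: 'webchat' };
const labels = withContext(ctx, { route: '/dashboard' });
```

Here `labels` carries `org`, `user`, and `channel` alongside `route`, while unset fields such as `session` are left out.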

Standard catalog

TTFT, TTFA, fast-router latency, tool latency, voice barge-ins, web page load — emitted out of the box across 30+ keys.

Dashboards that build themselves

Ask Smoo Agent: "show me conversation cost by channel for the last 7 days". Get a queryable, shareable dashboard back.

Smoo Agent — Data Tools

Build dashboards by chatting

Ask in plain English, watch them render, save them with one click. Every product page has a chat panel; every answer becomes a savable chart; the dashboard composes itself.

you ▸ show me errors by service for the last 24h
smoo ◂ [bar chart renders inline] auth-service is the top group (47 events, 12 users). Save as widget?
you ▸ now add a heatmap of p95 latency by hour
smoo ◂ [heatmap appended to the same dashboard] Peaks at 14:00 UTC.

The Smoo Agent has 25 data-interrogator tools across metrics, logs, traces, errors, audit logs, and analytics. Per-org scoped; read-only by default; every answer is a savable dashboard widget.

Five lines, frontend or backend

The same `MetricsClient` interface ships in `@smooai/observability/browser` and `@smooai/observability/server`. Context is auto-attached from the auth packages you already use.

import { MetricsProvider, useMetricsClient } from '@smooai/observability/react';

// App root — once
<MetricsProvider options={{ sampleRate: 0.5 }}>{children}</MetricsProvider>

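// Anywhere inside a component rendered under the provider
const metrics = useMetricsClient();
// performance.now() is milliseconds elapsed since navigation start
metrics.timing('web.page.load.ms', performance.now(), { route: '/dashboard' });
metrics.counter('web.error.count', 1, { component: 'chat-widget' });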
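To make the `timing` / `counter` calls and the `sampleRate` option more concrete, here is a minimal sketch of a client that samples events into an in-memory queue and hands them to a transport on flush. It is not the real `MetricsClient`: the class name `SketchMetricsClient`, the `Transport` type, the `depth` getter, and the event shape are all assumptions for illustration.

```typescript
type Labels = Record<string, string>;
type MetricKind = 'timing' | 'counter';

interface MetricEvent {
  key: string;
  kind: MetricKind;
  value: number;
  labels: Labels;
  ts: number;
}

// Hypothetical transport: receives one batch per flush.
type Transport = (batch: MetricEvent[]) => Promise<void>;

class SketchMetricsClient {
  private queue: MetricEvent[] = [];
  private readonly transport: Transport;
  private readonly sampleRate: number;

  constructor(transport: Transport, sampleRate = 1) {
    this.transport = transport;
    this.sampleRate = sampleRate;
  }

  timing(key: string, ms: number, labels: Labels = {}): void {
    this.emit(key, 'timing', ms, labels);
  }

  counter(key: string, n = 1, labels: Labels = {}): void {
    this.emit(key, 'counter', n, labels);
  }

  // Number of events waiting for the next flush.
  get depth(): number {
    return this.queue.length;
  }

  // Swap the queue out before sending so new events are never lost mid-flush.
  async flush(): Promise<void> {
    if (this.queue.length === 0) return;
    const batch = this.queue;
    this.queue = [];
    await this.transport(batch);
  }

  private emit(key: string, kind: MetricKind, value: number, labels: Labels): void {
    // Keep roughly sampleRate of all events; 1 keeps everything, 0 drops everything.
    if (Math.random() >= this.sampleRate) return;
    this.queue.push({ key, kind, value, labels, ts: Date.now() });
  }
}

// Usage with a transport that just records batches.
const sent: MetricEvent[][] = [];
const client = new SketchMetricsClient(async (batch) => { sent.push(batch); }, 1);
client.timing('web.page.load.ms', 1234, { route: '/dashboard' });
client.counter('web.error.count', 1, { component: 'chat-widget' });
```

Uniform sampling is the simplest choice but it undercounts counters unless values are rescaled at query time; whether the real SDK samples counters this way is not stated in the text above.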

Standard metric catalog

Every metric documents its unit, kind, expected labels, and suggested aggregation. The Smoo Agent reads this catalog to suggest relevant series when you ask for a dashboard.

| Key | Unit | Description |
| --- | --- | --- |
| `agent.ttft.ms` | ms | Time to first token (smart-tier LLM). |
| `agent.ttfa.ms` | ms | Time to first audio chunk (voice pipeline). |
| `agent.ack.latency.ms` | ms | User message → channel ack ("looking that up…"). |
| `agent.router.latency.ms` | ms | Fast-router classification time. |
| `agent.turn.total.ms` | ms | End-to-end wall-clock for an agent turn. |
| `agent.kb_search.ms` | ms | Knowledge-base hybrid lookup latency. |
| `agent.tool.latency.ms` | ms | Per-tool execution latency. |
| `agent.tool.failures.count` | count | Tool exceptions / non-2xx returns. |
| `agent.escalation.count` | count | Turns that escalated to a human. |
| `agent.tokens.total` | tokens | Tokens consumed (input + output). |
| `agent.cost.cents` | cents | Cost in cents per LLM call. |
| `agent.voice.barge_in.count` | count | Stale-turn cancellations. |
| `agent.voice.stale_dropped.count` | count | Dropped audio chunks after stale detection. |
| `web.page.load.ms` | ms | Navigation → load event. |
| `web.hydration.ms` | ms | HTML received → React hydration. |
| `web.api.latency.ms` | ms | Browser fetch latency to SmooAI APIs. |
| `web.ws.connect.ms` | ms | Realtime WebSocket connect duration. |
| `web.error.count` | count | Client-side errors (error boundary). |
| `sdk.push.latency.ms` | ms | SDK flush duration (self-meta). |
| `sdk.queue.depth.gauge` | gauge | In-memory queue depth (self-meta). |

The catalog holds 33 standard metrics; the table above lists 20 of them. See `@smooai/observability` for the full list and how to add your own.
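One way to picture a catalog entry that documents unit, kind, expected labels, and suggested aggregation is the sketch below. The `CatalogEntry` shape, the `kind` and `aggregation` values, the label lists, and the `suggestSeries` helper are assumptions for illustration; only the keys, units, and descriptions are taken from the table. The keyword match is a deliberately naive stand-in for however the Smoo Agent actually reads the catalog.

```typescript
type MetricKind = 'timing' | 'counter' | 'gauge';
type Aggregation = 'p50' | 'p95' | 'sum' | 'avg' | 'last';

interface CatalogEntry {
  key: string;
  unit: 'ms' | 'count' | 'tokens' | 'cents' | 'gauge';
  kind: MetricKind;          // assumed classification
  labels: string[];          // assumed expected labels
  aggregation: Aggregation;  // assumed suggested aggregation
  description: string;
}

const catalog: CatalogEntry[] = [
  {
    key: 'agent.ttft.ms',
    unit: 'ms',
    kind: 'timing',
    labels: ['channel', 'model', 'tier'],
    aggregation: 'p95',
    description: 'Time to first token (smart-tier LLM).',
  },
  {
    key: 'agent.cost.cents',
    unit: 'cents',
    kind: 'counter',
    labels: ['channel', 'agent'],
    aggregation: 'sum',
    description: 'Cost in cents per LLM call.',
  },
  {
    key: 'web.error.count',
    unit: 'count',
    kind: 'counter',
    labels: ['component'],
    aggregation: 'sum',
    description: 'Client-side errors (error boundary).',
  },
];

// Naive suggestion: return keys whose key or description mentions the term.
function suggestSeries(term: string, entries: CatalogEntry[] = catalog): string[] {
  const needle = term.toLowerCase();
  return entries
    .filter((e) => e.key.toLowerCase().includes(needle) || e.description.toLowerCase().includes(needle))
    .map((e) => e.key);
}

const costSeries = suggestSeries('cost');
```

Keeping the suggested aggregation next to the unit means a dashboard builder can pick a sensible default (a percentile for latencies, a sum for counters) without asking the user.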

Dashboards you'll have on day one

Curated templates ship with every account. Or describe what you want and Smoo Agent builds it for you.

Agent performance

TTFT, TTFA, router latency, KB hit rate — split by channel, model, and tier.

Channel mix

Conversation volume by webchat / SMS / voice / email — daily, weekly, monthly.

Cost per conversation

Derived from `agent.cost.cents` + `agent.turn.total.ms`. Drill in by org or agent.
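A minimal sketch of the cost side of that derivation: sum `agent.cost.cents` per conversation, then average the conversation totals per channel. The `CostEvent` shape and the function name are hypothetical, and the sketch assumes each event already carries its conversation and channel labels. It leaves out `agent.turn.total.ms`, since the text does not say how turn duration enters the calculation.

```typescript
// Hypothetical flattened view of an agent.cost.cents event.
interface CostEvent {
  conversation: string;
  channel: string;
  cents: number;
}

// Average cost per conversation, grouped by channel.
function averageCostByChannel(events: CostEvent[]): Map<string, number> {
  // Step 1: total cents per conversation (one conversation spans many LLM calls).
  const perConversation = new Map<string, { channel: string; cents: number }>();
  for (const e of events) {
    const row = perConversation.get(e.conversation) ?? { channel: e.channel, cents: 0 };
    row.cents += e.cents;
    perConversation.set(e.conversation, row);
  }

  // Step 2: sum and count conversation totals per channel.
  const sums = new Map<string, { cents: number; conversations: number }>();
  perConversation.forEach(({ channel, cents }) => {
    const s = sums.get(channel) ?? { cents: 0, conversations: 0 };
    s.cents += cents;
    s.conversations += 1;
    sums.set(channel, s);
  });

  // Step 3: divide to get the mean cost per conversation.
  const averages = new Map<string, number>();
  sums.forEach((s, channel) => averages.set(channel, s.cents / s.conversations));
  return averages;
}

const averages = averageCostByChannel([
  { conversation: 'c1', channel: 'webchat', cents: 3 },
  { conversation: 'c1', channel: 'webchat', cents: 5 },
  { conversation: 'c2', channel: 'voice', cents: 2 },
  { conversation: 'c3', channel: 'webchat', cents: 4 },
]);
```

With this sample data, webchat has two conversations totalling 8 and 4 cents, so its average is 6 cents; voice has a single 2-cent conversation.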

Voice latency heatmap

Transcript, first-audio, barge-in shape — visualized as a time-series heatmap.

Error rate by component

Frontend error boundary catches + server 5xx, joined and bucketed.

How it flows

SDK → ingest API → ClickHouse → dashboards. Postgres fallback when you're on free tier or ClickHouse degrades.

emit() → in-memory queue → flush() → POST /organizations/{org}/metrics/push
                                                       ↓
                                              metrics ingest Lambda
                                                       ↓
                                ┌──────────────────────┴──────────────────────┐
                                ↓                                             ↓
              ClickHouse metrics_events                      metrics_events_fallback (Postgres)
              (paid tier — primary)                          (free tier OR ClickHouse degraded)
                                ↓
              metrics_5m / 1h / 1d rollups   ──►   /organizations/{org}/dashboards/{id}
              (materialized views)
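The routing and rollup steps in the diagram can be sketched as three small functions. The function names, the `Tier` type, and the health flag are assumptions for illustration; the path, the table names, and the 5m / 1h / 1d bucket widths come from the diagram. In the real pipeline the rollups are ClickHouse materialized views, so `bucketStart` only shows the time arithmetic such a view would perform.

```typescript
type Tier = 'free' | 'paid';
type Store = 'clickhouse:metrics_events' | 'postgres:metrics_events_fallback';

// Push path from the diagram, with the org id URL-encoded.
function pushPath(org: string): string {
  return `/organizations/${encodeURIComponent(org)}/metrics/push`;
}

// Paid orgs write to ClickHouse while it is healthy; everything else
// (free tier, or ClickHouse degraded) goes to the Postgres fallback table.
function chooseStore(tier: Tier, clickhouseHealthy: boolean): Store {
  return tier === 'paid' && clickhouseHealthy
    ? 'clickhouse:metrics_events'
    : 'postgres:metrics_events_fallback';
}

const ROLLUP_WIDTH_MS = {
  '5m': 5 * 60 * 1000,
  '1h': 60 * 60 * 1000,
  '1d': 24 * 60 * 60 * 1000,
} as const;

// Floor a timestamp (ms since the Unix epoch) to the start of its rollup bucket.
function bucketStart(tsMs: number, rollup: keyof typeof ROLLUP_WIDTH_MS): number {
  const width = ROLLUP_WIDTH_MS[rollup];
  return Math.floor(tsMs / width) * width;
}

const eventTs = Date.UTC(2024, 0, 1, 14, 7, 30); // 2024-01-01 14:07:30 UTC
const fiveMinuteBucket = bucketStart(eventTs, '5m'); // 2024-01-01 14:05:00 UTC
```

An event at 14:07:30 UTC lands in the 14:05 five-minute bucket, the 14:00 hourly bucket, and the 00:00 daily bucket, which is what lets a dashboard read the coarsest rollup that still answers its query.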

Bundled into Smoo Metrics + Dashboards

$49/mo standalone, bundled at $79/mo in Smoo Observability Pro. Free tier covers 10k events/day with 30-day retention. Paid tiers ship the ClickHouse-backed primary store and the AI dashboard builder.