Full observability for your LiveKit voice agents
Connect your LiveKit Agents to Tuner in two lines of code. Every session’s transcript, latency, usage, and cost are captured automatically, so you catch hallucinations, broken flows, and missed intents before your callers do.
pip install tuner-livekit-sdk

from livekit.agents import AgentSession, JobContext
from tuner import TunerPlugin

async def entrypoint(ctx: JobContext):
    session = AgentSession(…)
    TunerPlugin(session, ctx)  # wires itself automatically
    await session.start(…)
Integrate in under two minutes
No re-architecting your pipeline. Tuner attaches to your existing LiveKit agent and starts capturing production data immediately.
01
Install the SDK
Add tuner-livekit-sdk to your Python or Node.js LiveKit Agents project. Works with livekit-agents v1.4 and later.
02
Set your credentials
Drop in your Tuner API key, workspace ID, and agent ID — via environment variables or inline in code.
03
Add the plugin
Add TunerPlugin right after creating your AgentSession. It listens to session events and submits call data when the session ends.
04
See every call in Tuner
Transcripts, latency, usage, and cost flow into your dashboard automatically — no manual API calls, ready to analyze and monitor.
Read the LiveKit guide →
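For the credentials step, a minimal sketch of loading the three values from environment variables and failing fast when one is missing. The variable names (`TUNER_API_KEY`, `TUNER_WORKSPACE_ID`, `TUNER_AGENT_ID`) and the `load_tuner_credentials` helper are assumptions made for illustration, not names confirmed by the SDK; the LiveKit guide is the place to check what the plugin actually reads.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Hypothetical variable names, chosen for illustration only.
# Check the Tuner LiveKit guide for the names the SDK actually reads.
REQUIRED_VARS = ("TUNER_API_KEY", "TUNER_WORKSPACE_ID", "TUNER_AGENT_ID")


@dataclass(frozen=True)
class TunerCredentials:
    api_key: str
    workspace_id: str
    agent_id: str


def load_tuner_credentials(env: Mapping[str, str] = os.environ) -> TunerCredentials:
    """Read the three credentials from `env`; raise if any is missing or empty."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing Tuner credentials: " + ", ".join(missing))
    return TunerCredentials(
        api_key=env["TUNER_API_KEY"],
        workspace_id=env["TUNER_WORKSPACE_ID"],
        agent_id=env["TUNER_AGENT_ID"],
    )
```

Checking at startup means a misconfigured deployment fails before the first call rather than silently dropping session data.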
One dashboard
Every LiveKit call, scored automatically
Transcripts, latency, usage, and red flags from every session land in one place — so quiet failures surface before they reach your churn data.

Why you need Tuner
Voice agents fail quietly, and at a scale no team can review by hand. Tuner turns every production call into signal you can debug, alert on, test against, and improve.
01
Debug in minutes, not days
When a call goes wrong, see exactly what happened and where — the full transcript, every turn, latency at each step, tool calls, and conversation state. No more guessing from sparse logs.
02
Get alerted the moment something breaks
Don’t wait for a customer complaint. Configure your own alerts with multiple triggers and conditions — by red flag, metric threshold, agent, or flow — and get notified the instant quality slips.
03
Test before you ship
Run call simulations and automated checks over SIP before every launch and after every change, scored against the same evals that monitor production — so you catch regressions before your callers do.
04
Diagnose your agent at scale
At thousands of calls a day, manual validation is impossible. Tuner finds the patterns, pinpoints where your agent breaks, and suggests how to fix it — so one engineer can stay on top of production.
05
Analytics that explain production
Understand how your agent actually behaves live: where it breaks, when callers get frustrated, which flows are missing, when it hallucinates, and which tool calls fail the most.
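To make the alerting idea concrete, here is a rough sketch of how a rule scoped by metric threshold, agent, or flow could be evaluated. The `AlertRule` model, the `fired_alerts` function, and the metric names are hypothetical and exist only for this illustration; they are not Tuner's alert schema.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class AlertRule:
    """Hypothetical rule: fire when `metric` exceeds `threshold`."""
    name: str
    metric: str
    threshold: float
    agent: Optional[str] = None  # None matches every agent
    flow: Optional[str] = None   # None matches every flow


def fired_alerts(
    rules: List[AlertRule], agent: str, flow: str, metrics: Dict[str, float]
) -> List[str]:
    """Return the names of rules that match this agent/flow and exceed their threshold."""
    fired = []
    for rule in rules:
        if rule.agent is not None and rule.agent != agent:
            continue
        if rule.flow is not None and rule.flow != flow:
            continue
        value = metrics.get(rule.metric)
        if value is not None and value > rule.threshold:
            fired.append(rule.name)
    return fired


rules = [
    AlertRule("High red-flag rate", "red_flag_rate", 0.05),
    AlertRule("Slow booking flow", "p90_latency_ms", 1500, flow="booking"),
]
```

With these two rules, a booking-flow window with a red-flag rate of 0.08 fires only the first rule, while the second fires only when the booking flow's p90 latency passes 1500 ms.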
Everything you need to run LiveKit agents in production
Turn production from a black box into something you can actually monitor, measure, and improve.
Catch failures early
Hallucinations, broken flows, dead air, early hangups, and missed intents are flagged automatically — before they reach your churn data.
Component-level latency
See STT, TTS, and LLM latency broken out at p50 and p90, so you know exactly where conversations slow down.
Real-time alerts
Get notified the moment red flags or failed evals appear in production, instead of weeks later buried in logs.
Call simulation
Stress-test your agent over SIP before launch and after every change, scored against the same evals that monitor live traffic.
LangGraph & LangChain capture
Record node transitions, tool calls, and timing alongside session data when your agent uses a LangGraph or LangChain logic layer.
Cost & usage per call
Attach a cost calculator and track LLM, TTS, and STT spend on every session — no separate billing pipeline required.
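As an illustration of the component-level latency view, the sketch below computes p50 and p90 per component from raw per-turn samples using the nearest-rank method. The function names and the input shape are assumptions for this example; Tuner may compute its percentiles differently.

```python
import math
from typing import Dict, List


def percentile(samples: List[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("percentile() needs at least one sample")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


def latency_summary(latencies_ms: Dict[str, List[float]]) -> Dict[str, Dict[str, float]]:
    """Map each component (e.g. "stt", "llm", "tts") to its p50 and p90 latency."""
    return {
        component: {"p50": percentile(values, 50), "p90": percentile(values, 90)}
        for component, values in latencies_ms.items()
    }
```

Reporting p90 alongside p50 matters because a healthy median can hide the slow turns that callers actually notice.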
Ship with confidence
Catch regressions the moment a new version ships
Compare agent versions across success rate, red-flag rate, and cost per call, and see exactly which failure types are driving the drop.
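The comparison described above can be sketched as follows, assuming each call is available as a simple record with a version label, a success flag, a cost, and a list of red flags. The `CallRecord` shape and both functions are hypothetical and stand in for whatever the dashboard computes internally.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class CallRecord:
    """Hypothetical per-call record used only for this illustration."""
    version: str
    success: bool
    cost_usd: float
    red_flags: Tuple[str, ...] = ()


@dataclass
class VersionStats:
    calls: int
    success_rate: float
    red_flag_rate: float   # share of calls with at least one red flag
    cost_per_call: float
    flag_counts: Counter


def summarize_version(records: List[CallRecord], version: str) -> VersionStats:
    calls = [r for r in records if r.version == version]
    if not calls:
        raise ValueError(f"no calls recorded for version {version!r}")
    n = len(calls)
    return VersionStats(
        calls=n,
        success_rate=sum(r.success for r in calls) / n,
        red_flag_rate=sum(bool(r.red_flags) for r in calls) / n,
        cost_per_call=sum(r.cost_usd for r in calls) / n,
        flag_counts=Counter(flag for r in calls for flag in r.red_flags),
    )


def flag_rate_deltas(old: VersionStats, new: VersionStats) -> Dict[str, float]:
    """Change in occurrences per call for each flag type, largest increase first."""
    names = set(old.flag_counts) | set(new.flag_counts)
    deltas = {
        name: new.flag_counts[name] / new.calls - old.flag_counts[name] / old.calls
        for name in names
    }
    return dict(sorted(deltas.items(), key=lambda item: item[1], reverse=True))
```

Sorting the per-flag deltas puts the failure type that grew the most at the top, which is the first place to look when a new version's success rate drops.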

Tuner vs LiveKit Cloud Observability
LiveKit’s built-in observability covers agents hosted on LiveKit Cloud — infra metrics, recordings, and traces. Tuner adds the production quality layer, across every stack you build on.
✓ Vendor-independent observability, which removes the conflict of interest of a platform evaluating its own output
✓ Eval pricing built for scale: priced per call, with no per-minute surcharge
✓ Voice red flags (hallucination, dead air, early hangup)
✓ Root-cause diagnosis with a specific fix, not just metrics
✓ 30+ voice quality metrics & red flags out of the box
✓ Drift & regression alerts over time
✓ SIP call simulations with AI agents, using your live evals
✓ Turn-by-turn transcripts & latency traces
Frequently asked questions
Which LiveKit versions are supported?
Do I have to change my agent code?
What gets captured?
Can I test my agent before going live?
How long does setup take?
Does it work with SIP / phone calls?
Does Tuner support alerts and monitoring?
Can I define my own evaluations and metrics?
How is Tuner priced?
Is my call data private and secure?
