Releases: livekit/agents
livekit-agents@1.5.8
What's Changed
- feat(interruption): barge-in cooldown window for corrections by @chenghao-mou in #5269
- fix(amd): amd improvement (AGT-2777) by @chenghao-mou in #5584
- fix(warm_transfer): don't fall back to env var when sip_connection is set by @longcw in #5619
- fix(aws): wait for stream ready before sending audio start event by @lanazhang in #5626
- Fix Missing user message metrics (MetricsReport) due to early returns in _user_turn_completed_task and no initialization in on_end_of_turn by @hudson-worden in #5437
- fix(amd): missing stt start by @chenghao-mou in #5633
- fix: reduce overly eager call ending behavior by @davidzhao in #5630
- feat(fishaudio): use websocket API for faster inference by @davidzhao in #5629
- fix(observability): retry session recording upload by @paulwe in #5627
- fix(openai realtime): reject pending response future on error event by @longcw in #5576
- feat(inference): propagate STT extra to SpeechData.metadata by @russellmartin-livekit in #5639
- Update README.md by @theomonnom in #5640
- fix(amd): reset timer for late stt transcript by @chenghao-mou in #5637
- fix: end Runway realtime sessions on shutdown by @robinandeer in #5623
- ci(examples): add deploy workflow by @tinalenguyen in #5641
- feat(amd): add remote session event for amd AGT-2828 by @chenghao-mou in #5621
- Add Soniox TTS plugin by @matejmarinko-soniox in #5543
- (inworld tts): add new model by @tinalenguyen in #5646
- livekit-agents@1.5.8 by @github-actions[bot] in #5647
New Contributors
- @lanazhang made their first contribution in #5626
Full Changelog: https://github.com/livekit/agents/compare/livekit-agents@1.5.7...livekit-agents@1.5.8
livekit-agents@1.5.7
What's Changed
- fix(openai): forward session.update on RealtimeModel.update_options by @longcw in #5531
- fix(transcription): seed _start_wall_time fallback in aclose by @longcw in #5532
- Fix realtime reply generation after interruption by @jayeshp19 in #5526
- fix(cartesia): Move API key from Query Params to Headers by @charlotte-zhuang in #5516
- deepgram-stt: report connection-lifetime remainder so usage matches billing by @joaquinhuigomez in #5506
- feat(room-io): add json_format option for timed transcription output by @longcw in #5472
- feat(inference): add inference_class option to LLM for priority routing by @adrian-cowham in #5517
- chore: update default model for Anthropic LLM by @royalfig in #5539
- fix(voice): pause output when user starts speaking during thinking by @longcw in #5535
- feat(openai): add gpt-5.4-mini to model registry by @xtreme-sameer-vohra in #5540
- feat(assemblyai): warn when audio stops flowing to the WebSocket by @gsharp-aai in #5504
- feat(tts): add support for timestamps in Inference by @chenghao-mou in #5534
- docs: clarify RunResult.events testing surface by @Rul1an in #5525
- feat(stt): back-date START_OF_SPEECH onset via server-provided timestamp by @gsharp-aai in #5479
- feat(aws): add auto language detection and mid-stream language switch… by @cldsime in #5435
- (release workflow): add docs job by @tinalenguyen in #5551
- (liveavatar): add video_quality param by @tinalenguyen in #5552
- Add avatartalk plugin to optional dependencies by @bcherry in #5550
- fix(soniox): emit PREFLIGHT_TRANSCRIPT for preemptive LLM generation by @octo-patch in #5553
- feat(xai): support model selection in realtime, default to grok-voice-think-fast-1.0 by @Hormold in #5548
- Remove 'distil-whisper-large-v3-en' from STTModels by @vedevpatel in #5537
- fix: don't swallow _ExitCli during shutdown by @lawrence3699 in #5519
- feat: expose provider request ids on STT/TTS/LLM spans for debugging by @longcw in #5546
- chore(openai): remove STT.with_groq constructor by @davidzhao in #5555
- chore(deps): update github actions (major) by @renovate[bot] in #5558
- feat(mcp): allow updating headers on MCPServerHTTP by @longcw in #5559
- feat(metrics): add playback_latency metric by @longcw in #5524
- feat(endpointing): expose dynamic endpointing alpha parameter (AGT-2764) by @chenghao-mou in #5491
- fix(smallestai): use close_stream signal to properly terminate STT session by @harshitajain165 in #5562
- Hotfix; Updated default Avatar ID by @hari-truviz in #5568
- fix(gemini live): use parameters instead of parameters_json_schema for raw schema function tools by @longcw in #5560
- Stuck aclose() activity leading to stuck handoff by @svacatalisan in #4649
- fix(async_toolset): respect allow_interruptions when cancelling tool calls by @longcw in #5570
- update livekit rtc to 1.1.7 by @davidzhao in #5572
- feat(mistral): add connectors provider tool & fix realtime STT custom headers by @jeanprbt in #5575
- feat(openai): expose verbosity in Responses LLM by @AlessandroElyos in #5583
- fix(mistral): use conversations API statelessly by @TheCodingCvrlo in #5586
- support LIVEKIT_AGENT_NAME env var by @theomonnom in #5571
- fix(recorder): use libopus when possible by @chenghao-mou in #5579
- docs: add LIVEKIT_AGENT_NAME to environment variables by @detail-app[bot] in #5599
- fix(elevenlabs): use audio_format query param for STT realtime by @longcw in #5574
- fix: clear stale paused speech state across generation steps by @longcw in #5594
- fix: cancel Runway realtime sessions on shutdown by @robinandeer in #5612
- fix(inference): skip unknown message warning and rename event name by @chenghao-mou in #5614
- feat: add SLNG plugin for STT and TTS by @metehan-slng in #5249
- livekit-agents@1.5.7 by @github-actions[bot] in #5615
New Contributors
- @charlotte-zhuang made their first contribution in #5516
- @xtreme-sameer-vohra made their first contribution in #5540
- @Rul1an made their first contribution in #5525
- @cldsime made their first contribution in #5435
- @octo-patch made their first contribution in #5553
- @vedevpatel made their first contribution in #5537
- @lawrence3699 made their first contribution in #5519
- @svacatalisan made their first contribution in #4649
- @AlessandroElyos made their first contribution in #5583
- @TheCodingCvrlo made their first contribution in #5586
- @detail-app[bot] made their first contribution in #5599
- @metehan-slng made their first contribution in #5249
Full Changelog: https://github.com/livekit/agents/compare/livekit-agents@1.5.6...livekit-agents@1.5.7
livekit-agents@1.5.6
What's Changed
- Add Qwen 3 TTS support for Simplismart-livekit plugin by @simplipratik in #5474
- Add Inworld STT provider to livekit-plugins-inworld by @cshape in #5451
- (minimax): add new TTS models by @tinalenguyen in #5518
- feat(smallestai): add Pulse STT with real-time streaming and batch transcription by @harshitajain165 in #5312
- feat(avatar): add playback_started RPC for remote avatar workers by @longcw in #5511
- fix: clear _hist buffer in MovingAverage.reset() to prevent stale averages by @kuishou68 in #5522
- feat(mistral): migrate LLM to Conversations API with provider tools support by @jeanprbt in #5527
- livekit-agents@1.5.6 by @github-actions[bot] in #5528
New Contributors
- @simplipratik made their first contribution in #5474
- @harshitajain165 made their first contribution in #5312
- @kuishou68 made their first contribution in #5522
Full Changelog: https://github.com/livekit/agents/compare/livekit-agents@1.5.5...livekit-agents@1.5.6
livekit-agents@1.5.5
What's Changed
- feat(inference): STT diarization capabilities and speaker_id on TimedString, add xAI TTS support for inference by @russellmartin-livekit in #5438
- [inworld] timed_string to no longer have trailing spaces by @ianbbqzy in #5470
- fix(examples): update e2ee.py to use encryption kwarg and env var by @aryeila in #5469
- chore(deps): update dependency pillow to v12.2.0 [security] by @renovate[bot] in #5440
- fix(tests): update preemptive_generation mock to use dict by @longcw in #5468
- fix(telemetry): bound OTel provider shutdown to avoid watchdog kills by @theomonnom in #5471
- feat(assemblyai): log connection lifecycle, silence, and session correlators by @dlange-aai in #5476
- fix: strip markdown emphasis adjacent to punctuation by @carschandler in #5481
- (aws realtime): add expiry check for cached credentials by @tinalenguyen in #5485
- (hedra): note deprecation in readme by @tinalenguyen in #5475
- (deepgram sttv2): add flux-general-multi support by @tinalenguyen in #5486
- (xai stt): expose endpointing param to user by @tinalenguyen in #5493
- fix(room-io): ownership-aware FrameProcessor lifecycle management by @longcw in #5467
- (openai responses): drop prompt_cache_retention in received responses by @tinalenguyen in #5502
- feat(avatar): add AvatarSession base class, warn on sync mis-wire by @longcw in #5499
- livekit-agents@1.5.5 by @github-actions[bot] in #5503
New Contributors
- @carschandler made their first contribution in #5481
Full Changelog: https://github.com/livekit/agents/compare/livekit-agents@1.5.4...livekit-agents@1.5.5
livekit-agents@1.5.4
New features
Preemptive generation: added more granular options
Refines default behavior for preemptive generation to better handle long or intermittent user speech, reducing unnecessary downstream inference and associated cost increases.
Also introduces PreemptiveGenerationOptions for developers who need fine-grained control over this behavior.
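For instance, the options can be assembled as a plain dict (a hedged sketch: since PreemptiveGenerationOptions is a TypedDict it is passed as a dict; that AgentSession accepts it via the preemptive_generation parameter is an assumption based on the preemptive_generation=False form shown in the 1.5.0 notes):

```python
# Hypothetical usage sketch; key names come from the TypedDict below.
opts = {
    "enabled": True,
    "preemptive_tts": False,     # LLM-only preemption (the default)
    "max_speech_duration": 8.0,  # skip preemption for long utterances
    "max_retries": 2,
}
# session = AgentSession(preemptive_generation=opts)  # assumed wiring
print(sorted(opts))
```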
```python
class PreemptiveGenerationOptions(TypedDict, total=False):
    """Configuration for preemptive generation."""

    enabled: bool
    """Whether preemptive generation is enabled. Defaults to ``True``."""

    preemptive_tts: bool
    """Whether to also run TTS preemptively before the turn is confirmed.
    When ``False`` (default), only LLM runs preemptively; TTS starts once the
    turn is confirmed and the speech is scheduled."""

    max_speech_duration: float
    """Maximum user speech duration (s) for which preemptive generation
    is attempted. Beyond this threshold, preemptive generation is skipped
    since long utterances are more likely to change and users may expect
    slower responses. Defaults to ``10.0``."""

    max_retries: int
    """Maximum number of preemptive generation attempts per user turn.
    The counter resets when the turn completes. Defaults to ``3``."""
```
What's Changed
Full Changelog: https://github.com/livekit/agents/compare/livekit-agents@1.5.3...livekit-agents@1.5.4
livekit-agents@1.5.3
Note
livekit-agents 1.5 introduced many new features. You can check out the changelog here.
What's Changed
- feat(amd): add OTEL span and tag for AMD by @chenghao-mou in #5376
- fix(openai): prepend session instructions in realtime generate_reply by @longcw in #5394
- fix: AgentTask deadlock when on_enter awaits generate_reply that triggers another AgentTask by @longcw in #5377
- telemetry: emit OTel span events for developer-role messages by @joaquinhuigomez in #5403
- feat(realtime): reuse realtime session across agent handoffs if supported by @longcw in #5229
- fix(llm): handle double-encoded JSON tool arguments from providers by @prettyprettyprettygood in #5409
- chore: exposed the max session duration in with_azure() function by @k-1208 in #5383
- (gemini live 3.1): fix tool responses by @tinalenguyen in #5413
- fix(google): capture usage_metadata before early continues in streaming by @Panmax in #5404
- fix(voice): block new user turns immediately on update_agent() to prevent transition delay by @Panmax in #5396
- fix(tests): update drive thru instructions by @chenghao-mou in #5405
- feat(inference): handle preflight_transcript in inference STT plugin by @adrian-cowham in #5412
- fix(aws): unwrap doubly-encoded JSON tool arguments from Nova Sonic by @rililinx in #5411
- chore: pin GHA by commit by @davidzhao in #5415
- chore(deps): update dependency langchain-core to v1.2.28 [security] by @renovate[bot] in #5417
- chore(deps): update dependency aiohttp to v3.13.4 [security] by @renovate[bot] in #5416
- chore(deps): update dependency nltk to v3.9.4 [security] by @renovate[bot] in #5418
- (azure openai): ensure gpt-realtime-1.5 compatibility by @tinalenguyen in #5407
- chore(deps): update github workflows (major) by @renovate[bot] in #5424
- update: Sarvam STT - add verbose error logging and remove retry connection by @dhruvladia-sarvam in #5373
- fix(inworld): do not leak connections when cancelled by @davidzhao in #5427
- feat: add service_tier parameter to Responses API LLM by @piyush-gambhir in #5346
- Feature/krisp viva sdk support by @realgarik in #4370
- fix: empty transcript blocks commit_user_turn until timeout by @longcw in #5429
- fix: allow multiple AsyncToolsets by deduplicating management tools by @longcw in #5369
- feat(beta/workflows): add InstructionParts for modular instruction customization by @longcw in #5077
- add ToolSearchToolset and ToolProxyToolset for dynamic tool discovery by @longcw in #5140
- Feature - Configurable session close transcript timeout by @bml1g12 in #5328
- Fix FrameProcessor lifecycle for selector based noise cancellation by @Topherhindman in #5433
- feat: add Runway Characters avatar plugin by @robinandeer in #5355
- Rename e2ee to encryption in JobContext.connect by @longcw in #5454
- chore: reduce renovate noise by @davidzhao in #5421
- fix(liveavatar): wait for connected state and chunk audio before sending by @dyi1 in #5453
- (phonic): support realtimemodel say() by @tinalenguyen in #5293
- feat: add Cerebras LLM plugin by @u9g in #5456
- (google tts): add "gemini-3.1-flash-tts-preview" model by @tinalenguyen in #5459
- (hedra) remove from examples and raise exception by @tinalenguyen in #5460
- xai stt by @tinalenguyen in #5458
- feat(openai): expose max_output_tokens on Responses API LLM by @piyush-gambhir in #5449
- (xai stt): pass diarization capability + minor fix by @tinalenguyen in #5461
- Adding xAI Grok llm support for inference by @russellmartin-livekit in #5201
- (release workflow): allow spacing in dependency by @tinalenguyen in #5463
- livekit-agents@1.5.3 by @github-actions[bot] in #5464
New Contributors
- @joaquinhuigomez made their first contribution in #5403
- @k-1208 made their first contribution in #5383
- @rililinx made their first contribution in #5411
- @renovate[bot] made their first contribution in #5417
- @realgarik made their first contribution in #4370
- @robinandeer made their first contribution in #5355
- @dyi1 made their first contribution in #5453
- @u9g made their first contribution in #5456
Full Changelog: https://github.com/livekit/agents/compare/livekit-agents@1.5.2...livekit-agents@1.5.3
livekit-agents@1.5.2
Note
livekit-agents 1.5 introduced many new features. You can check out the changelog here.
What's Changed
- Update Phonic generate_reply timeout to 10 seconds by @qionghuang6 in #5205
- fix: pass prometheus_multiproc_dir in from_server_options initialization by @ivanbalingit in #5195
- feat(mistralai): upgrade to SDK v2 by @Pauldevillers in #5163
- (deepgram sttv2): validate eager_eot_threshold value by @tinalenguyen in #5216
- Add WebSocket streaming support to Baseten TTS plugin by @iancarrasco-b10 in #4741
- fix: allow codec format specification via the user for Sarvam TTS by @pUrGe12 in #5209
- emit agent handoff from conversation_item_added by @tinalenguyen in #5218
- fix(llm): surface validation error details to LLM on function call argument failures by @Lyt060814 in #5193
- fix(cli): update log level width in console mode by @chenghao-mou in #5224
- fix(utils): preserve type annotations in deprecate_params by @longcw in #5200
- fix(test): replace oai with deepgram and fix broken tests by @chenghao-mou in #5225
- feat(voice): reuse STT connection across agent handoffs by @longcw in #5093
- feat(google): add VertexRAGRetrieval provider tool by @youpesh in #5222
- fix: ensure MCP client enter/exit run in the same task by @longcw in #5223
- feat(assemblyai): add domain parameter for Medical Mode by @m-ods in #5208
- fix: Nova Sonic interactive context bugs and dynamic tool support by @prettyprettyprettygood in #5220
- (google realtime): add gemini-3.1-flash-live-preview model by @tinalenguyen in #5233
- fix(utils): improve type annotation for deprecate_params decorator by @longcw in #5244
- fix: expose endpointing_opts in AgentSession.update_options() by @longcw in #5243
- Fix/stt fallback adapter propagate aligned transcript by @miladmnasr in #5237
- feat(mistral): add voxtral TTS support by @jeanprbt in #5245
- feat(anthropic): support strict tool use schema by @roshan-shaik-ml in #5259
- Baseten Plugin Update: fix metadata schema, add chain_id support, and improve response parsing by @jiegong-fde in #4889
- feat(upliftai): add support for phrase replacement config id by @zaidqureshi2 in #5261
- feat(soniox): expose max_endpoint_delay_ms option by @pstrav in #5214
- fix: prevent TTS retry after partial audio and replay input on retry by @longcw in #5242
- fix: only start session host when it's primary session by @longcw in #5241
- fix: prevent CancelledError from propagating to unrelated Tee peers by @longcw in #5273
- fix: prevent AttributeError in ThreadJobExecutor.logging_extra() by @longcw in #5277
- fix(openai): close current generation channels on realtime reconnect by @longcw in #5276
- fix(recorder): guard against empty agent speech frames by @chenghao-mou in #5279
- fix(stt): reset VAD when STT sends EOT by @chenghao-mou in #5095
- feat(anam): add avatarModel config support by @sr-anam in #5272
- fix: catch TimeoutError from drain() so aclose() always runs by @seglo in #5282
- (gemini-3.1-flash-live-preview): add warning for generate_reply by @tinalenguyen in #5286
- feat(mistralai): add ref_audio support to Voxtral TTS for zero-shot voice cloning by @EtienneLescot in #5278
- fix(core): reset user state to listening when audio is disabled by @chenghao-mou in #5198
- append generate_reply instructions as system msg and convert it to user msg if unsupported by @longcw in #5287
- add AsyncToolset by @longcw in #5127
- fix(core): fix BackgroundAudioPlayer.play() hanging indefinitely by @theomonnom in #5299
- fix(cli): prevent api_key/api_secret from leaking in tracebacks by @theomonnom in #5300
- (phonic) Update languages fields by @qionghuang6 in #5285
- fix(core): reduce TTS output buffering latency by @theomonnom in #5292
- add session_end_timeout and gracefully cancel entrypoint on shutdown by @theomonnom in #4580
- feat: OTEL metrics for latencies, usage, and connection timing by @theomonnom in #4891
- evals: custom judges, tag metadata, and OTEL improvements by @theomonnom in #5306
- fix is_context_type for generic RunContext types by @theomonnom in #5307
- add 7-day uv cooldown by @chenghao-mou in #5290
- fix(openai realtime): support per-response tool_choice in realtime sessions by @longcw in #5211
- use delta aggregation temporality for otel metrics by @paulwe in #5314
- (phonic) Add min_words_to_interrupt to Phonic plugin options by @qionghuang6 in #5304
- add tag field to evaluation OTEL log records by @theomonnom in #5315
- docs: add example agent replies to AsyncToolset by @longcw in #5313
- fix(cartesia): handle flush_done message in TTS _recv_task by @Panmax in #5321
- fix(voice): make function call history preservation configurable in AgentTask by @GopalGB in #5288
- fix: convert oneOf to anyOf in strict schema for discriminated unions by @longcw in #5324
- (gemini realtime): add warnings in update_chat_ctx and update_instructions by @tinalenguyen in #5332
- fix: wait_for_participant waits until participant is fully active by @davidzhao in #5271
- feat: answering machine detection by @chenghao-mou in #4906
- feat: expose service_tier in CompletionUsage from OpenAI Responses API by @piyush-gambhir in #5341
- fix: add PARTICIPANT_KIND_CONNECTOR to default participant kinds by @anunaym14 in #5339
- feat/sarvam-llm-openai-compatible-integration by @dhruvladia-sarvam in #5069
- feat(azure-stt): Possibility to change segmentation options during a call by @rafallezanko in #5323
- fix(sarvam): sync missing API params, fix value ranges, and update models by @Namit1867 in #5347
- (xai tts): update fields and ws setup by @tinalenguyen in #5350
- fix(smallestai): add lightning-v3.1 endpoint routing by @sg-siddhant in #5330
- feat(inference): add debug/identification headers to inference requests by @adrian-cowham in #5337
- Move community plugins to livekit-plugins/community/ by @theomonnom in #5250
- feat: support per-response tools in generate_reply by @longcw in #5310
- fix xAI realtime update chat ctx by @longcw in #5320
- Fix RoomIO teardown listener cleanup by @sindarknave in #5357
- feat(mistral): support voxtral realtime streaming stt & modernize mistral plugin by @jeanprbt in #5289
- fix: say() with missing audio file hangs forever and blocks speech queue by @theomonnom in #5358
- add prompt_cache_retention chat completion option to inference by @s-hamdananwar in #5370
- Add Murf as optional dep by @royalfig in #5334
- feat(core): Support multiple provider keys in extra_content serialization by @adrian-cowham in #5374
- ci: add PyPI publish workflow with trusted publishing by @theomonnom in #5379
- feat: Add D-ID avatar plugin by @osimhi213 in #5232
- ci: fix tag checkout and discover glob by @theomonnom in #5381
- feat(rime): add mistv3 model support by @mcullan in #5298
- ci: fix update_versions.py invocation by @theomonnom i...
livekit-agents@1.5.1
Note
livekit-agents 1.5 introduced many new features. You can check out the changelog here.
What's Changed
- fix azure openai realtime support & add realtime models tests by @theomonnom in #5168
- fix(core): version mismatch due to bad merge by @chenghao-mou in #5176
- fix(turn-detector): relax transformers upper bound to allow 5.x by @gdoermann in #5174
- (gladia & soniox): add translation support by @tinalenguyen in #5148
- feat(agents): support LIVEKIT_OBSERVABILITY_URL for custom observability endpoints by @theomonnom in #5179
- (xai tts): update websocket endpoint by @tinalenguyen in #5180
- fix(core): restore chat topic support in room IO by @chenghao-mou in #5181
- Unskip Tool Call Items before Summarization in Task Group by @toubatbrian in #5169
- add sdk_version to SessionReport for observability by @theomonnom in #5182
- feat(hamming): add hamming monitoring plugin package by @duchammingai in #5135
- chore(mypy): enable mypy cache in type checking by @chenghao-mou in #5192
- fix: expose Chirp 3 google STT endpoint sensitivity by @karlsonlee-livekit in #5196
- add MCPToolset by @longcw in #5138
- Feat/personaplex plugin by @milanperovic in #4660
- fix: skip redundant realtime events in OpenAI plugin by @theomonnom in #5204
- feat: enable AGC by default on RoomInput audio by @theomonnom in #5185
- bump minimum livekit sdk version to 1.1.3 by @theomonnom in #5206
- livekit-agents 1.5.1 by @theomonnom in #5207
New Contributors
- @duchammingai made their first contribution in #5135
- @karlsonlee-livekit made their first contribution in #5196
- @milanperovic made their first contribution in #4660
Full Changelog: https://github.com/livekit/agents/compare/livekit-agents@1.5.0...livekit-agents@1.5.1
livekit-agents@1.5.0
Highlights
Preemptive generation is now enabled by default
Preemptive generation starts LLM and TTS inference before the end of a user's turn is detected, reducing overall latency.
To disable it:
```python
session = AgentSession(preemptive_generation=False)
```
Adaptive Interruption Handling
The headline feature of v1.5.0: an audio-based ML model that distinguishes genuine user interruptions from incidental sounds like backchannels ("mm-hmm"), coughs, sighs, or background noise. Enabled by default; no configuration needed.
Key stats:
- 86% precision and 100% recall at 500ms overlapping speech
- Rejects 51% of traditional VAD false positives
- Detects true interruptions 64% faster than VAD alone
- Inference completes in 30ms or less
When a false interruption is detected, the agent automatically resumes playback from where it left off; no re-generation needed.
To opt out and use VAD-only interruption:
```python
session = AgentSession(
    ...,
    turn_handling=TurnHandlingOptions(
        interruption={
            "mode": "vad",
        },
    ),
)
```
Blog post: https://livekit.com/blog/adaptive-interruption-handling
Dynamic Endpointing
Endpointing delays now adapt to each conversation's natural rhythm. Instead of a fixed silence threshold, the agent uses an exponential moving average of pause durations to dynamically adjust when it considers the user's turn complete.
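The adaptation described above can be sketched as an EMA over pause durations, clamped to the configured bounds. This is illustrative only, not the library's implementation; the alpha smoothing factor here mirrors the dynamic endpointing alpha parameter mentioned elsewhere in these notes:

```python
# Illustrative EMA sketch: blend each observed pause into a running
# average, then clamp the resulting delay to [min_delay, max_delay].
def make_dynamic_delay(alpha: float = 0.5,
                       min_delay: float = 0.3,
                       max_delay: float = 3.0):
    ema = None

    def update(pause_duration: float) -> float:
        nonlocal ema
        # Seed with the first observation, then blend subsequent ones.
        ema = pause_duration if ema is None else (
            alpha * pause_duration + (1 - alpha) * ema
        )
        return min(max(ema, min_delay), max_delay)

    return update

update = make_dynamic_delay()
delays = [round(update(p), 4) for p in (0.2, 0.5, 1.2, 0.8)]
print(delays)  # short pauses pull the delay down, long pauses push it up
```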
```python
session = AgentSession(
    ...,
    turn_handling=TurnHandlingOptions(
        endpointing={
            "mode": "dynamic",
            "min_delay": 0.3,
            "max_delay": 3.0,
        },
    ),
)
```
New TurnHandlingOptions API
Endpointing and interruption settings are now consolidated into a single TurnHandlingOptions dict passed to AgentSession. Old keyword arguments (min_endpointing_delay, allow_interruptions, etc.) still work but are deprecated and will emit warnings.
```python
session = AgentSession(
    turn_handling={
        "turn_detection": "vad",
        "endpointing": {"min_delay": 0.5, "max_delay": 3.0},
        "interruption": {"enabled": True, "mode": "adaptive"},
    },
)
```
Session Usage Tracking
New SessionUsageUpdatedEvent provides structured, per-model usage data (token counts, character counts, and audio durations) broken down by provider and model:
```python
@session.on("session_usage_updated")
def on_usage(ev: SessionUsageUpdatedEvent):
    for usage in ev.usage.model_usage:
        print(f"{usage.provider}/{usage.model}: {usage}")
```
Usage types: LLMModelUsage, TTSModelUsage, STTModelUsage, InterruptionModelUsage.
You can also access aggregated usage at any time via the session.usage property:
```python
usage = session.usage
for model_usage in usage.model_usage:
    print(model_usage)
```
Usage data is also included in SessionReport (via model_usage), so it's available in post-session telemetry and reporting out of the box.
Per-Turn Latency on ChatMessage.metrics
Each ChatMessage now carries a metrics field (MetricsReport) with per-turn latency data:
- transcription_delay: time to obtain the transcript after end of speech
- end_of_turn_delay: time between end of speech and the turn decision
- on_user_turn_completed_delay: time spent in the developer callback
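A minimal sketch of working with these fields. Only the field names come from the notes above; the MetricsReport class here is a hypothetical stand-in, and summing the delays is just one illustrative way to get a rough end-to-end figure:

```python
from dataclasses import dataclass

# Hypothetical stand-in for the real MetricsReport; field names are
# taken from the release notes, the class layout is an assumption.
@dataclass
class MetricsReport:
    transcription_delay: float
    end_of_turn_delay: float
    on_user_turn_completed_delay: float

    def total_turn_latency(self) -> float:
        # Sum the per-turn delays for a rough end-to-end latency figure.
        return (self.transcription_delay
                + self.end_of_turn_delay
                + self.on_user_turn_completed_delay)

report = MetricsReport(0.12, 0.45, 0.03)
print(round(report.total_turn_latency(), 2))
```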
Action-Aware Chat Context Summarization
Context summarization now includes function calls and their outputs when building summaries, preserving tool-use context across the conversation window.
Configurable Log Level
Set the agent log level via LIVEKIT_LOG_LEVEL environment variable or through ServerOptions, without touching your code.
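For example (the variable name comes from the notes above; "debug" is an assumed example value, and setting it in-process only works if done before the agent reads it at startup):

```python
import os

# Illustrative: export LIVEKIT_LOG_LEVEL before the worker starts, or
# set it in-process early. "debug" is an assumed level value.
os.environ["LIVEKIT_LOG_LEVEL"] = "debug"
print(os.environ["LIVEKIT_LOG_LEVEL"])
```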
Deprecations
| Deprecated | Replacement | Notes |
|---|---|---|
| metrics_collected event | session_usage_updated event + ChatMessage.metrics | Usage/cost data moves to session_usage_updated; per-turn latency moves to ChatMessage.metrics. Old listeners still work with a deprecation warning. |
| UsageCollector | ModelUsageCollector | New collector supports per-model/provider breakdown |
| UsageSummary | LLMModelUsage, TTSModelUsage, STTModelUsage | Typed per-service usage classes |
| RealtimeModelBeta | RealtimeModel | Beta API removed |
| AgentFalseInterruptionEvent.message / .extra_instructions | Automatic resume via adaptive interruption | Accessing these fields logs a deprecation warning |
| AgentSession kwargs: min_endpointing_delay, max_endpointing_delay, allow_interruptions, discard_audio_if_uninterruptible, min_interruption_duration, min_interruption_words, turn_detection, false_interruption_timeout, resume_false_interruption | turn_handling=TurnHandlingOptions(...) | Old kwargs still work but emit deprecation warnings. Will be removed in v2.0. |
| Agent / AgentTask kwargs: turn_detection, min_endpointing_delay, max_endpointing_delay, allow_interruptions | turn_handling=TurnHandlingOptions(...) | Same migration path as AgentSession. Will be removed in future versions. |
Complete changelog
- (xai): add grok text to speech api to readme by @tinalenguyen in #5125
- Remove Gemini 2.0 models from inference gateway types by @Shubhrakanti in #5133
- feat: support log level via ServerOptions and LIVEKIT_LOG_LEVEL env var by @onurburak9 in #5112
- fix: preserve 'type' field in TaskGroup JSON schema enum items by @weiguangli-io in #5073
- feat(assemblyai): expose session ID from Begin event by @dlange-aai in #5132
- fix: strip empty {} entries from anyOf/oneOf in strict JSON schema by @theomonnom in #5137
- fix: update_instructions() now reflected in tool call response generation by @weiguangli-io in #5072
- Make chat context summarization action-aware by @toubatbrian in #5099
- fix(realtime): sync remote items to local chat_ctx with placeholders to prevent in-flight deletion by @longcw in #5114
- Set _speech_start_time when VAD START_OF_SPEECH activates by @hudson-worden in #5027
- Fix(inworld): "Context not found" errors caused by invalid enum parameter types by @ianbbqzy in #5153
- increase generate_reply timeout & remove RealtimeModelBeta by @theomonnom in #5149
- add livekit-blockguard plugin by @theomonnom in #5023
- openai: add max_completion_tokens to with_azure() by @abhishekranjan-bluemachines in #5143
- Restrict mistralai dependency to use v1 sdk by @csanz91 in #5116
- feat(assemblyai): add DEBUG-level diagnostic logging by @dlange-aai in #5146
- Fix Phonic generate_reply to resolve with the current GenerationCreatedEvent by @qionghuang6 in #5147
- fix(11labs): add empty keepalive message and remove final duplicates by @chenghao-mou in #5139
- AGT-2182: Add adaptive interruption handling and dynamic endpointing by @chenghao-mou in #4771
- livekit-agents 1.5.0 by @theomonnom in #5165
New Contributors
- @onurburak9 made their first contribution in #5112
- @weiguangli-io made their first contribution in #5073
- @abhishekranjan-bluemachines made their first contribution in #5143
- @csanz91 made their first contribution in #5116
Full Changelog: https://github.com/livekit/agents/compare/livekit-agents@1.4.6...livekit-agents@1.5.0
livekit-agents@1.4.6
What's Changed
- fix(types): replace TypeGuard with TypeIs in is_given for bidirectional narrowing by @longcw in #5079
- [inworld] websocket _recv_loop to flush the audio immediately by @ianbbqzy in #5071
- fix: include null in enum array for nullable enum schemas by @MSameerAbbas in #5080
- (openai chat completions): drop reasoning_effort when function tools are present by @tinalenguyen in #5088
- (google realtime): replace deprecated mediaChunks by @tinalenguyen in #5089
- fix: omit required field in tool schema when function has no parameters by @longcw in #5082
- fix(sarvam-tts): correct mime_type from audio/mp3 to audio/wav by @shmundada93 in #5086
- add trunk_config to WarmTransferTask for SIP endpoint transfers by @longcw in #5016
- healthcare example by @tinalenguyen in #5031
- fix(openai): only reuse previous_response_id when pending tool calls are completed by @longcw in #5094
- feat(assemblyai): add speaker diarization support by @dlange-aai in #5074
- fix: prevent _cancel_speech_pause from poisoning subsequent user turns by @giulio-leone in #5101
- feat(google): support universal credential types in STT and TTS credentials_file by @rafallezanko in #5056
- Add Murf AI - TTS Plugin Support by @gaurav-murf in #3000
- feat(voice): add callable TextTransforms support with built-in replace transform by @longcw in #5104
- fix(eou): only reset speech/speaking time when no new speech by @chenghao-mou in #5083
- (xai): add tts by @tinalenguyen in #5120
- (xai tts): add language parameter by @tinalenguyen in #5122
- livekit-agents 1.4.6 by @theomonnom in #5123
New Contributors
- @shmundada93 made their first contribution in #5086
- @dlange-aai made their first contribution in #5074
- @gaurav-murf made their first contribution in #3000
Full Changelog: https://github.com/livekit/agents/compare/livekit-agents@1.4.5...livekit-agents@1.4.6