Releases: livekit/agents
livekit-agents@1.5.8
What's Changed
- feat(interruption): barge-in cooldown window for corrections by @chenghao-mou in #5269
- fix(amd): amd improvement (AGT-2777) by @chenghao-mou in #5584
- fix(warm_transfer): don't fall back to env var when sip_connection is set by @longcw in #5619
- fix(aws): wait for stream ready before sending audio start event by @lanazhang in #5626
- Fix Missing user message metrics (MetricsReport) due to early returns in _user_turn_completed_task and no initialization in on_end_of_turn by @hudson-worden in #5437
- fix(amd): missing stt start by @chenghao-mou in #5633
- fix: reduce overly eager call ending behavior by @davidzhao in #5630
- feat(fishaudio): use websocket API for faster inference by @davidzhao in #5629
- fix(observability): retry session recording upload by @paulwe in #5627
- fix(openai realtime): reject pending response future on error event by @longcw in #5576
- feat(inference): propagate STT extra to SpeechData.metadata by @russellmartin-livekit in #5639
- Update README.md by @theomonnom in #5640
- fix(amd): reset timer for late stt transcript by @chenghao-mou in #5637
- fix: end Runway realtime sessions on shutdown by @robinandeer in #5623
- ci(examples): add deploy workflow by @tinalenguyen in #5641
- feat(amd): add remote session event for amd AGT-2828 by @chenghao-mou in #5621
- Add Soniox TTS plugin by @matejmarinko-soniox in #5543
- (inworld tts): add new model by @tinalenguyen in #5646
- livekit-agents@1.5.8 by @github-actions[bot] in #5647
New Contributors
- @lanazhang made their first contribution in #5626
Full Changelog: https://github.com/livekit/agents/compare/livekit-agents@1.5.7...livekit-agents@1.5.8
livekit-agents@1.5.7
What's Changed
- fix(openai): forward session.update on RealtimeModel.update_options by @longcw in #5531
- fix(transcription): seed _start_wall_time fallback in aclose by @longcw in #5532
- Fix realtime reply generation after interruption by @jayeshp19 in #5526
- fix(cartesia): Move API key from Query Params to Headers by @charlotte-zhuang in #5516
- deepgram-stt: report connection-lifetime remainder so usage matches billing by @joaquinhuigomez in #5506
- feat(room-io): add json_format option for timed transcription output by @longcw in #5472
- feat(inference): add inference_class option to LLM for priority routing by @adrian-cowham in #5517
- chore: update default model for Anthropic LLM by @royalfig in #5539
- fix(voice): pause output when user starts speaking during thinking by @longcw in #5535
- feat(openai): add gpt-5.4-mini to model registry by @xtreme-sameer-vohra in #5540
- feat(assemblyai): warn when audio stops flowing to the WebSocket by @gsharp-aai in #5504
- feat(tts): add support for timestamps in Inference by @chenghao-mou in #5534
- docs: clarify RunResult.events testing surface by @Rul1an in #5525
- feat(stt): back-date START_OF_SPEECH onset via server-provided timestamp by @gsharp-aai in #5479
- feat(aws): add auto language detection and mid-stream language switch… by @cldsime in #5435
- (release workflow): add docs job by @tinalenguyen in #5551
- (liveavatar): add video_quality param by @tinalenguyen in #5552
- Add avatartalk plugin to optional dependencies by @bcherry in #5550
- fix(soniox): emit PREFLIGHT_TRANSCRIPT for preemptive LLM generation by @octo-patch in #5553
- feat(xai): support model selection in realtime, default to grok-voice-think-fast-1.0 by @Hormold in #5548
- Remove 'distil-whisper-large-v3-en' from STTModels by @vedevpatel in #5537
- fix: don't swallow _ExitCli during shutdown by @lawrence3699 in #5519
- feat: expose provider request ids on STT/TTS/LLM spans for debugging by @longcw in #5546
- chore(openai): remove STT.with_groq constructor by @davidzhao in #5555
- chore(deps): update github actions (major) by @renovate[bot] in #5558
- feat(mcp): allow updating headers on MCPServerHTTP by @longcw in #5559
- feat(metrics): add playback_latency metric by @longcw in #5524
- feat(endpointing): expose dynamic endpointing alpha parameter (AGT-2764) by @chenghao-mou in #5491
- fix(smallestai): use close_stream signal to properly terminate STT session by @harshitajain165 in #5562
- Hotfix; Updated default Avatar ID by @hari-truviz in #5568
- fix(gemini live): use parameters instead of parameters_json_schema for raw schema function tools by @longcw in #5560
- Stuck aclose() activity leading to stuck handoff by @svacatalisan in #4649
- fix(async_toolset): respect allow_interruptions when cancelling tool calls by @longcw in #5570
- update livekit rtc to 1.1.7 by @davidzhao in #5572
- feat(mistral): add connectors provider tool & fix realtime STT custom headers by @jeanprbt in #5575
- feat(openai): expose verbosity in Responses LLM by @AlessandroElyos in #5583
- fix(mistral): use conversations API statelessly by @TheCodingCvrlo in #5586
- support LIVEKIT_AGENT_NAME env var by @theomonnom in #5571
- fix(recorder): use libopus when possible by @chenghao-mou in #5579
- docs: add LIVEKIT_AGENT_NAME to environment variables by @detail-app[bot] in #5599
- fix(elevenlabs): use audio_format query param for STT realtime by @longcw in #5574
- fix: clear stale paused speech state across generation steps by @longcw in #5594
- fix: cancel Runway realtime sessions on shutdown by @robinandeer in #5612
- fix(inference): skip unknown message warning and rename event name by @chenghao-mou in #5614
- feat: add SLNG plugin for STT and TTS by @metehan-slng in #5249
- livekit-agents@1.5.7 by @github-actions[bot] in #5615
New Contributors
- @charlotte-zhuang made their first contribution in #5516
- @xtreme-sameer-vohra made their first contribution in #5540
- @Rul1an made their first contribution in #5525
- @cldsime made their first contribution in #5435
- @octo-patch made their first contribution in #5553
- @vedevpatel made their first contribution in #5537
- @lawrence3699 made their first contribution in #5519
- @svacatalisan made their first contribution in #4649
- @AlessandroElyos made their first contribution in #5583
- @TheCodingCvrlo made their first contribution in #5586
- @detail-app[bot] made their first contribution in #5599
- @metehan-slng made their first contribution in #5249
Full Changelog: https://github.com/livekit/agents/compare/livekit-agents@1.5.6...livekit-agents@1.5.7
livekit-agents@1.5.6
What's Changed
- Add Qwen 3 TTS support for Simplismart-livekit plugin by @simplipratik in #5474
- Add Inworld STT provider to livekit-plugins-inworld by @cshape in #5451
- (minimax): add new TTS models by @tinalenguyen in #5518
- feat(smallestai): add Pulse STT with real-time streaming and batch transcription by @harshitajain165 in #5312
- feat(avatar): add playback_started RPC for remote avatar workers by @longcw in #5511
- fix: clear _hist buffer in MovingAverage.reset() to prevent stale averages by @kuishou68 in #5522
- feat(mistral): migrate LLM to Conversations API with provider tools support by @jeanprbt in #5527
- livekit-agents@1.5.6 by @github-actions[bot] in #5528
New Contributors
- @simplipratik made their first contribution in #5474
- @harshitajain165 made their first contribution in #5312
- @kuishou68 made their first contribution in #5522
Full Changelog: https://github.com/livekit/agents/compare/livekit-agents@1.5.5...livekit-agents@1.5.6
livekit-agents@1.5.5
What's Changed
- feat(inference): STT diarization capabilities and speaker_id on TimedString, add xAI TTS support for inference by @russellmartin-livekit in #5438
- [inworld] timed_string to no longer have trailing spaces by @ianbbqzy in #5470
- fix(examples): update e2ee.py to use encryption kwarg and env var by @aryeila in #5469
- chore(deps): update dependency pillow to v12.2.0 [security] by @renovate[bot] in #5440
- fix(tests): update preemptive_generation mock to use dict by @longcw in #5468
- fix(telemetry): bound OTel provider shutdown to avoid watchdog kills by @theomonnom in #5471
- feat(assemblyai): log connection lifecycle, silence, and session correlators by @dlange-aai in #5476
- fix: strip markdown emphasis adjacent to punctuation by @carschandler in #5481
- (aws realtime): add expiry check for cached credentials by @tinalenguyen in #5485
- (hedra): note deprecation in readme by @tinalenguyen in #5475
- (deepgram sttv2): add flux-general-multi support by @tinalenguyen in #5486
- (xai stt): expose endpointing param to user by @tinalenguyen in #5493
- fix(room-io): ownership-aware FrameProcessor lifecycle management by @longcw in #5467
- (openai responses): drop prompt_cache_retention in received responses by @tinalenguyen in #5502
- feat(avatar): add AvatarSession base class, warn on sync mis-wire by @longcw in #5499
- livekit-agents@1.5.5 by @github-actions[bot] in #5503
New Contributors
- @carschandler made their first contribution in #5481
Full Changelog: https://github.com/livekit/agents/compare/livekit-agents@1.5.4...livekit-agents@1.5.5
livekit-agents@1.5.4
New features
Preemptive generation: added more granular options
Refines default behavior for preemptive generation to better handle long or intermittent user speech, reducing unnecessary downstream inference and associated cost increases.
Also introduces PreemptiveGenerationOptions for developers who need fine-grained control over this behavior.
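For instance, the options can be assembled as a plain dict (a hedged sketch: since PreemptiveGenerationOptions is a TypedDict it is passed as a dict; that AgentSession accepts it via the preemptive_generation parameter is an assumption based on the preemptive_generation=False form shown in the 1.5.0 notes):

```python
# Hypothetical usage sketch; key names come from the TypedDict below.
opts = {
    "enabled": True,
    "preemptive_tts": False,     # LLM-only preemption (the default)
    "max_speech_duration": 8.0,  # skip preemption for long utterances
    "max_retries": 2,
}
# session = AgentSession(preemptive_generation=opts)  # assumed wiring
print(sorted(opts))
```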
```python
class PreemptiveGenerationOptions(TypedDict, total=False):
    """Configuration for preemptive generation."""

    enabled: bool
    """Whether preemptive generation is enabled. Defaults to ``True``."""

    preemptive_tts: bool
    """Whether to also run TTS preemptively before the turn is confirmed.
    When ``False`` (default), only LLM runs preemptively; TTS starts once the
    turn is confirmed and the speech is scheduled."""

    max_speech_duration: float
    """Maximum user speech duration (s) for which preemptive generation
    is attempted. Beyond this threshold, preemptive generation is skipped
    since long utterances are more likely to change and users may expect
    slower responses. Defaults to ``10.0``."""

    max_retries: int
    """Maximum number of preemptive generation attempts per user turn.
    The counter resets when the turn completes. Defaults to ``3``."""
```
What's Changed
Full Changelog: https://github.com/livekit/agents/compare/livekit-agents@1.5.3...livekit-agents@1.5.4
livekit-agents@1.5.3
Note
livekit-agents 1.5 introduced many new features. You can check out the changelog here.
What's Changed
- feat(amd): add OTEL span and tag for AMD by @chenghao-mou in #5376
- fix(openai): prepend session instructions in realtime generate_reply by @longcw in #5394
- fix: AgentTask deadlock when on_enter awaits generate_reply that triggers another AgentTask by @longcw in #5377
- telemetry: emit OTel span events for developer-role messages by @joaquinhuigomez in #5403
- feat(realtime): reuse realtime session across agent handoffs if supported by @longcw in #5229
- fix(llm): handle double-encoded JSON tool arguments from providers by @prettyprettyprettygood in #5409
- chore: exposed the max session duration in with_azure() function by @k-1208 in #5383
- (gemini live 3.1): fix tool responses by @tinalenguyen in #5413
- fix(google): capture usage_metadata before early continues in streaming by @Panmax in #5404
- fix(voice): block new user turns immediately on update_agent() to prevent transition delay by @Panmax in #5396
- fix(tests): update drive thru instructions by @chenghao-mou in #5405
- feat(inference): handle preflight_transcript in inference STT plugin by @adrian-cowham in #5412
- fix(aws): unwrap doubly-encoded JSON tool arguments from Nova Sonic by @rililinx in #5411
- chore: pin GHA by commit by @davidzhao in #5415
- chore(deps): update dependency langchain-core to v1.2.28 [security] by @renovate[bot] in #5417
- chore(deps): update dependency aiohttp to v3.13.4 [security] by @renovate[bot] in #5416
- chore(deps): update dependency nltk to v3.9.4 [security] by @renovate[bot] in #5418
- (azure openai): ensure gpt-realtime-1.5 compatibility by @tinalenguyen in #5407
- chore(deps): update github workflows (major) by @renovate[bot] in #5424
- update: Sarvam STT - add verbose error logging and remove retry connection by @dhruvladia-sarvam in #5373
- fix(inworld): do not leak connections when cancelled by @davidzhao in #5427
- feat: add service_tier parameter to Responses API LLM by @piyush-gambhir in #5346
- Feature/krisp viva sdk support by @realgarik in #4370
- fix: empty transcript blocks commit_user_turn until timeout by @longcw in #5429
- fix: allow multiple AsyncToolsets by deduplicating management tools by @longcw in #5369
- feat(beta/workflows): add InstructionParts for modular instruction customization by @longcw in #5077
- add ToolSearchToolset and ToolProxyToolset for dynamic tool discovery by @longcw in #5140
- Feature - Configurable session close transcript timeout by @bml1g12 in #5328
- Fix FrameProcessor lifecycle for selector based noise cancellation by @Topherhindman in #5433
- feat: add Runway Characters avatar plugin by @robinandeer in #5355
- Rename e2ee to encryption in JobContext.connect by @longcw in #5454
- chore: reduce renovate noise by @davidzhao in #5421
- fix(liveavatar): wait for connected state and chunk audio before sending by @dyi1 in #5453
- (phonic): support realtimemodel say() by @tinalenguyen in #5293
- feat: add Cerebras LLM plugin by @u9g in #5456
- (google tts): add "gemini-3.1-flash-tts-preview" model by @tinalenguyen in #5459
- (hedra) remove from examples and raise exception by @tinalenguyen in #5460
- xai stt by @tinalenguyen in #5458
- feat(openai): expose max_output_tokens on Responses API LLM by @piyush-gambhir in #5449
- (xai stt): pass diarization capability + minor fix by @tinalenguyen in #5461
- Adding xAI Grok llm support for inference by @russellmartin-livekit in #5201
- (release workflow): allow spacing in dependency by @tinalenguyen in #5463
- livekit-agents@1.5.3 by @github-actions[bot] in #5464
New Contributors
- @joaquinhuigomez made their first contribution in #5403
- @k-1208 made their first contribution in #5383
- @rililinx made their first contribution in #5411
- @renovate[bot] made their first contribution in #5417
- @realgarik made their first contribution in #4370
- @robinandeer made their first contribution in #5355
- @dyi1 made their first contribution in #5453
- @u9g made their first contribution in #5456
Full Changelog: https://github.com/livekit/agents/compare/livekit-agents@1.5.2...livekit-agents@1.5.3
livekit-agents@1.5.2
Note
livekit-agents 1.5 introduced many new features. You can check out the changelog here.
What's Changed
- Update Phonic generate_reply timeout to 10 seconds by @qionghuang6 in #5205
- fix: pass prometheus_multiproc_dir in from_server_options initialization by @ivanbalingit in #5195
- feat(mistralai): upgrade to SDK v2 by @Pauldevillers in #5163
- (deepgram sttv2): validate eager_eot_threshold value by @tinalenguyen in #5216
- Add WebSocket streaming support to Baseten TTS plugin by @iancarrasco-b10 in #4741
- fix: allow codec format specification via the user for Sarvam TTS by @pUrGe12 in #5209
- emit agent handoff from conversation_item_added by @tinalenguyen in #5218
- fix(llm): surface validation error details to LLM on function call argument failures by @Lyt060814 in #5193
- fix(cli): update log level width in console mode by @chenghao-mou in #5224
- fix(utils): preserve type annotations in deprecate_params by @longcw in #5200
- fix(test): replace oai with deepgram and fix broken tests by @chenghao-mou in #5225
- feat(voice): reuse STT connection across agent handoffs by @longcw in #5093
- feat(google): add VertexRAGRetrieval provider tool by @youpesh in #5222
- fix: ensure MCP client enter/exit run in the same task by @longcw in #5223
- feat(assemblyai): add domain parameter for Medical Mode by @m-ods in #5208
- fix: Nova Sonic interactive context bugs and dynamic tool support by @prettyprettyprettygood in #5220
- (google realtime): add gemini-3.1-flash-live-preview model by @tinalenguyen in #5233
- fix(utils): improve type annotation for deprecate_params decorator by @longcw in #5244
- fix: expose endpointing_opts in AgentSession.update_options() by @longcw in #5243
- Fix/stt fallback adapter propagate aligned transcript by @miladmnasr in #5237
- feat(mistral): add voxtral TTS support by @jeanprbt in #5245
- feat(anthropic): support strict tool use schema by @roshan-shaik-ml in #5259
- Baseten Plugin Update: fix metadata schema, add chain_id support, and improve response parsing by @jiegong-fde in #4889
- feat(upliftai): add support for phrase replacement config id by @zaidqureshi2 in #5261
- feat(soniox): expose max_endpoint_delay_ms option by @pstrav in #5214
- fix: prevent TTS retry after partial audio and replay input on retry by @longcw in #5242
- fix: only start session host when it's primary session by @longcw in #5241
- fix: prevent CancelledError from propagating to unrelated Tee peers by @longcw in #5273
- fix: prevent AttributeError in ThreadJobExecutor.logging_extra() by @longcw in #5277
- fix(openai): close current generation channels on realtime reconnect by @longcw in #5276
- fix(recorder): guard against empty agent speech frames by @chenghao-mou in #5279
- fix(stt): reset VAD when STT sends EOT by @chenghao-mou in #5095
- feat(anam): add avatarModel config support by @sr-anam in #5272
- fix: catch TimeoutError from drain() so aclose() always runs by @seglo in #5282
- (gemini-3.1-flash-live-preview): add warning for generate_reply by @tinalenguyen in #5286
- feat(mistralai): add ref_audio support to Voxtral TTS for zero-shot voice cloning by @EtienneLescot in #5278
- fix(core): reset user state to listening when audio is disabled by @chenghao-mou in #5198
- append generate_reply instructions as system msg and convert it to user msg if unsupported by @longcw in #5287
- add AsyncToolset by @longcw in #5127
- fix(core): fix BackgroundAudioPlayer.play() hanging indefinitely by @theomonnom in #5299
- fix(cli): prevent api_key/api_secret from leaking in tracebacks by @theomonnom in #5300
- (phonic) Update languages fields by @qionghuang6 in #5285
- fix(core): reduce TTS output buffering latency by @theomonnom in #5292
- add session_end_timeout and gracefully cancel entrypoint on shutdown by @theomonnom in #4580
- feat: OTEL metrics for latencies, usage, and connection timing by @theomonnom in #4891
- evals: custom judges, tag metadata, and OTEL improvements by @theomonnom in #5306
- fix is_context_type for generic RunContext types by @theomonnom in #5307
- add 7-day uv cooldown by @chenghao-mou in #5290
- fix(openai realtime): support per-response tool_choice in realtime sessions by @longcw in #5211
- use delta aggregation temporality for otel metrics by @paulwe in #5314
- (phonic) Add min_words_to_interrupt to Phonic plugin options by @qionghuang6 in #5304
- add tag field to evaluation OTEL log records by @theomonnom in #5315
- docs: add example agent replies to AsyncToolset by @longcw in #5313
- fix(cartesia): handle flush_done message in TTS _recv_task by @Panmax in #5321
- fix(voice): make function call history preservation configurable in AgentTask by @GopalGB in #5288
- fix: convert oneOf to anyOf in strict schema for discriminated unions by @longcw in #5324
- (gemini realtime): add warnings in update_chat_ctx and update_instructions by @tinalenguyen in #5332
- fix: wait_for_participant waits until participant is fully active by @davidzhao in #5271
- feat: answering machine detection by @chenghao-mou in #4906
- feat: expose service_tier in CompletionUsage from OpenAI Responses API by @piyush-gambhir in #5341
- fix: add PARTICIPANT_KIND_CONNECTOR to default participant kinds by @anunaym14 in #5339
- feat/sarvam-llm-openai-compatible-integration by @dhruvladia-sarvam in #5069
- feat(azure-stt): Possibility to change segmentation options during a call by @rafallezanko in #5323
- fix(sarvam): sync missing API params, fix value ranges, and update models by @Namit1867 in #5347
- (xai tts): update fields and ws setup by @tinalenguyen in #5350
- fix(smallestai): add lightning-v3.1 endpoint routing by @sg-siddhant in #5330
- feat(inference): add debug/identification headers to inference requests by @adrian-cowham in #5337
- Move community plugins to livekit-plugins/community/ by @theomonnom in #5250
- feat: support per-response tools in generate_reply by @longcw in #5310
- fix xAI realtime update chat ctx by @longcw in #5320
- Fix RoomIO teardown listener cleanup by @sindarknave in #5357
- feat(mistral): support voxtral realtime streaming stt & modernize mistral plugin by @jeanprbt in #5289
- fix: say() with missing audio file hangs forever and blocks speech queue by @theomonnom in #5358
- add prompt_cache_retention chat completion option to inference by @s-hamdananwar in #5370
- Add Murf as optional dep by @royalfig in #5334
- feat(core): Support multiple provider keys in extra_content serialization by @adrian-cowham in #5374
- ci: add PyPI publish workflow with trusted publishing by @theomonnom in #5379
- feat: Add D-ID avatar plugin by @osimhi213 in #5232
- ci: fix tag checkout and discover glob by @theomonnom in #5381
- feat(rime): add mistv3 model support by @mcullan in #5298
- ci: fix update_versions.py invocation by @theomonnom i...
livekit-agents@1.5.1
Note
livekit-agents 1.5 introduced many new features. You can check out the changelog here.
What's Changed
- fix azure openai realtime support & add realtime models tests by @theomonnom in #5168
- fix(core): version mismatch due to bad merge by @chenghao-mou in #5176
- fix(turn-detector): relax transformers upper bound to allow 5.x by @gdoermann in #5174
- (gladia & soniox): add translation support by @tinalenguyen in #5148
- feat(agents): support LIVEKIT_OBSERVABILITY_URL for custom observability endpoints by @theomonnom in #5179
- (xai tts): update websocket endpoint by @tinalenguyen in #5180
- fix(core): restore chat topic support in room IO by @chenghao-mou in #5181
- Unskip Tool Call Items before Summarization in Task Group by @toubatbrian in #5169
- add sdk_version to SessionReport for observability by @theomonnom in #5182
- feat(hamming): add hamming monitoring plugin package by @duchammingai in #5135
- chore(mypy): enable mypy cache in type checking by @chenghao-mou in #5192
- fix: expose Chirp 3 google STT endpoint sensitivity by @karlsonlee-livekit in #5196
- add MCPToolset by @longcw in #5138
- Feat/personaplex plugin by @milanperovic in #4660
- fix: skip redundant realtime events in OpenAI plugin by @theomonnom in #5204
- feat: enable AGC by default on RoomInput audio by @theomonnom in #5185
- bump minimum livekit sdk version to 1.1.3 by @theomonnom in #5206
- livekit-agents 1.5.1 by @theomonnom in #5207
New Contributors
- @duchammingai made their first contribution in #5135
- @karlsonlee-livekit made their first contribution in #5196
- @milanperovic made their first contribution in #4660
Full Changelog: https://github.com/livekit/agents/compare/livekit-agents@1.5.0...livekit-agents@1.5.1
livekit-agents@1.5.0
Highlights
Preemptive generation is now enabled by default
Preemptive generation starts LLM and TTS inference before the end of a user's turn is detected, reducing overall latency.
To disable it:
```python
session = AgentSession(preemptive_generation=False)
```
Adaptive Interruption Handling
The headline feature of v1.5.0: an audio-based ML model that distinguishes genuine user interruptions from incidental sounds like backchannels ("mm-hmm"), coughs, sighs, or background noise. Enabled by default; no configuration needed.
Key stats:
- 86% precision and 100% recall at 500ms overlapping speech
- Rejects 51% of traditional VAD false positives
- Detects true interruptions 64% faster than VAD alone
- Inference completes in 30ms or less
When a false interruption is detected, the agent automatically resumes playback from where it left off; no re-generation needed.
To opt out and use VAD-only interruption:
```python
session = AgentSession(
    ...,
    turn_handling=TurnHandlingOptions(
        interruption={
            "mode": "vad",
        },
    ),
)
```
Blog post: https://livekit.com/blog/adaptive-interruption-handling
Dynamic Endpointing
Endpointing delays now adapt to each conversation's natural rhythm. Instead of a fixed silence threshold, the agent uses an exponential moving average of pause durations to dynamically adjust when it considers the user's turn complete.
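The adaptation described above can be sketched as an EMA over pause durations, clamped to the configured bounds. This is illustrative only, not the library's implementation; the alpha smoothing factor here mirrors the dynamic endpointing alpha parameter mentioned elsewhere in these notes:

```python
# Illustrative EMA sketch: blend each observed pause into a running
# average, then clamp the resulting delay to [min_delay, max_delay].
def make_dynamic_delay(alpha: float = 0.5,
                       min_delay: float = 0.3,
                       max_delay: float = 3.0):
    ema = None

    def update(pause_duration: float) -> float:
        nonlocal ema
        # Seed with the first observation, then blend subsequent ones.
        ema = pause_duration if ema is None else (
            alpha * pause_duration + (1 - alpha) * ema
        )
        return min(max(ema, min_delay), max_delay)

    return update

update = make_dynamic_delay()
delays = [round(update(p), 4) for p in (0.2, 0.5, 1.2, 0.8)]
print(delays)  # short pauses pull the delay down, long pauses push it up
```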
```python
session = AgentSession(
    ...,
    turn_handling=TurnHandlingOptions(
        endpointing={
            "mode": "dynamic",
            "min_delay": 0.3,
            "max_delay": 3.0,
        },
    ),
)
```
New TurnHandlingOptions API
Endpointing and interruption settings are now consolidated into a single TurnHandlingOptions dict passed to AgentSession. Old keyword arguments (min_endpointing_delay, allow_interruptions, etc.) still work but are deprecated and will emit warnings.
```python
session = AgentSession(
    turn_handling={
        "turn_detection": "vad",
        "endpointing": {"min_delay": 0.5, "max_delay": 3.0},
        "interruption": {"enabled": True, "mode": "adaptive"},
    },
)
```
Session Usage Tracking
New SessionUsageUpdatedEvent provides structured, per-model usage data (token counts, character counts, and audio durations) broken down by provider and model:
```python
@session.on("session_usage_updated")
def on_usage(ev: SessionUsageUpdatedEvent):
    for usage in ev.usage.model_usage:
        print(f"{usage.provider}/{usage.model}: {usage}")
```
Usage types: LLMModelUsage, TTSModelUsage, STTModelUsage, InterruptionModelUsage.
You can also access aggregated usage at any time via the session.usage property:
```python
usage = session.usage
for model_usage in usage.model_usage:
    print(model_usage)
```
Usage data is also included in SessionReport (via model_usage), so it's available in post-session telemetry and reporting out of the box.
Per-Turn Latency on ChatMessage.metrics
Each ChatMessage now carries a metrics field (MetricsReport) with per-turn latency data:
- transcription_delay: time to obtain the transcript after end of speech
- end_of_turn_delay: time between end of speech and the turn decision
- on_user_turn_completed_delay: time spent in the developer callback
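A minimal sketch of working with these fields. Only the field names come from the notes above; the MetricsReport class here is a hypothetical stand-in, and summing the delays is just one illustrative way to get a rough end-to-end figure:

```python
from dataclasses import dataclass

# Hypothetical stand-in for the real MetricsReport; field names are
# taken from the release notes, the class layout is an assumption.
@dataclass
class MetricsReport:
    transcription_delay: float
    end_of_turn_delay: float
    on_user_turn_completed_delay: float

    def total_turn_latency(self) -> float:
        # Sum the per-turn delays for a rough end-to-end latency figure.
        return (self.transcription_delay
                + self.end_of_turn_delay
                + self.on_user_turn_completed_delay)

report = MetricsReport(0.12, 0.45, 0.03)
print(round(report.total_turn_latency(), 2))
```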
Action-Aware Chat Context Summarization
Context summarization now includes function calls and their outputs when building summaries, preserving tool-use context across the conversation window.
Configurable Log Level
Set the agent log level via LIVEKIT_LOG_LEVEL environment variable or through ServerOptions, without touching your code.
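For example (the variable name comes from the notes above; "debug" is an assumed example value, and setting it in-process only works if done before the agent reads it at startup):

```python
import os

# Illustrative: export LIVEKIT_LOG_LEVEL before the worker starts, or
# set it in-process early. "debug" is an assumed level value.
os.environ["LIVEKIT_LOG_LEVEL"] = "debug"
print(os.environ["LIVEKIT_LOG_LEVEL"])
```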
Deprecations
| Deprecated | Replacement | Notes |
|---|---|---|
| metrics_collected event | session_usage_updated event + ChatMessage.metrics | Usage/cost data moves to session_usage_updated; per-turn latency moves to ChatMessage.metrics. Old listeners still work with a deprecation warning. |
| UsageCollector | ModelUsageCollector | New collector supports per-model/provider breakdown |
| UsageSummary | LLMModelUsage, TTSModelUsage, STTModelUsage | Typed per-service usage classes |
| RealtimeModelBeta | RealtimeModel | Beta API removed |
| AgentFalseInterruptionEvent.message / .extra_instructions | Automatic resume via adaptive interruption | Accessing these fields logs a deprecation warning |
| AgentSession kwargs: min_endpointing_delay, max_endpointing_delay, allow_interruptions, discard_audio_if_uninterruptible, min_interruption_duration, min_interruption_words, turn_detection, false_interruption_timeout, resume_false_interruption | turn_handling=TurnHandlingOptions(...) | Old kwargs still work but emit deprecation warnings. Will be removed in v2.0. |
| Agent / AgentTask kwargs: turn_detection, min_endpointing_delay, max_endpointing_delay, allow_interruptions | turn_handling=TurnHandlingOptions(...) | Same migration path as AgentSession. Will be removed in future versions. |
Complete changelog
- (xai): add grok text to speech api to readme by @tinalenguyen in #5125
- Remove Gemini 2.0 models from inference gateway types by @Shubhrakanti in #5133
- feat: support log level via ServerOptions and LIVEKIT_LOG_LEVEL env var by @onurburak9 in #5112
- fix: preserve 'type' field in TaskGroup JSON schema enum items by @weiguangli-io in #5073
- feat(assemblyai): expose session ID from Begin event by @dlange-aai in #5132
- fix: strip empty {} entries from anyOf/oneOf in strict JSON schema by @theomonnom in #5137
- fix: update_instructions() now reflected in tool call response generation by @weiguangli-io in #5072
- Make chat context summarization action-aware by @toubatbrian in #5099
- fix(realtime): sync remote items to local chat_ctx with placeholders to prevent in-flight deletion by @longcw in #5114
- Set _speech_start_time when VAD START_OF_SPEECH activates by @hudson-worden in #5027
- Fix(inworld): "Context not found" errors caused by invalid enum parameter types by @ianbbqzy in #5153
- increase generate_reply timeout & remove RealtimeModelBeta by @theomonnom in #5149
- add livekit-blockguard plugin by @theomonnom in #5023
- openai: add max_completion_tokens to with_azure() by @abhishekranjan-bluemachines in #5143
- Restrict mistralai dependency to use v1 sdk by @csanz91 in #5116
- feat(assemblyai): add DEBUG-level diagnostic logging by @dlange-aai in #5146
- Fix Phonic generate_reply to resolve with the current GenerationCreatedEvent by @qionghuang6 in #5147
- fix(11labs): add empty keepalive message and remove final duplicates by @chenghao-mou in #5139
- AGT-2182: Add adaptive interruption handling and dynamic endpointing by @chenghao-mou in #4771
- livekit-agents 1.5.0 by @theomonnom in #5165
New Contributors
- @onurburak9 made their first contribution in #5112
- @weiguangli-io made their first contribution in #5073
- @abhishekranjan-bluemachines made their first contribution in #5143
- @csanz91 made their first contribution in #5116
Full Changelog: https://github.com/livekit/agents/compare/livekit-agents@1.4.6...livekit-agents@1.5.0
livekit-agents@1.4.6
What's Changed
- fix(types): replace TypeGuard with TypeIs in is_given for bidirectional narrowing by @longcw in #5079
- [inworld] websocket _recv_loop to flush the audio immediately by @ianbbqzy in #5071
- fix: include null in enum array for nullable enum schemas by @MSameerAbbas in #5080
- (openai chat completions): drop reasoning_effort when function tools are present by @tinalenguyen in #5088
- (google realtime): replace deprecated mediaChunks by @tinalenguyen in #5089
- fix: omit required field in tool schema when function has no parameters by @longcw in #5082
- fix(sarvam-tts): correct mime_type from audio/mp3 to audio/wav by @shmundada93 in #5086
- add trunk_config to WarmTransferTask for SIP endpoint transfers by @longcw in #5016
- healthcare example by @tinalenguyen in #5031
- fix(openai): only reuse previous_response_id when pending tool calls are completed by @longcw in #5094
- feat(assemblyai): add speaker diarization support by @dlange-aai in #5074
- fix: prevent _cancel_speech_pause from poisoning subsequent user turns by @giulio-leone in #5101
- feat(google): support universal credential types in STT and TTS credentials_file by @rafallezanko in #5056
- Add Murf AI - TTS Plugin Support by @gaurav-murf in #3000
- feat(voice): add callable TextTransforms support with built-in replace transform by @longcw in #5104
- fix(eou): only reset speech/speaking time when no new speech by @chenghao-mou in #5083
- (xai): add tts by @tinalenguyen in #5120
- (xai tts): add language parameter by @tinalenguyen in #5122
- livekit-agents 1.4.6 by @theomonnom in #5123
New Contributors
- @shmundada93 made their first contribution in #5086
- @dlange-aai made their first contribution in #5074
- @gaurav-murf made their first contribution in #3000
Full Changelog: https://github.com/livekit/agents/compare/livekit-agents@1.4.5...livekit-agents@1.4.6