feat(instrumentation): Add litellm package instrumentation by nina-kollman · Pull Request #4319 · traceloop/openllmetry

nina-kollman · 2026-06-25T09:47:12Z

Fixes: TLP-2379

I have added tests that cover my changes.
If adding a new instrumentation or changing an existing one, I've added screenshots from some observability platform showing the change.
PR name follows conventional commits format: feat(instrumentation): ... or fix(instrumentation): ....
(If applicable) I have updated the documentation accordingly.

Summary by CodeRabbit

New Features
- Added LiteLLM observability support with spans, events, metrics, and streaming coverage for chat and embedding calls.
- Added provider-name normalization and richer prompt/response details, including tool-call handling and async support.
- Added LiteLLM examples for synchronous and async tool-calling workflows.
Bug Fixes
- Improved span finalization for streaming flows so traces close correctly on early exits, errors, and cleanup.
Documentation
- Added setup and usage guidance for the new LiteLLM instrumentation.

coderabbitai · 2026-06-25T09:47:21Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

📝 Walkthrough

Walkthrough

Introduces a new opentelemetry-instrumentation-litellm package that wraps LiteLLM completion, acompletion, embedding, and aembedding to emit gen_ai.* spans, metrics, and OTel events. Adds streaming proxy classes for exactly-once span finalization, wires LiteLLMInstrumentor into the Traceloop SDK, and includes tests plus sample tool-calling apps.

LiteLLM OpenTelemetry instrumentation

Layer / File(s)	Summary
Contracts and feature flags `opentelemetry/instrumentation/litellm/config.py`, `.../event_models.py`, `.../utils.py`, `.../version.py`	`Config` class, `MessageEvent`/`ChoiceEvent`/`ToolCall` TypedDicts, `should_send_prompts`, `is_metrics_enabled`, `should_emit_events`, `dont_throw`, and `__version__` are defined as shared foundation types used throughout the instrumentation.
Request/response attribute capture and event emission `opentelemetry/instrumentation/litellm/__init__.py` (lines 1–515), `.../event_emitter.py`	Declares `WRAPPED_METHODS`, resolves provider names from model prefixes or `custom_llm_provider`, records request/response span attributes (model, streaming, tokens, finish reasons, serialized messages with tool calls), emits structured OTel `LogRecord` events, and builds histogram/counter metrics instruments.
Streaming accumulation and finalization `opentelemetry/instrumentation/litellm/__init__.py` (lines 520–837)	Accumulates per-chunk deltas into indexed choice/tool-call state, reconstructs final `ModelResponse`, and wraps sync/async streams with `_StreamWrapper`/`_AsyncStreamWrapper` to guarantee exactly-once span finalization on normal exhaustion, early break, consumer error, `close`/`aclose`, and `__del__`.
LiteLLMInstrumentor, SDK wiring, and package config `opentelemetry/instrumentation/litellm/__init__.py` (lines 839–889), `traceloop/sdk/tracing/tracing.py`, `traceloop/sdk/instruments.py`, `pyproject.toml`, `project.json`, `README.md`	`LiteLLMInstrumentor` configures `Config`, creates tracer/meter/event-logger, and wraps/unwraps methods. SDK gains `Instruments.LITELLM`, `init_litellm_instrumentor`, and a span-attribute unfreeze patch for postprocess callbacks. Package metadata, Nx targets, and docs are also added.
Baseline tracing, event, and metrics tests `tests/conftest.py`, `tests/test_completion.py`, `tests/test_embeddings.py`, `tests/test_events.py`, `tests/test_provider_resolution.py`, `tests/test_custom_provider.py`	In-memory OTel fixture setup; covers sync/async completion and embedding spans, serialized input/output messages with tool calls, finish-reason normalization and deduplication, metrics emission, prompt-content suppression, OTel choice events with tool calls, provider-prefix resolution, and custom-provider suppression key verification.
Streaming and nesting tests `tests/test_streaming.py`, `tests/test_nesting.py`	Verifies streaming span lifecycle (normal, early break, consumer error, async), `CustomStreamWrapper` preservation, nested re-entrant LiteLLM call suppression (single span), and coexistence with OpenAI instrumentation.
Sample tool-calling workflows `packages/sample-app/sample_app/litellm_completion.py`, `.../litellm_async_streaming.py`	Synchronous and async agentic loop examples using LiteLLM tool calling with Traceloop initialization, local tool dispatch, and streamed final answers.

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

traceloop/openllmetry#4133: Threads the SDK-level use_attributes/use_legacy_attributes setting into the LiteLLM instrumentor's use_legacy_attributes config, which is the same flag wired by this PR.

Suggested reviewers

doronkopit5
max-deygin-traceloop

🐇 A brand new package hops into the fold,
Wrapping LiteLLM in spans of gold.
Streaming chunks accumulate with care,
Tool calls and finish reasons laid bare.
From litellm.chat to metrics galore—
This bunny traces every LLM score! 🎉

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 16.39% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly summarizes the main change: adding LiteLLM package instrumentation.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch nk/litellm_instrumentation

_{Comment @coderabbitai help to get the list of available commands.}

pyproject declares `readme = "README.md"`, but the file was absent, so the hatchling editable build failed with `OSError: Readme file does not exist`, breaking the Lint, Build, and Test CI jobs at the install step. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 8

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

packages/opentelemetry-instrumentation-litellm/tests/test_nesting.py (1)

72-87: 🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win

Make uninstrument cleanup robust to partial setup failures.

At Line 74-76, both .instrument(...) calls happen before the try. If Line 75 raises, Line 86 won’t run for the already-instrumented OpenAI instrumentor. Move instrumentation inside try (or track success flags) to guarantee teardown.

Suggested fix

     openai_instrumentor = OpenAIInstrumentor()
     litellm_instrumentor = LiteLLMInstrumentor()
-    openai_instrumentor.instrument(tracer_provider=tracer_provider, meter_provider=meter_provider)
-    litellm_instrumentor.instrument(tracer_provider=tracer_provider, meter_provider=meter_provider)
     try:
+        openai_instrumentor.instrument(
+            tracer_provider=tracer_provider, meter_provider=meter_provider
+        )
+        litellm_instrumentor.instrument(
+            tracer_provider=tracer_provider, meter_provider=meter_provider
+        )
         litellm.completion(
             model="gpt-3.5-turbo",
             messages=MESSAGES,
             mock_response="hello",
         )
         spans = span_exporter.get_finished_spans()
         assert [span.name for span in spans] == ["litellm.chat"]
     finally:
         litellm_instrumentor.uninstrument()
         openai_instrumentor.uninstrument()

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/opentelemetry-instrumentation-litellm/tests/test_nesting.py` around
lines 72 - 87, The test cleanup in test_nesting.py is not robust if one
instrument() call fails after the other has already succeeded. Move the
OpenAIInstrumentor.instrument and LiteLLMInstrumentor.instrument setup inside
the try block in test_nesting, or track whether each instrumentor was
successfully enabled before calling uninstrument in finally, so partial setup
failures still guarantee teardown.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@packages/opentelemetry-instrumentation-litellm/opentelemetry/instrumentation/litellm/__init__.py`:
- Around line 696-705: Make LiteLLMInstrumentor.__init__() side-effect free: it
currently mutates process-wide Config state (exception_logger,
use_legacy_attributes, and get_common_metrics_attributes) as soon as the
instrumentor is constructed. Move those Config assignments out of __init__ and
into the actual instrumentation path (or guard them so they only run when
instrumentation is applied), so creating a new LiteLLMInstrumentor before the
is_instrumented_by_opentelemetry check does not change already-installed
wrappers or swap callbacks unexpectedly.
- Around line 115-122: The system resolution in _resolve_system_from_kwargs() is
returning raw LiteLLM provider prefixes instead of the published gen_ai.system
values. Update this helper to normalize aliases like azure and bedrock to the
constants used in opentelemetry.semconv_ai, and make sure the same normalized
value is what _start_span(), _record_metrics(), and _record_exception_metric()
use when custom_llm_provider is absent. Keep the fallback behavior for unknown
providers, but ensure the returned system key matches the repo’s semantic
convention mapping.

In `@packages/opentelemetry-instrumentation-litellm/tests/conftest.py`:
- Around line 71-83: The fixture setup leaks TRACELOOP_TRACE_CONTENT if
LiteLLMInstrumentor.instrument() fails before the yield, because the cleanup
after yield is skipped. Update the conftest.py fixture around
LiteLLMInstrumentor.instrument to guarantee cleanup with a try/finally block or
monkeypatch so os.environ.pop(TRACELOOP_TRACE_CONTENT, None) always runs even
when setup raises, and keep instrumentor.uninstrument() in the same guaranteed
cleanup path.

In `@packages/sample-app/sample_app/litellm_async_streaming.py`:
- Around line 173-178: The tool execution loop in litellm_async_streaming should
not abort when tc["arguments"] is malformed or when impl(**args) receives
invalid model-supplied parameters. Update the tool-call handling around the
tool_calls loop to catch JSON parsing and invocation errors, and convert them
into structured tool error messages (similar to the unknown tool fallback) so
tool_messages can still be returned and the final model turn can recover.
- Around line 110-118: The calculate() helper still uses eval(), and the current
character whitelist does not prevent expensive expressions like exponentiation
from model-controlled tool args. Replace the eval-based execution in calculate()
with a bounded AST-based parser that only allows the intended arithmetic
nodes/operators, and add explicit length/node limits so the expression is
rejected before evaluation if it is too large or complex.
- Around line 129-148: The _consume_stream function assumes every streamed chunk
has at least one choice, but the usage-only metadata chunk can have an empty
choices array and crash on chunk.choices[0]. Update _consume_stream in
litellm_async_streaming.py to guard against empty chunk.choices before accessing
delta, skipping those chunks while still accumulating content and tool_calls
from normal chunks.

In `@packages/sample-app/sample_app/litellm_completion.py`:
- Around line 153-158: The tool-call loop in litellm_completion should not abort
when tc.function.arguments is malformed or when impl(**args) receives bad
model-supplied parameters. Update the tool processing in the tool_calls loop to
catch json.loads and tool invocation failures, then return an error
payload/message for that specific tool call instead of raising, so the loop can
continue processing remaining calls. Use the existing tool registry flow around
TOOL_REGISTRY.get, tc.function.name, and the result assignment to centralize the
error handling.
- Line 16: The sample app dependency note only mentions OPENAI_API_KEY, but the
default Traceloop.init() flow also requires TRACELOOP_API_KEY for spans to be
emitted. Update the requirement/documentation in litellm_completion.py to
include TRACELOOP_API_KEY alongside OPENAI_API_KEY so the sample clearly states
all needed credentials. Refer to Traceloop.init and the existing “Requires” note
when making the change.

---

Outside diff comments:
In `@packages/opentelemetry-instrumentation-litellm/tests/test_nesting.py`:
- Around line 72-87: The test cleanup in test_nesting.py is not robust if one
instrument() call fails after the other has already succeeded. Move the
OpenAIInstrumentor.instrument and LiteLLMInstrumentor.instrument setup inside
the try block in test_nesting, or track whether each instrumentor was
successfully enabled before calling uninstrument in finally, so partial setup
failures still guarantee teardown.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 4b11ec10-2133-4153-9039-151f86f44fb6

📥 Commits

Reviewing files that changed from the base of the PR and between fb292d0 and a8ad748.

⛔ Files ignored due to path filters (3)

packages/opentelemetry-instrumentation-litellm/uv.lock is excluded by !**/*.lock
packages/sample-app/uv.lock is excluded by !**/*.lock
packages/traceloop-sdk/uv.lock is excluded by !**/*.lock

📒 Files selected for processing (21)

packages/opentelemetry-instrumentation-litellm/opentelemetry/instrumentation/litellm/__init__.py
packages/opentelemetry-instrumentation-litellm/opentelemetry/instrumentation/litellm/config.py
packages/opentelemetry-instrumentation-litellm/opentelemetry/instrumentation/litellm/event_emitter.py
packages/opentelemetry-instrumentation-litellm/opentelemetry/instrumentation/litellm/event_models.py
packages/opentelemetry-instrumentation-litellm/opentelemetry/instrumentation/litellm/utils.py
packages/opentelemetry-instrumentation-litellm/opentelemetry/instrumentation/litellm/version.py
packages/opentelemetry-instrumentation-litellm/project.json
packages/opentelemetry-instrumentation-litellm/pyproject.toml
packages/opentelemetry-instrumentation-litellm/pytest.ini
packages/opentelemetry-instrumentation-litellm/tests/conftest.py
packages/opentelemetry-instrumentation-litellm/tests/test_completion.py
packages/opentelemetry-instrumentation-litellm/tests/test_custom_provider.py
packages/opentelemetry-instrumentation-litellm/tests/test_embeddings.py
packages/opentelemetry-instrumentation-litellm/tests/test_nesting.py
packages/opentelemetry-instrumentation-litellm/tests/test_streaming.py
packages/opentelemetry-semantic-conventions-ai/opentelemetry/semconv_ai/__init__.py
packages/sample-app/sample_app/litellm_async_streaming.py
packages/sample-app/sample_app/litellm_completion.py
packages/traceloop-sdk/pyproject.toml
packages/traceloop-sdk/traceloop/sdk/instruments.py
packages/traceloop-sdk/traceloop/sdk/tracing/tracing.py

coderabbitai · 2026-06-25T12:58:36Z

+    def __init__(
+        self,
+        exception_logger=None,
+        use_legacy_attributes=True,
+        get_common_metrics_attributes=lambda: {},
+    ):
+        super().__init__()
+        Config.exception_logger = exception_logger
+        Config.use_legacy_attributes = use_legacy_attributes
+        Config.get_common_metrics_attributes = get_common_metrics_attributes


🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win

Make LiteLLMInstrumentor.__init__() side-effect free.

These assignments mutate process-wide Config immediately. Because packages/traceloop-sdk/traceloop/sdk/tracing/tracing.py, Lines 807-812 constructs a new LiteLLMInstrumentor before checking is_instrumented_by_opentelemetry, a second init call can silently flip already-installed wrappers between legacy-attribute mode and event mode, or replace the common metric attributes callback, without re-instrumenting anything.

Suggested fix

class LiteLLMInstrumentor(BaseInstrumentor): @@ def __init__( self, exception_logger=None, use_legacy_attributes=True, get_common_metrics_attributes=lambda: {}, ): super().__init__() - Config.exception_logger = exception_logger - Config.use_legacy_attributes = use_legacy_attributes - Config.get_common_metrics_attributes = get_common_metrics_attributes + self._exception_logger = exception_logger + self._use_legacy_attributes = use_legacy_attributes + self._get_common_metrics_attributes = get_common_metrics_attributes @@ def _instrument(self, **kwargs): + Config.exception_logger = self._exception_logger + Config.use_legacy_attributes = self._use_legacy_attributes + Config.get_common_metrics_attributes = self._get_common_metrics_attributes + tracer_provider = kwargs.get("tracer_provider") tracer = get_tracer(__name__, __version__, tracer_provider)

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

def __init__(

self,

exception_logger=None,

use_legacy_attributes=True,

get_common_metrics_attributes=lambda: {},

):

super().__init__()

Config.exception_logger = exception_logger

Config.use_legacy_attributes = use_legacy_attributes

Config.get_common_metrics_attributes = get_common_metrics_attributes

def __init__(

self,

exception_logger=None,

use_legacy_attributes=True,

get_common_metrics_attributes=lambda: {},

):

super().__init__()

self._exception_logger = exception_logger

self._use_legacy_attributes = use_legacy_attributes

self._get_common_metrics_attributes = get_common_metrics_attributes

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/opentelemetry-instrumentation-litellm/opentelemetry/instrumentation/litellm/__init__.py` around lines 696 - 705, Make LiteLLMInstrumentor.__init__() side-effect free: it currently mutates process-wide Config state (exception_logger, use_legacy_attributes, and get_common_metrics_attributes) as soon as the instrumentor is constructed. Move those Config assignments out of __init__ and into the actual instrumentation path (or guard them so they only run when instrumentation is applied), so creating a new LiteLLMInstrumentor before the is_instrumented_by_opentelemetry check does not change already-installed wrappers or swap callbacks unexpectedly.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/opentelemetry-instrumentation-litellm/README.md`:
- Around line 3-5: The README badge image is missing alt text, which triggers
markdownlint and hurts accessibility. Update the badge markup in the README so
the img element used for the PyPI badge includes descriptive alt text that
identifies the package badge, keeping the link and image placement unchanged.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b8e25a91-898c-42db-9f11-7407c9c3d20f

📥 Commits

Reviewing files that changed from the base of the PR and between a8ad748 and 879d384.

📒 Files selected for processing (1)

packages/opentelemetry-instrumentation-litellm/README.md

coderabbitai · 2026-06-25T13:01:11Z

+<a href="https://pypi.org/project/opentelemetry-instrumentation-litellm/">
+    <img src="https://badge.fury.io/py/opentelemetry-instrumentation-litellm.svg">
+</a>


📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Add alt text to the badge image.

This trips markdownlint and leaves the badge inaccessible to screen readers.

♿ Proposed fix

<a href="https://pypi.org/project/opentelemetry-instrumentation-litellm/"> - <img src="https://badge.fury.io/py/opentelemetry-instrumentation-litellm.svg"> + <img + src="https://badge.fury.io/py/opentelemetry-instrumentation-litellm.svg" + alt="PyPI version" + > </a>

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

<a href="https://pypi.org/project/opentelemetry-instrumentation-litellm/">

<img src="https://badge.fury.io/py/opentelemetry-instrumentation-litellm.svg">

</a>

<a href="https://pypi.org/project/opentelemetry-instrumentation-litellm/">

<img

src="https://badge.fury.io/py/opentelemetry-instrumentation-litellm.svg"

alt="PyPI version"

>

</a>

🧰 Tools

🪛 markdownlint-cli2 (0.22.1)

[warning] 4-4: Images should have alternate text (alt text)

(MD045, no-alt-text)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/opentelemetry-instrumentation-litellm/README.md` around lines 3 - 5, The README badge image is missing alt text, which triggers markdownlint and hurts accessibility. Update the badge markup in the README so the img element used for the PyPI badge includes descriptive alt text that identifies the package badge, keeping the link and image placement unchanged.

Source: Linters/SAST tools

Bring the LiteLLM instrumentation up to the same semconv level as the openai/anthropic packages: - emit gen_ai.input.messages / gen_ai.output.messages (parts-based JSON schema) instead of the retired flat gen_ai.prompt.* / gen_ai.completion.* attributes - set gen_ai.response.finish_reasons (top-level, ungated, deduped) and map provider finish reasons to the OTel vocabulary (tool_calls -> tool_call); per-message finish_reason always a string - carry tool_calls + mapped finish_reason into gen_ai.choice events - replace deprecated gen_ai.system with gen_ai.provider.name and llm.request.type with gen_ai.operation.name, using upstream well-known values (fixes azure.ai.openai, splits gcp.vertex_ai vs gcp.gemini) - drop the now-unused GenAISystem.LITELLM enum member Tests updated to assert the JSON schema, provider/operation attributes, and finish-reason mapping/dedup; added event-mode tool_calls coverage. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

CLAassistant · 2026-06-28T09:07:36Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ nina-kollman
❌ Gal Zilberstein

Gal Zilberstein seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
_{You have signed the CLA already but the status is still pending? Let us recheck it.}

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

packages/opentelemetry-instrumentation-litellm/tests/test_events.py (1)
13-36: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

This test doesn't actually cover the use_legacy_attributes=False branch.

_emit_choice_events() is invoked directly here, and the supplied implementation for packages/opentelemetry-instrumentation-litellm/opentelemetry/instrumentation/litellm/__init__.py:427-441 does not read Config.use_legacy_attributes. That means Lines 13-36 still pass even if the public-path gate for event mode regresses. Either drop the config-specific framing from this test or drive the public instrumentation path that consults the flag.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/opentelemetry-instrumentation-litellm/tests/test_events.py` around
lines 13 - 36, The test is incorrectly framed as covering
Config.use_legacy_attributes=False because it calls _emit_choice_events()
directly, which bypasses the public path that checks the flag. Either remove the
config-specific assertion from test_events.py or rewrite the test to exercise
the public instrumentation entrypoint in
opentelemetry/instrumentation/litellm/__init__.py that actually consults
Config.use_legacy_attributes, so the test truly validates the branch.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@packages/opentelemetry-instrumentation-litellm/opentelemetry/instrumentation/litellm/__init__.py`:
- Around line 583-597: The _finalize() path in the litellm instrumentation can
leak spans if _model_as_dict(), _handle_response(), or _record_metrics() raises
before span.end() is reached. Update _finalize() so the finalization work is
wrapped defensively and span.end() always executes, while preserving the
existing provider resolution, GenAIAttributes.GEN_AI_PROVIDER_NAME setting, and
Status(StatusCode.OK) behavior when the span is recording.

---

Nitpick comments:
In `@packages/opentelemetry-instrumentation-litellm/tests/test_events.py`:
- Around line 13-36: The test is incorrectly framed as covering
Config.use_legacy_attributes=False because it calls _emit_choice_events()
directly, which bypasses the public path that checks the flag. Either remove the
config-specific assertion from test_events.py or rewrite the test to exercise
the public instrumentation entrypoint in
opentelemetry/instrumentation/litellm/__init__.py that actually consults
Config.use_legacy_attributes, so the test truly validates the branch.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1ac59e8b-e88a-4f4b-8712-089c075c15a1

📥 Commits

Reviewing files that changed from the base of the PR and between 1793ab9 and 1860c82.

📒 Files selected for processing (8)

packages/opentelemetry-instrumentation-litellm/opentelemetry/instrumentation/litellm/__init__.py
packages/opentelemetry-instrumentation-litellm/opentelemetry/instrumentation/litellm/event_emitter.py
packages/opentelemetry-instrumentation-litellm/tests/test_completion.py
packages/opentelemetry-instrumentation-litellm/tests/test_custom_provider.py
packages/opentelemetry-instrumentation-litellm/tests/test_embeddings.py
packages/opentelemetry-instrumentation-litellm/tests/test_events.py
packages/opentelemetry-instrumentation-litellm/tests/test_provider_resolution.py
packages/opentelemetry-instrumentation-litellm/tests/test_streaming.py

🚧 Files skipped from review as they are similar to previous changes (5)

packages/opentelemetry-instrumentation-litellm/tests/test_embeddings.py
packages/opentelemetry-instrumentation-litellm/tests/test_streaming.py
packages/opentelemetry-instrumentation-litellm/opentelemetry/instrumentation/litellm/event_emitter.py
packages/opentelemetry-instrumentation-litellm/tests/test_custom_provider.py
packages/opentelemetry-instrumentation-litellm/tests/test_completion.py

coderabbitai · 2026-06-28T09:11:45Z

+def _finalize(span, metrics, event_logger, request_type, response, model, kwargs, duration, accumulated_provider=None):
+    provider = (
+        accumulated_provider
+        or _resolve_provider_from_response(response)
+        or _resolve_provider_from_kwargs(kwargs)
+    )
+    # Serialize the response once and reuse it for attributes, events, and metrics —
+    # model_dump() on a ModelResponse is the most expensive step in the wrapper.
+    response_dict = _model_as_dict(response)
+    _set_span_attribute(span, GenAIAttributes.GEN_AI_PROVIDER_NAME, provider)
+    _handle_response(span, request_type, response_dict, event_logger)
+    _record_metrics(metrics, provider, model, duration, response_dict, request_type)
+    if span.is_recording():
+        span.set_status(Status(StatusCode.OK))
+    span.end()


🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

Guarantee span closure when finalization work fails.

_finalize() ends the span only after response serialization, response/event handling, and metrics recording all succeed. If any instrumentation-only step raises, the span leaks and the successful LiteLLM response path can be turned into an application error. Wrap finalization work so span.end() always runs.

Suggested fix

def _finalize(span, metrics, event_logger, request_type, response, model, kwargs, duration, accumulated_provider=None): - provider = ( - accumulated_provider - or _resolve_provider_from_response(response) - or _resolve_provider_from_kwargs(kwargs) - ) - # Serialize the response once and reuse it for attributes, events, and metrics — - # model_dump() on a ModelResponse is the most expensive step in the wrapper. - response_dict = _model_as_dict(response) - _set_span_attribute(span, GenAIAttributes.GEN_AI_PROVIDER_NAME, provider) - _handle_response(span, request_type, response_dict, event_logger) - _record_metrics(metrics, provider, model, duration, response_dict, request_type) - if span.is_recording(): - span.set_status(Status(StatusCode.OK)) - span.end() + try: + provider = ( + accumulated_provider + or _resolve_provider_from_response(response) + or _resolve_provider_from_kwargs(kwargs) + ) + # Serialize the response once and reuse it for attributes, events, and metrics — + # model_dump() on a ModelResponse is the most expensive step in the wrapper. + response_dict = _model_as_dict(response) + _set_span_attribute(span, GenAIAttributes.GEN_AI_PROVIDER_NAME, provider) + _handle_response(span, request_type, response_dict, event_logger) + _record_metrics(metrics, provider, model, duration, response_dict, request_type) + if span.is_recording(): + span.set_status(Status(StatusCode.OK)) + finally: + span.end()

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

def _finalize(span, metrics, event_logger, request_type, response, model, kwargs, duration, accumulated_provider=None):

provider = (

accumulated_provider

or _resolve_provider_from_response(response)

or _resolve_provider_from_kwargs(kwargs)

)

# Serialize the response once and reuse it for attributes, events, and metrics —

# model_dump() on a ModelResponse is the most expensive step in the wrapper.

response_dict = _model_as_dict(response)

_set_span_attribute(span, GenAIAttributes.GEN_AI_PROVIDER_NAME, provider)

_handle_response(span, request_type, response_dict, event_logger)

_record_metrics(metrics, provider, model, duration, response_dict, request_type)

if span.is_recording():

span.set_status(Status(StatusCode.OK))

span.end()

def _finalize(span, metrics, event_logger, request_type, response, model, kwargs, duration, accumulated_provider=None):

try:

provider = (

accumulated_provider

or _resolve_provider_from_response(response)

or _resolve_provider_from_kwargs(kwargs)

)

# Serialize the response once and reuse it for attributes, events, and metrics —

# model_dump() on a ModelResponse is the most expensive step in the wrapper.

response_dict = _model_as_dict(response)

_set_span_attribute(span, GenAIAttributes.GEN_AI_PROVIDER_NAME, provider)

_handle_response(span, request_type, response_dict, event_logger)

_record_metrics(metrics, provider, model, duration, response_dict, request_type)

if span.is_recording():

span.set_status(Status(StatusCode.OK))

finally:

span.end()

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/opentelemetry-instrumentation-litellm/opentelemetry/instrumentation/litellm/__init__.py` around lines 583 - 597, The _finalize() path in the litellm instrumentation can leak spans if _model_as_dict(), _handle_response(), or _record_metrics() raises before span.end() is reached. Update _finalize() so the finalization work is wrapped defensively and span.end() always executes, while preserving the existing provider resolution, GenAIAttributes.GEN_AI_PROVIDER_NAME setting, and Status(StatusCode.OK) behavior when the span is recording.

init instrumentation

b31d771

nina-kollman added 3 commits June 25, 2026 11:54

add sample app

c3a70ba

wip

6331125

added tool to input

8711cdc

nina-kollman changed the title ~~init instrumentation~~ feat(instrumentation): Add litellm package instrumentation Jun 25, 2026

nina-kollman added 2 commits June 25, 2026 14:25

simplify

066e140

pr review

a8ad748

nina-kollman marked this pull request as ready for review June 25, 2026 12:46

coderabbitai Bot reviewed Jun 25, 2026

View reviewed changes

nina-kollman and others added 3 commits June 25, 2026 15:13

comments

7d38919

try trace change

1793ab9

coderabbitai Bot reviewed Jun 28, 2026

View reviewed changes

Uh oh!

Conversation

nina-kollman commented Jun 25, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Estimated code review effort

Possibly related PRs

Suggested reviewers

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot Jun 25, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 25, 2026

Choose a reason for hiding this comment

Uh oh!

CLAassistant commented Jun 28, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 28, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

nina-kollman commented Jun 25, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 25, 2026 •

edited

Loading