Skip to content

feat(instrumentation): Add litellm package instrumentation#4319

Open
nina-kollman wants to merge 10 commits into
mainfrom
nk/litellm_instrumentation
Open

feat(instrumentation): Add litellm package instrumentation#4319
nina-kollman wants to merge 10 commits into
mainfrom
nk/litellm_instrumentation

Conversation

@nina-kollman

@nina-kollman nina-kollman commented Jun 25, 2026

Copy link
Copy Markdown
Contributor

Fixes: TLP-2379

image
  • I have added tests that cover my changes.
  • If adding a new instrumentation or changing an existing one, I've added screenshots from some observability platform showing the change.
  • PR name follows conventional commits format: feat(instrumentation): ... or fix(instrumentation): ....
  • (If applicable) I have updated the documentation accordingly.

Summary by CodeRabbit

  • New Features

    • Added LiteLLM observability support with spans, events, metrics, and streaming coverage for chat and embedding calls.
    • Added provider-name normalization and richer prompt/response details, including tool-call handling and async support.
    • Added LiteLLM examples for synchronous and async tool-calling workflows.
  • Bug Fixes

    • Improved span finalization for streaming flows so traces close correctly on early exits, errors, and cleanup.
  • Documentation

    • Added setup and usage guidance for the new LiteLLM instrumentation.

@coderabbitai

coderabbitai Bot commented Jun 25, 2026

Copy link
Copy Markdown

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Introduces a new opentelemetry-instrumentation-litellm package that wraps LiteLLM completion, acompletion, embedding, and aembedding to emit gen_ai.* spans, metrics, and OTel events. Adds streaming proxy classes for exactly-once span finalization, wires LiteLLMInstrumentor into the Traceloop SDK, and includes tests plus sample tool-calling apps.

LiteLLM OpenTelemetry instrumentation

Layer / File(s) Summary
Contracts and feature flags
opentelemetry/instrumentation/litellm/config.py, .../event_models.py, .../utils.py, .../version.py
Config class, MessageEvent/ChoiceEvent/ToolCall TypedDicts, should_send_prompts, is_metrics_enabled, should_emit_events, dont_throw, and __version__ are defined as shared foundation types used throughout the instrumentation.
Request/response attribute capture and event emission
opentelemetry/instrumentation/litellm/__init__.py (lines 1–515), .../event_emitter.py
Declares WRAPPED_METHODS, resolves provider names from model prefixes or custom_llm_provider, records request/response span attributes (model, streaming, tokens, finish reasons, serialized messages with tool calls), emits structured OTel LogRecord events, and builds histogram/counter metrics instruments.
Streaming accumulation and finalization
opentelemetry/instrumentation/litellm/__init__.py (lines 520–837)
Accumulates per-chunk deltas into indexed choice/tool-call state, reconstructs final ModelResponse, and wraps sync/async streams with _StreamWrapper/_AsyncStreamWrapper to guarantee exactly-once span finalization on normal exhaustion, early break, consumer error, close/aclose, and __del__.
LiteLLMInstrumentor, SDK wiring, and package config
opentelemetry/instrumentation/litellm/__init__.py (lines 839–889), traceloop/sdk/tracing/tracing.py, traceloop/sdk/instruments.py, pyproject.toml, project.json, README.md
LiteLLMInstrumentor configures Config, creates tracer/meter/event-logger, and wraps/unwraps methods. SDK gains Instruments.LITELLM, init_litellm_instrumentor, and a span-attribute unfreeze patch for postprocess callbacks. Package metadata, Nx targets, and docs are also added.
Baseline tracing, event, and metrics tests
tests/conftest.py, tests/test_completion.py, tests/test_embeddings.py, tests/test_events.py, tests/test_provider_resolution.py, tests/test_custom_provider.py
In-memory OTel fixture setup; covers sync/async completion and embedding spans, serialized input/output messages with tool calls, finish-reason normalization and deduplication, metrics emission, prompt-content suppression, OTel choice events with tool calls, provider-prefix resolution, and custom-provider suppression key verification.
Streaming and nesting tests
tests/test_streaming.py, tests/test_nesting.py
Verifies streaming span lifecycle (normal, early break, consumer error, async), CustomStreamWrapper preservation, nested re-entrant LiteLLM call suppression (single span), and coexistence with OpenAI instrumentation.
Sample tool-calling workflows
packages/sample-app/sample_app/litellm_completion.py, .../litellm_async_streaming.py
Synchronous and async agentic loop examples using LiteLLM tool calling with Traceloop initialization, local tool dispatch, and streamed final answers.

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

  • traceloop/openllmetry#4133: Threads the SDK-level use_attributes/use_legacy_attributes setting into the LiteLLM instrumentor's use_legacy_attributes config, which is the same flag wired by this PR.

Suggested reviewers

  • doronkopit5
  • max-deygin-traceloop

🐇 A brand new package hops into the fold,
Wrapping LiteLLM in spans of gold.
Streaming chunks accumulate with care,
Tool calls and finish reasons laid bare.
From litellm.chat to metrics galore—
This bunny traces every LLM score! 🎉

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 16.39% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly summarizes the main change: adding LiteLLM package instrumentation.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch nk/litellm_instrumentation

Comment @coderabbitai help to get the list of available commands.

@nina-kollman nina-kollman changed the title init instrumentation feat(instrumentation): Add litellm package instrumentation Jun 25, 2026
@nina-kollman nina-kollman marked this pull request as ready for review June 25, 2026 12:46
pyproject declares `readme = "README.md"`, but the file was absent, so the
hatchling editable build failed with `OSError: Readme file does not exist`,
breaking the Lint, Build, and Test CI jobs at the install step.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 8

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/opentelemetry-instrumentation-litellm/tests/test_nesting.py (1)

72-87: 🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win

Make uninstrument cleanup robust to partial setup failures.

At Line 74-76, both .instrument(...) calls happen before the try. If Line 75 raises, Line 86 won’t run for the already-instrumented OpenAI instrumentor. Move instrumentation inside try (or track success flags) to guarantee teardown.

Suggested fix
     openai_instrumentor = OpenAIInstrumentor()
     litellm_instrumentor = LiteLLMInstrumentor()
-    openai_instrumentor.instrument(tracer_provider=tracer_provider, meter_provider=meter_provider)
-    litellm_instrumentor.instrument(tracer_provider=tracer_provider, meter_provider=meter_provider)
     try:
+        openai_instrumentor.instrument(
+            tracer_provider=tracer_provider, meter_provider=meter_provider
+        )
+        litellm_instrumentor.instrument(
+            tracer_provider=tracer_provider, meter_provider=meter_provider
+        )
         litellm.completion(
             model="gpt-3.5-turbo",
             messages=MESSAGES,
             mock_response="hello",
         )
         spans = span_exporter.get_finished_spans()
         assert [span.name for span in spans] == ["litellm.chat"]
     finally:
         litellm_instrumentor.uninstrument()
         openai_instrumentor.uninstrument()
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/opentelemetry-instrumentation-litellm/tests/test_nesting.py` around
lines 72 - 87, The test cleanup in test_nesting.py is not robust if one
instrument() call fails after the other has already succeeded. Move the
OpenAIInstrumentor.instrument and LiteLLMInstrumentor.instrument setup inside
the try block in test_nesting, or track whether each instrumentor was
successfully enabled before calling uninstrument in finally, so partial setup
failures still guarantee teardown.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@packages/opentelemetry-instrumentation-litellm/opentelemetry/instrumentation/litellm/__init__.py`:
- Around line 696-705: Make LiteLLMInstrumentor.__init__() side-effect free: it
currently mutates process-wide Config state (exception_logger,
use_legacy_attributes, and get_common_metrics_attributes) as soon as the
instrumentor is constructed. Move those Config assignments out of __init__ and
into the actual instrumentation path (or guard them so they only run when
instrumentation is applied), so creating a new LiteLLMInstrumentor before the
is_instrumented_by_opentelemetry check does not change already-installed
wrappers or swap callbacks unexpectedly.
- Around line 115-122: The system resolution in _resolve_system_from_kwargs() is
returning raw LiteLLM provider prefixes instead of the published gen_ai.system
values. Update this helper to normalize aliases like azure and bedrock to the
constants used in opentelemetry.semconv_ai, and make sure the same normalized
value is what _start_span(), _record_metrics(), and _record_exception_metric()
use when custom_llm_provider is absent. Keep the fallback behavior for unknown
providers, but ensure the returned system key matches the repo’s semantic
convention mapping.

In `@packages/opentelemetry-instrumentation-litellm/tests/conftest.py`:
- Around line 71-83: The fixture setup leaks TRACELOOP_TRACE_CONTENT if
LiteLLMInstrumentor.instrument() fails before the yield, because the cleanup
after yield is skipped. Update the conftest.py fixture around
LiteLLMInstrumentor.instrument to guarantee cleanup with a try/finally block or
monkeypatch so os.environ.pop(TRACELOOP_TRACE_CONTENT, None) always runs even
when setup raises, and keep instrumentor.uninstrument() in the same guaranteed
cleanup path.

In `@packages/sample-app/sample_app/litellm_async_streaming.py`:
- Around line 173-178: The tool execution loop in litellm_async_streaming should
not abort when tc["arguments"] is malformed or when impl(**args) receives
invalid model-supplied parameters. Update the tool-call handling around the
tool_calls loop to catch JSON parsing and invocation errors, and convert them
into structured tool error messages (similar to the unknown tool fallback) so
tool_messages can still be returned and the final model turn can recover.
- Around line 110-118: The calculate() helper still uses eval(), and the current
character whitelist does not prevent expensive expressions like exponentiation
from model-controlled tool args. Replace the eval-based execution in calculate()
with a bounded AST-based parser that only allows the intended arithmetic
nodes/operators, and add explicit length/node limits so the expression is
rejected before evaluation if it is too large or complex.
- Around line 129-148: The _consume_stream function assumes every streamed chunk
has at least one choice, but the usage-only metadata chunk can have an empty
choices array and crash on chunk.choices[0]. Update _consume_stream in
litellm_async_streaming.py to guard against empty chunk.choices before accessing
delta, skipping those chunks while still accumulating content and tool_calls
from normal chunks.

In `@packages/sample-app/sample_app/litellm_completion.py`:
- Around line 153-158: The tool-call loop in litellm_completion should not abort
when tc.function.arguments is malformed or when impl(**args) receives bad
model-supplied parameters. Update the tool processing in the tool_calls loop to
catch json.loads and tool invocation failures, then return an error
payload/message for that specific tool call instead of raising, so the loop can
continue processing remaining calls. Use the existing tool registry flow around
TOOL_REGISTRY.get, tc.function.name, and the result assignment to centralize the
error handling.
- Line 16: The sample app dependency note only mentions OPENAI_API_KEY, but the
default Traceloop.init() flow also requires TRACELOOP_API_KEY for spans to be
emitted. Update the requirement/documentation in litellm_completion.py to
include TRACELOOP_API_KEY alongside OPENAI_API_KEY so the sample clearly states
all needed credentials. Refer to Traceloop.init and the existing “Requires” note
when making the change.

---

Outside diff comments:
In `@packages/opentelemetry-instrumentation-litellm/tests/test_nesting.py`:
- Around line 72-87: The test cleanup in test_nesting.py is not robust if one
instrument() call fails after the other has already succeeded. Move the
OpenAIInstrumentor.instrument and LiteLLMInstrumentor.instrument setup inside
the try block in test_nesting, or track whether each instrumentor was
successfully enabled before calling uninstrument in finally, so partial setup
failures still guarantee teardown.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 4b11ec10-2133-4153-9039-151f86f44fb6

📥 Commits

Reviewing files that changed from the base of the PR and between fb292d0 and a8ad748.

⛔ Files ignored due to path filters (3)
  • packages/opentelemetry-instrumentation-litellm/uv.lock is excluded by !**/*.lock
  • packages/sample-app/uv.lock is excluded by !**/*.lock
  • packages/traceloop-sdk/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (21)
  • packages/opentelemetry-instrumentation-litellm/opentelemetry/instrumentation/litellm/__init__.py
  • packages/opentelemetry-instrumentation-litellm/opentelemetry/instrumentation/litellm/config.py
  • packages/opentelemetry-instrumentation-litellm/opentelemetry/instrumentation/litellm/event_emitter.py
  • packages/opentelemetry-instrumentation-litellm/opentelemetry/instrumentation/litellm/event_models.py
  • packages/opentelemetry-instrumentation-litellm/opentelemetry/instrumentation/litellm/utils.py
  • packages/opentelemetry-instrumentation-litellm/opentelemetry/instrumentation/litellm/version.py
  • packages/opentelemetry-instrumentation-litellm/project.json
  • packages/opentelemetry-instrumentation-litellm/pyproject.toml
  • packages/opentelemetry-instrumentation-litellm/pytest.ini
  • packages/opentelemetry-instrumentation-litellm/tests/conftest.py
  • packages/opentelemetry-instrumentation-litellm/tests/test_completion.py
  • packages/opentelemetry-instrumentation-litellm/tests/test_custom_provider.py
  • packages/opentelemetry-instrumentation-litellm/tests/test_embeddings.py
  • packages/opentelemetry-instrumentation-litellm/tests/test_nesting.py
  • packages/opentelemetry-instrumentation-litellm/tests/test_streaming.py
  • packages/opentelemetry-semantic-conventions-ai/opentelemetry/semconv_ai/__init__.py
  • packages/sample-app/sample_app/litellm_async_streaming.py
  • packages/sample-app/sample_app/litellm_completion.py
  • packages/traceloop-sdk/pyproject.toml
  • packages/traceloop-sdk/traceloop/sdk/instruments.py
  • packages/traceloop-sdk/traceloop/sdk/tracing/tracing.py

Comment on lines +696 to +705
def __init__(
self,
exception_logger=None,
use_legacy_attributes=True,
get_common_metrics_attributes=lambda: {},
):
super().__init__()
Config.exception_logger = exception_logger
Config.use_legacy_attributes = use_legacy_attributes
Config.get_common_metrics_attributes = get_common_metrics_attributes

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win

Make LiteLLMInstrumentor.__init__() side-effect free.

These assignments mutate process-wide Config immediately. Because packages/traceloop-sdk/traceloop/sdk/tracing/tracing.py, Lines 807-812 constructs a new LiteLLMInstrumentor before checking is_instrumented_by_opentelemetry, a second init call can silently flip already-installed wrappers between legacy-attribute mode and event mode, or replace the common metric attributes callback, without re-instrumenting anything.

Suggested fix
 class LiteLLMInstrumentor(BaseInstrumentor):
@@
     def __init__(
         self,
         exception_logger=None,
         use_legacy_attributes=True,
         get_common_metrics_attributes=lambda: {},
     ):
         super().__init__()
-        Config.exception_logger = exception_logger
-        Config.use_legacy_attributes = use_legacy_attributes
-        Config.get_common_metrics_attributes = get_common_metrics_attributes
+        self._exception_logger = exception_logger
+        self._use_legacy_attributes = use_legacy_attributes
+        self._get_common_metrics_attributes = get_common_metrics_attributes
@@
     def _instrument(self, **kwargs):
+        Config.exception_logger = self._exception_logger
+        Config.use_legacy_attributes = self._use_legacy_attributes
+        Config.get_common_metrics_attributes = self._get_common_metrics_attributes
+
         tracer_provider = kwargs.get("tracer_provider")
         tracer = get_tracer(__name__, __version__, tracer_provider)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def __init__(
self,
exception_logger=None,
use_legacy_attributes=True,
get_common_metrics_attributes=lambda: {},
):
super().__init__()
Config.exception_logger = exception_logger
Config.use_legacy_attributes = use_legacy_attributes
Config.get_common_metrics_attributes = get_common_metrics_attributes
def __init__(
self,
exception_logger=None,
use_legacy_attributes=True,
get_common_metrics_attributes=lambda: {},
):
super().__init__()
self._exception_logger = exception_logger
self._use_legacy_attributes = use_legacy_attributes
self._get_common_metrics_attributes = get_common_metrics_attributes
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@packages/opentelemetry-instrumentation-litellm/opentelemetry/instrumentation/litellm/__init__.py`
around lines 696 - 705, Make LiteLLMInstrumentor.__init__() side-effect free: it
currently mutates process-wide Config state (exception_logger,
use_legacy_attributes, and get_common_metrics_attributes) as soon as the
instrumentor is constructed. Move those Config assignments out of __init__ and
into the actual instrumentation path (or guard them so they only run when
instrumentation is applied), so creating a new LiteLLMInstrumentor before the
is_instrumented_by_opentelemetry check does not change already-installed
wrappers or swap callbacks unexpectedly.

Comment thread packages/opentelemetry-instrumentation-litellm/tests/conftest.py Outdated
Comment thread packages/sample-app/sample_app/litellm_async_streaming.py
Comment thread packages/sample-app/sample_app/litellm_async_streaming.py
Comment thread packages/sample-app/sample_app/litellm_async_streaming.py Outdated
Comment thread packages/sample-app/sample_app/litellm_completion.py Outdated
Comment thread packages/sample-app/sample_app/litellm_completion.py Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/opentelemetry-instrumentation-litellm/README.md`:
- Around line 3-5: The README badge image is missing alt text, which triggers
markdownlint and hurts accessibility. Update the badge markup in the README so
the img element used for the PyPI badge includes descriptive alt text that
identifies the package badge, keeping the link and image placement unchanged.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b8e25a91-898c-42db-9f11-7407c9c3d20f

📥 Commits

Reviewing files that changed from the base of the PR and between a8ad748 and 879d384.

📒 Files selected for processing (1)
  • packages/opentelemetry-instrumentation-litellm/README.md

Comment on lines +3 to +5
<a href="https://pypi.org/project/opentelemetry-instrumentation-litellm/">
<img src="https://badge.fury.io/py/opentelemetry-instrumentation-litellm.svg">
</a>

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Add alt text to the badge image.

This trips markdownlint and leaves the badge inaccessible to screen readers.

♿ Proposed fix
 <a href="https://pypi.org/project/opentelemetry-instrumentation-litellm/">
-    <img src="https://badge.fury.io/py/opentelemetry-instrumentation-litellm.svg">
+    <img
+        src="https://badge.fury.io/py/opentelemetry-instrumentation-litellm.svg"
+        alt="PyPI version"
+    >
 </a>
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
<a href="https://pypi.org/project/opentelemetry-instrumentation-litellm/">
<img src="https://badge.fury.io/py/opentelemetry-instrumentation-litellm.svg">
</a>
<a href="https://pypi.org/project/opentelemetry-instrumentation-litellm/">
<img
src="https://badge.fury.io/py/opentelemetry-instrumentation-litellm.svg"
alt="PyPI version"
>
</a>
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)

[warning] 4-4: Images should have alternate text (alt text)

(MD045, no-alt-text)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/opentelemetry-instrumentation-litellm/README.md` around lines 3 - 5,
The README badge image is missing alt text, which triggers markdownlint and
hurts accessibility. Update the badge markup in the README so the img element
used for the PyPI badge includes descriptive alt text that identifies the
package badge, keeping the link and image placement unchanged.

Source: Linters/SAST tools

nina-kollman and others added 3 commits June 25, 2026 15:13
Bring the LiteLLM instrumentation up to the same semconv level as the
openai/anthropic packages:

- emit gen_ai.input.messages / gen_ai.output.messages (parts-based JSON
  schema) instead of the retired flat gen_ai.prompt.* / gen_ai.completion.*
  attributes
- set gen_ai.response.finish_reasons (top-level, ungated, deduped) and map
  provider finish reasons to the OTel vocabulary (tool_calls -> tool_call);
  per-message finish_reason always a string
- carry tool_calls + mapped finish_reason into gen_ai.choice events
- replace deprecated gen_ai.system with gen_ai.provider.name and
  llm.request.type with gen_ai.operation.name, using upstream well-known
  values (fixes azure.ai.openai, splits gcp.vertex_ai vs gcp.gemini)
- drop the now-unused GenAISystem.LITELLM enum member

Tests updated to assert the JSON schema, provider/operation attributes, and
finish-reason mapping/dedup; added event-mode tool_calls coverage.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@CLAassistant

Copy link
Copy Markdown

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ nina-kollman
❌ Gal Zilberstein


Gal Zilberstein seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
packages/opentelemetry-instrumentation-litellm/tests/test_events.py (1)

13-36: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

This test doesn't actually cover the use_legacy_attributes=False branch.

_emit_choice_events() is invoked directly here, and the supplied implementation for packages/opentelemetry-instrumentation-litellm/opentelemetry/instrumentation/litellm/__init__.py:427-441 does not read Config.use_legacy_attributes. That means Lines 13-36 still pass even if the public-path gate for event mode regresses. Either drop the config-specific framing from this test or drive the public instrumentation path that consults the flag.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/opentelemetry-instrumentation-litellm/tests/test_events.py` around
lines 13 - 36, The test is incorrectly framed as covering
Config.use_legacy_attributes=False because it calls _emit_choice_events()
directly, which bypasses the public path that checks the flag. Either remove the
config-specific assertion from test_events.py or rewrite the test to exercise
the public instrumentation entrypoint in
opentelemetry/instrumentation/litellm/__init__.py that actually consults
Config.use_legacy_attributes, so the test truly validates the branch.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@packages/opentelemetry-instrumentation-litellm/opentelemetry/instrumentation/litellm/__init__.py`:
- Around line 583-597: The _finalize() path in the litellm instrumentation can
leak spans if _model_as_dict(), _handle_response(), or _record_metrics() raises
before span.end() is reached. Update _finalize() so the finalization work is
wrapped defensively and span.end() always executes, while preserving the
existing provider resolution, GenAIAttributes.GEN_AI_PROVIDER_NAME setting, and
Status(StatusCode.OK) behavior when the span is recording.

---

Nitpick comments:
In `@packages/opentelemetry-instrumentation-litellm/tests/test_events.py`:
- Around line 13-36: The test is incorrectly framed as covering
Config.use_legacy_attributes=False because it calls _emit_choice_events()
directly, which bypasses the public path that checks the flag. Either remove the
config-specific assertion from test_events.py or rewrite the test to exercise
the public instrumentation entrypoint in
opentelemetry/instrumentation/litellm/__init__.py that actually consults
Config.use_legacy_attributes, so the test truly validates the branch.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1ac59e8b-e88a-4f4b-8712-089c075c15a1

📥 Commits

Reviewing files that changed from the base of the PR and between 1793ab9 and 1860c82.

📒 Files selected for processing (8)
  • packages/opentelemetry-instrumentation-litellm/opentelemetry/instrumentation/litellm/__init__.py
  • packages/opentelemetry-instrumentation-litellm/opentelemetry/instrumentation/litellm/event_emitter.py
  • packages/opentelemetry-instrumentation-litellm/tests/test_completion.py
  • packages/opentelemetry-instrumentation-litellm/tests/test_custom_provider.py
  • packages/opentelemetry-instrumentation-litellm/tests/test_embeddings.py
  • packages/opentelemetry-instrumentation-litellm/tests/test_events.py
  • packages/opentelemetry-instrumentation-litellm/tests/test_provider_resolution.py
  • packages/opentelemetry-instrumentation-litellm/tests/test_streaming.py
🚧 Files skipped from review as they are similar to previous changes (5)
  • packages/opentelemetry-instrumentation-litellm/tests/test_embeddings.py
  • packages/opentelemetry-instrumentation-litellm/tests/test_streaming.py
  • packages/opentelemetry-instrumentation-litellm/opentelemetry/instrumentation/litellm/event_emitter.py
  • packages/opentelemetry-instrumentation-litellm/tests/test_custom_provider.py
  • packages/opentelemetry-instrumentation-litellm/tests/test_completion.py

Comment on lines +583 to +597
def _finalize(span, metrics, event_logger, request_type, response, model, kwargs, duration, accumulated_provider=None):
provider = (
accumulated_provider
or _resolve_provider_from_response(response)
or _resolve_provider_from_kwargs(kwargs)
)
# Serialize the response once and reuse it for attributes, events, and metrics —
# model_dump() on a ModelResponse is the most expensive step in the wrapper.
response_dict = _model_as_dict(response)
_set_span_attribute(span, GenAIAttributes.GEN_AI_PROVIDER_NAME, provider)
_handle_response(span, request_type, response_dict, event_logger)
_record_metrics(metrics, provider, model, duration, response_dict, request_type)
if span.is_recording():
span.set_status(Status(StatusCode.OK))
span.end()

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

Guarantee span closure when finalization work fails.

_finalize() ends the span only after response serialization, response/event handling, and metrics recording all succeed. If any instrumentation-only step raises, the span leaks and the successful LiteLLM response path can be turned into an application error. Wrap finalization work so span.end() always runs.

Suggested fix
 def _finalize(span, metrics, event_logger, request_type, response, model, kwargs, duration, accumulated_provider=None):
-    provider = (
-        accumulated_provider
-        or _resolve_provider_from_response(response)
-        or _resolve_provider_from_kwargs(kwargs)
-    )
-    # Serialize the response once and reuse it for attributes, events, and metrics —
-    # model_dump() on a ModelResponse is the most expensive step in the wrapper.
-    response_dict = _model_as_dict(response)
-    _set_span_attribute(span, GenAIAttributes.GEN_AI_PROVIDER_NAME, provider)
-    _handle_response(span, request_type, response_dict, event_logger)
-    _record_metrics(metrics, provider, model, duration, response_dict, request_type)
-    if span.is_recording():
-        span.set_status(Status(StatusCode.OK))
-    span.end()
+    try:
+        provider = (
+            accumulated_provider
+            or _resolve_provider_from_response(response)
+            or _resolve_provider_from_kwargs(kwargs)
+        )
+        # Serialize the response once and reuse it for attributes, events, and metrics —
+        # model_dump() on a ModelResponse is the most expensive step in the wrapper.
+        response_dict = _model_as_dict(response)
+        _set_span_attribute(span, GenAIAttributes.GEN_AI_PROVIDER_NAME, provider)
+        _handle_response(span, request_type, response_dict, event_logger)
+        _record_metrics(metrics, provider, model, duration, response_dict, request_type)
+        if span.is_recording():
+            span.set_status(Status(StatusCode.OK))
+    finally:
+        span.end()
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def _finalize(span, metrics, event_logger, request_type, response, model, kwargs, duration, accumulated_provider=None):
provider = (
accumulated_provider
or _resolve_provider_from_response(response)
or _resolve_provider_from_kwargs(kwargs)
)
# Serialize the response once and reuse it for attributes, events, and metrics —
# model_dump() on a ModelResponse is the most expensive step in the wrapper.
response_dict = _model_as_dict(response)
_set_span_attribute(span, GenAIAttributes.GEN_AI_PROVIDER_NAME, provider)
_handle_response(span, request_type, response_dict, event_logger)
_record_metrics(metrics, provider, model, duration, response_dict, request_type)
if span.is_recording():
span.set_status(Status(StatusCode.OK))
span.end()
def _finalize(span, metrics, event_logger, request_type, response, model, kwargs, duration, accumulated_provider=None):
try:
provider = (
accumulated_provider
or _resolve_provider_from_response(response)
or _resolve_provider_from_kwargs(kwargs)
)
# Serialize the response once and reuse it for attributes, events, and metrics —
# model_dump() on a ModelResponse is the most expensive step in the wrapper.
response_dict = _model_as_dict(response)
_set_span_attribute(span, GenAIAttributes.GEN_AI_PROVIDER_NAME, provider)
_handle_response(span, request_type, response_dict, event_logger)
_record_metrics(metrics, provider, model, duration, response_dict, request_type)
if span.is_recording():
span.set_status(Status(StatusCode.OK))
finally:
span.end()
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@packages/opentelemetry-instrumentation-litellm/opentelemetry/instrumentation/litellm/__init__.py`
around lines 583 - 597, The _finalize() path in the litellm instrumentation can
leak spans if _model_as_dict(), _handle_response(), or _record_metrics() raises
before span.end() is reached. Update _finalize() so the finalization work is
wrapped defensively and span.end() always executes, while preserving the
existing provider resolution, GenAIAttributes.GEN_AI_PROVIDER_NAME setting, and
Status(StatusCode.OK) behavior when the span is recording.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants