Commit d1e8154

lovasoa and claude authored
Add OpenTelemetry distributed tracing support (#1234)
* add OpenTelemetry distributed tracing support

  When OTEL_EXPORTER_OTLP_ENDPOINT is set, enables full tracing pipeline with OTLP export, W3C traceparent propagation, and spans for HTTP requests, SQL file execution, DB pool acquire, and query execution. Falls back to env_logger when unset.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix OTel example: tracing init, Dockerfile, Tempo config

  - Fix tracing-log bridge initialization order (set subscriber first, then LogTracer) to avoid double-set panic
  - Add dedicated Dockerfile for example using release profile (avoids OOM with superoptimized LTO in Docker)
  - Use debian:trixie-slim runtime for glibc compatibility
  - Fix nginx image to nginx:otel (official image with OTel module)
  - Fix nginx.conf: move otel_trace directives into location block
  - Pin Tempo to 2.6.1 (latest has partition ring issues in single-node)
  - Fix otel-collector exporter alias (otlp → otlp_grpc)

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* propagate trace context to PostgreSQL via application_name

  After acquiring a DB connection, set the W3C traceparent as the PostgreSQL application_name (or MySQL session variable). This makes trace IDs visible in pg_stat_activity and PostgreSQL logs, enabling direct correlation between Grafana Tempo traces and database-side monitoring.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* rewrite OTel example README with setup guides for all major providers

  Comprehensive documentation covering:
  - Step-by-step quick start for the Docker Compose example
  - How OpenTelemetry works (spans, collectors, backends)
  - Setup guides for Grafana Tempo, Jaeger, Grafana Cloud, Datadog, Honeycomb, New Relic, and Axiom with exact env vars and doc links
  - PostgreSQL trace correlation via application_name
  - Environment variable reference
  - Troubleshooting section

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix todo example: use :title (POST variable) instead of $title

  The form submits via POST, so the title field must be referenced with the : prefix (POST parameter) rather than $ (GET parameter).

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* set code.filepath and code.lineno span attributes to user SQL files

  OTel span attributes now reference the user's .sql file path and line number instead of the SQLPage Rust source code. Also improves span naming, adds JSON log formatting, custom root span builder, and Grafana dashboard provisioning for the OTel example.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* npm run fix

* use stable OTel semantic convention attribute names

  - code.filepath → code.file.path, code.lineno → code.line.number
  - db.statement → db.query.text, db.system → db.system.name
  - Disable auto code location (.with_location(false)) so spans reference user SQL files, not SQLPage Rust source
  - Remove redundant sqlpage.file attribute (code.file.path suffices)

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* use OTel semantic convention values for db.system.name

  Use well-known values from the OpenTelemetry registry instead of raw DBMS name strings. Cast line numbers to i64 for correct span recording.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* remove json-subscriber dependency

  The custom logfmt layer in telemetry.rs replaces it with zero extra dependencies and precise control over field selection and ordering.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* add Loki + Promtail log aggregation to OTel example

  Adds two new services (Loki, Promtail) to scrape SQLPage container logs and display them in Grafana alongside traces. The home dashboard now shows a logs panel with trace_id derived fields linking to Tempo.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* add OTel spans for request parsing, rendering, sqlpage functions, and OIDC

  Add targeted spans to account for previously untraced time:
  - http.parse_request: request/form parsing before SQL execution
  - render: template rendering and response streaming
  - subprocess: sqlpage.exec() with process.command attribute
  - http.client: sqlpage.fetch()/fetch_with_meta() with OTel HTTP client semantic conventions (http.request.method, url.full, http.response.status_code)
  - sqlpage.file: sqlpage.run_sql() nested file execution
  - oidc.callback + http.client: OIDC token exchange

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* add oidc.jwt.verify span for OIDC token verification

  This span covers JWT signature verification and claims validation, which runs on every authenticated request via get_token_claims().

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* add enduser.id attribute to oidc.jwt.verify span

  Records the OIDC subject claim (sub) as enduser.id after successful JWT verification, following OTel semantic conventions.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* add OTel user.* attributes to oidc.jwt.verify span

  Record user.id (sub), user.name (preferred_username), user.full_name (name), and user.email from OIDC claims after JWT verification.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* add sqlpage.file.load span and attributes to http.parse_request

  The gap before http.parse_request was the SQL file cache lookup - now covered by the sqlpage.file.load span with code.file.path. http.parse_request now records http.request.method and content_type, which helps identify slow multipart/form-data parsing.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix clippy pedantic warnings

* unify log format: logfmt with colors, no OTel noise

  Use the custom logfmt layer for both OTel and non-OTel modes instead of falling back to env_logger. This eliminates the tracing→log bridge dumping all span fields (user agents, otel.kind, request_id, etc.) and only shows: ts, level, target, msg, method, path, status, file, client_ip, and trace_id (when valid).

  Adds terminal color support (bold red for errors, green for info, dim for timestamps/targets).

  Emits one log line per completed successful request. Errors are logged once by the error handler. Suppresses trace_id=000...0 when no real trace context exists.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* use .instrument() instead of .entered() for async spans

  Span guards from .entered() do not propagate correctly across await points. Switch to tracing::Instrument to ensure spans are properly associated with their async tasks throughout their lifetime.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* preserve multi-line error formatting in terminal log output

  When stderr is a terminal and the log message contains newlines (e.g. SQL syntax errors with source highlighting and arrows), print the metadata on the first line and the message below with its original formatting. Machine output (non-terminal) remains single-line logfmt.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* use root Dockerfile for OTel example, add CARGO_PROFILE build arg

  Remove the example's custom Dockerfile and use the main one with a CARGO_PROFILE=release build arg to avoid OOM from fat LTO in memory-constrained Docker environments. The build scripts now read CARGO_PROFILE from the environment, defaulting to superoptimized for backward compatibility.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* add db.query.parameter and db.response.returned_rows span attributes

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Add official blog post about tracing

* add http.request.body.size and url.query span attributes

  Add http.request.body.size to HTTP client spans (fetch, fetch_with_meta) and to the server-side http.parse_request span (from Content-Length header). Add url.query to the http.parse_request span.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Replace Promtail with the OpenTelemetry Collector

* Refactor telemetry logging helpers

* Clamp traced fetch body size

* Clamp traced fetch_with_meta body size

* Silence noisy PostgreSQL collector logs

* make startup logs parseable

* update terminal log formats

* Ingest real PostgreSQL logs with trace IDs

* Use raw traceparent for PostgreSQL tracing

* Update opentelemetry example for PostgreSQL query events

* Add nginx logs to opentelemetry example

* Parse nginx error log severity correctly

* Log all span fields when debug logging is enabled

* Rename telemetry example directory

* Fix PostgreSQL Loki log ingestion

* `LOG_LEVEL` is now the primary environment variable for configuring SQLPage's log filter. `RUST_LOG` remains supported as an alias.

* Skip empty trace IDs in logs

* add db errors to otel traces

* add healthcheck

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 99ad268 commit d1e8154
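
The commit message refers to W3C traceparent propagation and to suppressing all-zero trace IDs in logs. As a rough illustration of the header format only (not SQLPage's actual code, which is not shown on this page), here is a minimal standard-library sketch that parses and re-serializes a version-00 `traceparent` value. The `TraceParent` type and its method names are hypothetical.

```rust
/// Hypothetical holder for the fields of a W3C `traceparent` header (version 00):
/// `00-<32 hex trace id>-<16 hex parent id>-<2 hex flags>`.
#[derive(Debug, PartialEq, Eq)]
pub struct TraceParent {
    pub trace_id: u128,
    pub parent_id: u64,
    pub flags: u8,
}

impl TraceParent {
    /// Parses a version-00 header. Returns `None` for malformed input and for
    /// all-zero trace or parent IDs, which the specification treats as invalid.
    pub fn parse(header: &str) -> Option<Self> {
        let mut parts = header.trim().split('-');
        let version = parts.next()?;
        let trace = parts.next()?;
        let parent = parts.next()?;
        let flags = parts.next()?;
        // Version 00 has exactly four fields.
        if parts.next().is_some() || version != "00" {
            return None;
        }
        if trace.len() != 32 || parent.len() != 16 || flags.len() != 2 {
            return None;
        }
        // The header uses lowercase hexadecimal only.
        let is_lower_hex =
            |s: &str| s.bytes().all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'));
        if !(is_lower_hex(trace) && is_lower_hex(parent) && is_lower_hex(flags)) {
            return None;
        }
        let trace_id = u128::from_str_radix(trace, 16).ok()?;
        let parent_id = u64::from_str_radix(parent, 16).ok()?;
        let flags = u8::from_str_radix(flags, 16).ok()?;
        if trace_id == 0 || parent_id == 0 {
            return None;
        }
        Some(Self { trace_id, parent_id, flags })
    }

    /// Serializes back to the header form, zero-padded to the fixed widths.
    pub fn to_header(&self) -> String {
        format!("00-{:032x}-{:016x}-{:02x}", self.trace_id, self.parent_id, self.flags)
    }

    /// The lowest flag bit marks the trace as sampled.
    pub fn is_sampled(&self) -> bool {
        self.flags & 0x01 != 0
    }
}
```

A value parsed this way could then be forwarded unchanged, for instance as the PostgreSQL `application_name` that the commit message describes; how SQLPage actually does that is not visible here.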

32 files changed

Lines changed: 3178 additions & 178 deletions

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -4,6 +4,7 @@
 
 - Fixed a bug where the single-sign-on oidc code would generate an unbounded amount of cookies when receiving many unauthenticated requests in sequence.
 - Fix: invalid UTF-8 in multipart text fields now returns `400 Bad Request` instead of `500 Internal Server Error`.
+- Logging: `LOG_LEVEL` is now the primary environment variable for configuring SQLPage's log filter. `RUST_LOG` remains supported as an alias.
 
 ## 0.43.0
 
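
The changelog entry above makes `LOG_LEVEL` the primary variable and keeps `RUST_LOG` as an alias. One plausible reading of that rule is sketched below using only the standard library. The function name `resolve_log_filter`, the choice to let `LOG_LEVEL` win when both are set, the skipping of empty values, and the `"info"` default are all assumptions made for illustration, not behavior confirmed by this commit page.

```rust
/// Hypothetical helper: picks the log filter string from the environment.
/// `lookup` abstracts the environment so the rule can be exercised without
/// mutating real process variables.
pub fn resolve_log_filter<F>(lookup: F) -> String
where
    F: Fn(&str) -> Option<String>,
{
    // Assumption: the primary variable is checked first, then the alias.
    for name in ["LOG_LEVEL", "RUST_LOG"] {
        if let Some(value) = lookup(name) {
            let trimmed = value.trim();
            // Assumption: an empty value counts as unset.
            if !trimmed.is_empty() {
                return trimmed.to_string();
            }
        }
    }
    // Assumption: fall back to "info" when neither variable is set.
    "info".to_string()
}
```

Against the real environment this would be called as `resolve_log_filter(|name| std::env::var(name).ok())`.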
