Commit 81bb305

typotter and devflow.devflow-routing-intake authored
Add flag evaluation metrics via OTel counter and OpenFeature Hook (#11040)
Record a `feature_flag.evaluations` OTel counter on every flag evaluation using an OpenFeature `finallyAfter` hook. The hook captures all evaluation paths, including type mismatches that occur above the provider level.

Attributes: feature_flag.key, feature_flag.result.variant, feature_flag.result.reason, error.type (on error), feature_flag.result.allocation_key (when present).

Counter is a no-op when DD_METRICS_OTEL_ENABLED is false or opentelemetry-api is absent from the classpath.

Use own SdkMeterProvider with OTLP HTTP exporter for eval metrics

Replace GlobalOpenTelemetry.getMeterProvider() with a dedicated SdkMeterProvider + OtlpHttpMetricExporter that sends metrics directly to the DD Agent's OTLP endpoint (default :4318/v1/metrics). This avoids the agent's OTel class shading issue where the agent relocates io.opentelemetry.api.* to datadog.trace.bootstrap.otel.api.*, making GlobalOpenTelemetry calls from the dd-openfeature jar hit the unshaded no-op provider instead of the agent's shim.

Requires opentelemetry-sdk-metrics and opentelemetry-exporter-otlp on the application classpath. Falls back to no-op if absent.

System tests: 11/17 pass. 6 failures are pre-existing DDEvaluator gaps (reason mapping, parse errors, type mismatch strictness).

Address code review feedback for eval metrics

- Add explicit null guard for details in FlagEvalHook.finallyAfter()
- Add OTEL_EXPORTER_OTLP_ENDPOINT generic env var fallback with /v1/metrics path appended (per OTel spec fallback chain)
- Add comments clarifying signal-specific vs generic endpoint behavior

Fix NoClassDefFoundError when OTel SDK absent from classpath

When the OTel SDK jars are not on the application classpath, loading FlagEvalMetrics fails because field types reference OTel SDK classes (SdkMeterProvider). This propagated as an uncaught NoClassDefFoundError from the Provider constructor, crashing provider initialization.

Fix:
- Change meterProvider field type from SdkMeterProvider to Closeable (always on classpath), use local SdkMeterProvider variable inside try block
- Catch NoClassDefFoundError in Provider constructor when creating FlagEvalMetrics
- Null-safe getProviderHooks() and shutdown() when metrics is null

Move FlagEvalHook construction inside try/catch block

FlagEvalHook references FlagEvalMetrics in its field declaration. On JVMs that eagerly verify field types during class loading, constructing FlagEvalHook outside the try/catch could throw NoClassDefFoundError if OTel classes failed to load. Moving it inside the try block ensures both metrics and hook are null-safe when OTel is absent.

Add README for dd-openfeature with eval metrics setup

Documents the published artifact setup, evaluation metrics dependencies (opentelemetry-sdk-metrics, opentelemetry-exporter-otlp), OTLP endpoint configuration, metric attributes, and requirements.

Use ConfigHelper.env() instead of System.getenv()

System.getenv() is forbidden by the project's forbiddenApis rules. Replace it with ConfigHelper.env(), which is the approved way to read environment variables. Add config-utils as compileOnly dependency.

Address PR review feedback from manuel-alvarez-alvarez

- Remove transitive openfeature-sdk dep from README setup section
- Import ErrorCode at top of FlagEvalHook instead of inline FQN

Merge remote-tracking branch 'origin/master' into typo/evaluations-logging

Add evaluationLogging option and log errors when OTel SDK absent

- Add Options.evaluationLogging(boolean), default true per EVALLOG.12
- When disabled: no metrics, no hook, no error
- When enabled + OTel SDK missing: log.error with instructions to add deps or disable, degrade to no-op (matches Go/Python pattern)
- When enabled + OTel init failure: log.error with message, degrade
- Remove silent catch: FlagEvalMetrics now logs at error level for NoClassDefFoundError and at error level for other init failures

Fix feature_flag.evaluations metric count always being zero

The OTel SDK defaults to DELTA temporality for counters. The Datadog agent converts OTLP delta monotonic sums to rate metrics by dividing by the export interval (10s). Five evaluations in under 1s produce ~0.5, which rounds to zero in the points payload.

Force CUMULATIVE temporality on the OtlpHttpMetricExporter so the agent receives an absolute count rather than a rate, making test_ffe_eval_metric_count reliable.

test(openfeature): verify cumulative temporality and count accumulation in FlagEvalMetrics

Address internal review feedback

- Remove exporterIsConfiguredWithCumulativeTemporalityForCounters test (tested OTel SDK, not our code; the integration test is the real regression guard)
- Fix Provider catch block comment to reflect that FlagEvalMetrics may not have logged if we reach this point
- Include exception in log.error calls for NoClassDefFoundError and general Exception to aid debugging
- Reword InMemoryMetricReader comment for precision

Improve error handling observability

- Add debug log to FlagEvalMetrics.record() catch block so metric recording failures are visible in debug logs
- Widen Provider catch from NoClassDefFoundError to LinkageError to cover IncompatibleClassChangeError and other classloader issues from incompatible OTel SDK versions
- Add slf4j logger to Provider and log at error level when the fallback catch fires

Use warn level for Provider fallback catch

The Provider catch is defense-in-depth for when the FlagEvalMetrics class itself can't load (OTel API absent entirely). The detailed error message is logged inside FlagEvalMetrics when it CAN load but SDK init fails. Using error level here caused the openfeature smoke test to fail (it asserts no ERROR entries in application logs).

Remove evaluationLogging option: metrics always enabled

Evaluation metrics are always attempted. If the OTel SDK is absent, the provider degrades gracefully with a warning. There is no user-facing toggle to disable metrics; this matches the Go and Python SDKs, which also always attempt metrics.

Co-authored-by: devflow.devflow-routing-intake <devflow.devflow-routing-intake@kubernetes.us1.ddbuild.io>
1 parent f064e18 commit 81bb305
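
The arithmetic behind the zero-count fix described in the commit message can be sketched in a few lines. This is an illustration only: `TemporalityArithmetic` and its methods are hypothetical names, and the agent's delta-to-rate conversion is assumed from the commit message's description, not taken from agent code.

```java
// Illustrative arithmetic only. The names are hypothetical and the agent's
// conversion is assumed from the commit message, not copied from its source.
final class TemporalityArithmetic {

  // A delta sum turned into a per-second rate over the export interval.
  static double deltaAsRate(long deltaCount, long exportIntervalSeconds) {
    return (double) deltaCount / exportIntervalSeconds;
  }

  // A cumulative sum is the running total, independent of the export interval.
  static long cumulativeTotal(long previousTotal, long newEvaluations) {
    return previousTotal + newEvaluations;
  }

  public static void main(String[] args) {
    // Five evaluations inside one 10-second export window.
    System.out.println(deltaAsRate(5, 10));    // prints 0.5
    System.out.println(cumulativeTotal(0, 5)); // prints 5
  }
}
```

With delta temporality the five evaluations surface as a rate of 0.5, which is the value the commit message says rounds to zero; with cumulative temporality the receiver sees the absolute count of 5.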

8 files changed

Lines changed: 661 additions & 0 deletions

Lines changed: 79 additions & 0 deletions
@@ -0,0 +1,79 @@
# dd-openfeature

Datadog OpenFeature Provider for Java. Implements the [OpenFeature](https://openfeature.dev/) `FeatureProvider` interface for Datadog's Feature Flags and Experimentation (FFE) product.

Published as `com.datadoghq:dd-openfeature` on Maven Central.

## Setup

```xml
<dependency>
  <groupId>com.datadoghq</groupId>
  <artifactId>dd-openfeature</artifactId>
  <version>${dd-openfeature.version}</version>
</dependency>
```

The OpenFeature SDK (`dev.openfeature:sdk`) is included as a transitive dependency.

### Evaluation metrics (optional)

To enable evaluation metrics (`feature_flag.evaluations` counter), add the OpenTelemetry SDK dependencies:

```xml
<dependency>
  <groupId>io.opentelemetry</groupId>
  <artifactId>opentelemetry-sdk-metrics</artifactId>
  <version>1.47.0</version>
</dependency>
<dependency>
  <groupId>io.opentelemetry</groupId>
  <artifactId>opentelemetry-exporter-otlp</artifactId>
  <version>1.47.0</version>
</dependency>
```

Any OpenTelemetry API 1.x version is compatible. If these dependencies are absent, the provider operates normally without metrics.

## Usage

```java
import datadog.trace.api.openfeature.Provider;
import dev.openfeature.sdk.Client;
import dev.openfeature.sdk.MutableContext;
import dev.openfeature.sdk.OpenFeatureAPI;

OpenFeatureAPI api = OpenFeatureAPI.getInstance();
api.setProviderAndWait(new Provider());
Client client = api.getClient();

boolean enabled = client.getBooleanValue("my-feature", false,
    new MutableContext("user-123"));
```

## Evaluation metrics

When the OTel SDK dependencies are on the classpath, the provider records a `feature_flag.evaluations` counter and exports it over OTLP (HTTP/protobuf). Metrics are exported every 10 seconds to the Datadog Agent's OTLP receiver.

### Configuration

The endpoint is resolved in the order shown; the first variable that is set wins.

| Environment variable | Description | Default |
|---|---|---|
| `OTEL_EXPORTER_OTLP_METRICS_ENDPOINT` | Signal-specific OTLP endpoint (used as-is) | (none) |
| `OTEL_EXPORTER_OTLP_ENDPOINT` | Generic OTLP endpoint (`/v1/metrics` appended) | (none) |
| (none set) | Default endpoint | `http://localhost:4318/v1/metrics` |

### Metric attributes

| Attribute | Description |
|---|---|
| `feature_flag.key` | Flag key |
| `feature_flag.result.variant` | Resolved variant key |
| `feature_flag.result.reason` | Evaluation reason (lowercased) |
| `error.type` | Error code (lowercased, only on error) |
| `feature_flag.result.allocation_key` | Allocation key (when present) |

## Requirements

- Java 11+
- Datadog Agent with Remote Configuration enabled
- `DD_EXPERIMENTAL_FLAGGING_PROVIDER_ENABLED=true`

products/feature-flagging/feature-flagging-api/build.gradle.kts

Lines changed: 8 additions & 0 deletions
@@ -44,11 +44,19 @@ dependencies {
   api("dev.openfeature:sdk:1.20.1")
 
   compileOnly(project(":products:feature-flagging:feature-flagging-bootstrap"))
+  compileOnly(project(":utils:config-utils"))
+  compileOnly("io.opentelemetry:opentelemetry-api:1.47.0")
+  compileOnly("io.opentelemetry:opentelemetry-sdk-metrics:1.47.0")
+  compileOnly("io.opentelemetry:opentelemetry-exporter-otlp:1.47.0")
 
   testImplementation(project(":products:feature-flagging:feature-flagging-bootstrap"))
+  testImplementation("io.opentelemetry:opentelemetry-api:1.47.0")
+  testImplementation("io.opentelemetry:opentelemetry-sdk-metrics:1.47.0")
+  testImplementation("io.opentelemetry:opentelemetry-exporter-otlp:1.47.0")
   testImplementation(libs.bundles.junit5)
   testImplementation(libs.bundles.mockito)
   testImplementation(libs.moshi)
+  testImplementation("io.opentelemetry:opentelemetry-sdk-testing:1.47.0")
   testImplementation("org.awaitility:awaitility:4.3.0")
 }
 

Lines changed: 41 additions & 0 deletions
@@ -0,0 +1,41 @@
package datadog.trace.api.openfeature;

import dev.openfeature.sdk.ErrorCode;
import dev.openfeature.sdk.FlagEvaluationDetails;
import dev.openfeature.sdk.Hook;
import dev.openfeature.sdk.HookContext;
import dev.openfeature.sdk.ImmutableMetadata;
import java.util.Map;

class FlagEvalHook implements Hook<Object> {

  private final FlagEvalMetrics metrics;

  FlagEvalHook(FlagEvalMetrics metrics) {
    this.metrics = metrics;
  }

  @Override
  public void finallyAfter(
      HookContext<Object> ctx, FlagEvaluationDetails<Object> details, Map<String, Object> hints) {
    if (metrics == null || details == null) {
      return;
    }
    try {
      String flagKey = details.getFlagKey();
      String variant = details.getVariant();
      String reason = details.getReason();
      ErrorCode errorCode = details.getErrorCode();

      String allocationKey = null;
      ImmutableMetadata metadata = details.getFlagMetadata();
      if (metadata != null) {
        allocationKey = metadata.getString("allocationKey");
      }

      metrics.record(flagKey, variant, reason, errorCode, allocationKey);
    } catch (Exception e) {
      // Never let metrics recording break flag evaluation
    }
  }
}

Lines changed: 157 additions & 0 deletions
@@ -0,0 +1,157 @@
package datadog.trace.api.openfeature;

import datadog.trace.config.inversion.ConfigHelper;
import dev.openfeature.sdk.ErrorCode;
import io.opentelemetry.api.common.AttributeKey;
import io.opentelemetry.api.common.Attributes;
import io.opentelemetry.api.common.AttributesBuilder;
import io.opentelemetry.api.metrics.LongCounter;
import io.opentelemetry.api.metrics.Meter;
import io.opentelemetry.exporter.otlp.http.metrics.OtlpHttpMetricExporter;
import io.opentelemetry.sdk.metrics.SdkMeterProvider;
import io.opentelemetry.sdk.metrics.export.AggregationTemporalitySelector;
import io.opentelemetry.sdk.metrics.export.PeriodicMetricReader;
import java.io.Closeable;
import java.time.Duration;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

class FlagEvalMetrics implements Closeable {

  private static final Logger log = LoggerFactory.getLogger(FlagEvalMetrics.class);

  private static final String METER_NAME = "ddtrace.openfeature";
  private static final String METRIC_NAME = "feature_flag.evaluations";
  private static final String METRIC_UNIT = "{evaluation}";
  private static final String METRIC_DESC = "Number of feature flag evaluations";
  private static final Duration EXPORT_INTERVAL = Duration.ofSeconds(10);

  private static final String DEFAULT_ENDPOINT = "http://localhost:4318/v1/metrics";
  // Signal-specific env var (used as-is, must include /v1/metrics path)
  private static final String ENDPOINT_ENV = "OTEL_EXPORTER_OTLP_METRICS_ENDPOINT";
  // Generic env var fallback (base URL, /v1/metrics is appended)
  private static final String ENDPOINT_GENERIC_ENV = "OTEL_EXPORTER_OTLP_ENDPOINT";

  private static final AttributeKey<String> ATTR_FLAG_KEY =
      AttributeKey.stringKey("feature_flag.key");
  private static final AttributeKey<String> ATTR_VARIANT =
      AttributeKey.stringKey("feature_flag.result.variant");
  private static final AttributeKey<String> ATTR_REASON =
      AttributeKey.stringKey("feature_flag.result.reason");
  private static final AttributeKey<String> ATTR_ERROR_TYPE = AttributeKey.stringKey("error.type");
  private static final AttributeKey<String> ATTR_ALLOCATION_KEY =
      AttributeKey.stringKey("feature_flag.result.allocation_key");

  private volatile LongCounter counter;
  // Typed as Closeable to avoid loading SdkMeterProvider at class-load time
  // when the OTel SDK is absent from the classpath
  private volatile Closeable meterProvider;

  FlagEvalMetrics() {
    try {
      String endpoint = ConfigHelper.env(ENDPOINT_ENV);
      if (endpoint == null || endpoint.isEmpty()) {
        String base = ConfigHelper.env(ENDPOINT_GENERIC_ENV);
        if (base != null && !base.isEmpty()) {
          endpoint = base.endsWith("/") ? base + "v1/metrics" : base + "/v1/metrics";
        } else {
          endpoint = DEFAULT_ENDPOINT;
        }
      }

      OtlpHttpMetricExporter exporter =
          OtlpHttpMetricExporter.builder()
              .setEndpoint(endpoint)
              .setAggregationTemporalitySelector(AggregationTemporalitySelector.alwaysCumulative())
              .build();

      PeriodicMetricReader reader =
          PeriodicMetricReader.builder(exporter).setInterval(EXPORT_INTERVAL).build();

      SdkMeterProvider sdkMeterProvider =
          SdkMeterProvider.builder().registerMetricReader(reader).build();
      meterProvider = sdkMeterProvider;

      Meter meter = sdkMeterProvider.meterBuilder(METER_NAME).build();
      counter =
          meter
              .counterBuilder(METRIC_NAME)
              .setUnit(METRIC_UNIT)
              .setDescription(METRIC_DESC)
              .build();

      log.debug("Flag evaluation metrics initialized, exporting to {}", endpoint);
    } catch (NoClassDefFoundError e) {
      log.error(
          "OpenTelemetry SDK is not on the classpath — evaluation metrics disabled. "
              + "Add opentelemetry-sdk-metrics and opentelemetry-exporter-otlp to your dependencies "
              + "to enable flag evaluation metrics.",
          e);
      counter = null;
      meterProvider = null;
    } catch (Exception e) {
      log.error("Failed to initialize flag evaluation metrics", e);
      counter = null;
      meterProvider = null;
    }
  }

  /** Package-private constructor for testing with a mock counter. */
  FlagEvalMetrics(LongCounter counter) {
    this.counter = counter;
    this.meterProvider = null;
  }

  /** Package-private constructor for integration testing with an injected SdkMeterProvider. */
  FlagEvalMetrics(SdkMeterProvider sdkMeterProvider) {
    meterProvider = sdkMeterProvider;
    Meter meter = sdkMeterProvider.meterBuilder(METER_NAME).build();
    counter =
        meter.counterBuilder(METRIC_NAME).setUnit(METRIC_UNIT).setDescription(METRIC_DESC).build();
  }

  void record(
      String flagKey, String variant, String reason, ErrorCode errorCode, String allocationKey) {
    LongCounter c = counter;
    if (c == null) {
      return;
    }
    try {
      AttributesBuilder builder =
          Attributes.builder()
              .put(ATTR_FLAG_KEY, flagKey)
              .put(ATTR_VARIANT, variant != null ? variant : "")
              .put(ATTR_REASON, reason != null ? reason.toLowerCase() : "unknown");

      if (errorCode != null) {
        builder.put(ATTR_ERROR_TYPE, errorCode.name().toLowerCase());
      }

      if (allocationKey != null && !allocationKey.isEmpty()) {
        builder.put(ATTR_ALLOCATION_KEY, allocationKey);
      }

      c.add(1, builder.build());
    } catch (Exception e) {
      log.debug("Failed to record flag evaluation metric for {}", flagKey, e);
    }
  }

  @Override
  public void close() {
    shutdown();
  }

  void shutdown() {
    counter = null;
    Closeable mp = meterProvider;
    if (mp != null) {
      meterProvider = null;
      try {
        mp.close();
      } catch (Exception e) {
        // Ignore shutdown errors
      }
    }
  }
}
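
The attribute rules applied in `record` can be tried out without any OpenTelemetry dependency. The sketch below mirrors them with a plain `Map`; `EvalAttributes` and `build` are hypothetical names, the error code is passed as a `String` instead of the OpenFeature `ErrorCode` enum, and `Locale.ROOT` is used for lowercasing, which is a choice made for this sketch and not something the commit does.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Dependency-free sketch of the attribute rules. Names are hypothetical and a
// Map stands in for the OTel Attributes type used by the real class.
final class EvalAttributes {

  static Map<String, String> build(
      String flagKey, String variant, String reason, String errorCode, String allocationKey) {
    Map<String, String> attrs = new LinkedHashMap<>();
    attrs.put("feature_flag.key", flagKey);
    // A missing variant is recorded as an empty string.
    attrs.put("feature_flag.result.variant", variant != null ? variant : "");
    // Reasons are lowercased; a missing reason becomes "unknown".
    attrs.put(
        "feature_flag.result.reason",
        reason != null ? reason.toLowerCase(Locale.ROOT) : "unknown");
    // error.type is only present when the evaluation failed.
    if (errorCode != null) {
      attrs.put("error.type", errorCode.toLowerCase(Locale.ROOT));
    }
    // The allocation key is only present when non-empty.
    if (allocationKey != null && !allocationKey.isEmpty()) {
      attrs.put("feature_flag.result.allocation_key", allocationKey);
    }
    return attrs;
  }

  public static void main(String[] args) {
    System.out.println(build("my-feature", "on", "TARGETING_MATCH", null, "alloc-1"));
  }
}
```

Each distinct attribute combination becomes its own time series for the counter, so the two optional attributes are omitted when absent instead of being written as empty values.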

products/feature-flagging/feature-flagging-api/src/main/java/datadog/trace/api/openfeature/Provider.java

Lines changed: 31 additions & 0 deletions
@@ -5,6 +5,7 @@
 import de.thetaphi.forbiddenapis.SuppressForbidden;
 import dev.openfeature.sdk.EvaluationContext;
 import dev.openfeature.sdk.EventProvider;
+import dev.openfeature.sdk.Hook;
 import dev.openfeature.sdk.Metadata;
 import dev.openfeature.sdk.ProviderEvaluation;
 import dev.openfeature.sdk.ProviderEvent;
@@ -14,17 +15,24 @@
 import dev.openfeature.sdk.exceptions.OpenFeatureError;
 import dev.openfeature.sdk.exceptions.ProviderNotReadyError;
 import java.lang.reflect.Constructor;
+import java.util.Collections;
+import java.util.List;
 import java.util.concurrent.TimeUnit;
 import java.util.concurrent.atomic.AtomicBoolean;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
 
 public class Provider extends EventProvider implements Metadata {
 
+  private static final Logger log = LoggerFactory.getLogger(Provider.class);
   static final String METADATA = "datadog-openfeature-provider";
   private static final String EVALUATOR_IMPL = "datadog.trace.api.openfeature.DDEvaluator";
   private static final Options DEFAULT_OPTIONS = new Options().initTimeout(30, SECONDS);
   private volatile Evaluator evaluator;
   private final Options options;
   private final AtomicBoolean initialized = new AtomicBoolean(false);
+  private final FlagEvalMetrics flagEvalMetrics;
+  private final FlagEvalHook flagEvalHook;
 
   public Provider() {
     this(DEFAULT_OPTIONS, null);
@@ -37,6 +45,18 @@ public Provider(final Options options) {
   Provider(final Options options, final Evaluator evaluator) {
     this.options = options;
     this.evaluator = evaluator;
+    FlagEvalMetrics metrics = null;
+    FlagEvalHook hook = null;
+    try {
+      metrics = new FlagEvalMetrics();
+      hook = new FlagEvalHook(metrics);
+    } catch (LinkageError | Exception e) {
+      // FlagEvalMetrics logs the detailed error when it can load but OTel SDK init fails.
+      // This outer catch fires when the class itself can't load (OTel API absent entirely).
+      log.warn("Evaluation metrics unavailable — OTel classes not on classpath", e);
+    }
+    this.flagEvalMetrics = metrics;
+    this.flagEvalHook = hook;
   }
 
   @Override
@@ -77,8 +97,19 @@ private Evaluator buildEvaluator() throws Exception {
     return (Evaluator) ctor.newInstance((Runnable) this::onConfigurationChange);
   }
 
+  @Override
+  public List<Hook> getProviderHooks() {
+    if (flagEvalHook == null) {
+      return Collections.emptyList();
+    }
+    return Collections.singletonList(flagEvalHook);
+  }
+
   @Override
   public void shutdown() {
+    if (flagEvalMetrics != null) {
+      flagEvalMetrics.shutdown();
+    }
     if (evaluator != null) {
       evaluator.shutdown();
     }
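
The general idea of degrading to a no-op when an optional library is missing can be illustrated with a small, dependency-free sketch. Note that it uses a different and simpler technique than the commit: it probes for a class by name with `Class.forName`, whereas `Provider` catches `LinkageError` raised by direct references to OTel types. `OptionalResource` and the class names passed to it are hypothetical.

```java
import java.io.Closeable;

// Sketch of an optional-dependency guard. Names are hypothetical, and the
// explicit Class.forName probe is a substitute for the commit's approach of
// catching LinkageError around direct references.
final class OptionalResource {

  // Typed as a JDK interface so this class loads even when the optional
  // library, and therefore its own types, is absent.
  private final Closeable resource;

  OptionalResource(String requiredClassName) {
    Closeable r = null;
    try {
      Class.forName(requiredClassName);
      r = () -> {}; // stand-in for the real SDK object that would be created here
    } catch (ClassNotFoundException | LinkageError e) {
      // Optional dependency unavailable: leave the resource null (no-op mode).
    }
    this.resource = r;
  }

  boolean isEnabled() {
    return resource != null;
  }

  public static void main(String[] args) {
    System.out.println(new OptionalResource("java.lang.String").isEnabled());
    System.out.println(new OptionalResource("com.example.DoesNotExist").isEnabled());
  }
}
```

Callers then check for the disabled state before use, in the same spirit as the null checks in `getProviderHooks()` and `shutdown()`.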
