Skip to content

test: disable Azure CLI telemetry in integration test env#363

Open
gtsiolis wants to merge 1 commit into
mainfrom
pro-362-windows-ci-flake-az-telemetry-uploader-breaks-ttempdir
Open

test: disable Azure CLI telemetry in integration test env#363
gtsiolis wants to merge 1 commit into
mainfrom
pro-362-windows-ci-flake-az-telemetry-uploader-breaks-ttempdir

Conversation

@gtsiolis

@gtsiolis gtsiolis commented Jul 3, 2026

Copy link
Copy Markdown
Member

Problem

main is red on the Windows integration-test job. TestAzStopInterceptionNoOpWhenNotIntercepting passes its assertions but then fails during t.TempDir() auto-cleanup:

testing.go:1464: TempDir RemoveAll cleanup: unlinkat
C:\Users\RUNNER~1\AppData\Local\Temp\TestAzStopInterceptionNoOpWhenNotIntercepting3716188435\001:
The process cannot access the file because it is being used by another process.

(Introduced by #357 — failing run: https://github.com/localstack/lstk/actions/runs/28601912948)

Root cause

The test runs real az commands (via lstk az stop-interception and a raw az cloud show). When Azure CLI telemetry is enabled, az spawns a detached background telemetry-upload process that outlives the az invocation and keeps an open handle on the test's temp working directory (t.TempDir(), dir \001). When Go's t.TempDir() cleanup runs RemoveAll while that uploader is still alive, Windows refuses to delete a directory another process holds open. Unix removes directories with open handles, which is why only the Windows job flakes.

The codebase already disables this telemetry (core.collect_telemetry=false) for the isolated Azure config dir in setLstkPrefs, but stop-interception deliberately targets the global ~/.azure and doesn't touch telemetry prefs — so under test it runs with telemetry on.

Fix

Default AZURE_CORE_COLLECT_TELEMETRY=false in every test environment (env.With / env.Without), mirroring the existing UnreachableAnalyticsEndpoint guard that already keeps tests off our own analytics backend. It propagates to every az invocation (through runLstkaz and runAzRaw), so no uploader spawns → no lingering handle → cleanup succeeds. Explicit per-test overrides still win (exec dedups to the last value).

Tests

Added two unit tests on the env helper, matching the existing analytics-endpoint pair:

  • TestBaseEnvDisablesAzureTelemetry — the default is set by both constructors.
  • TestExplicitAzureTelemetryOverridesDefault — an explicit value wins.

Verification note

The failure is Windows-only (RemoveAll with an open handle is a no-op on Unix), so it can't be reproduced on macOS/Linux dev machines or the Linux CI shards. The new unit tests guard the mechanism (telemetry disabled in the base env); the Windows CI job on this PR is the end-to-end confirmation.

Fixes PRO-362.

The Windows integration-test job fails in TestAzStopInterceptionNoOpWhenNotIntercepting during t.TempDir() auto-cleanup:

  testing.go:1464: TempDir RemoveAll cleanup: unlinkat ...\001:
  The process cannot access the file because it is being used by another process.

The test assertions pass; the failure is in cleanup. The test runs real az commands, and with telemetry enabled az spawns a detached background telemetry-upload process that outlives the invocation and keeps an open handle on the test's temp working dir. Windows refuses to RemoveAll a directory another process holds open (Unix allows it, so only the Windows job flakes).

Default AZURE_CORE_COLLECT_TELEMETRY=false in every test env (env.With/env.Without), mirroring the existing UnreachableAnalyticsEndpoint guard, so no uploader spawns and cleanup succeeds. Explicit per-test overrides still win.

Fixes PRO-362.
@gtsiolis gtsiolis requested a review from a team as a code owner July 3, 2026 14:57
@gtsiolis gtsiolis self-assigned this Jul 3, 2026
@gtsiolis gtsiolis added semver: patch docs: skip Pull request does not require documentation changes labels Jul 3, 2026

@anisaoshafi anisaoshafi left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great finding on AZURE_CORE_COLLECT_TELEMETRY, thanks for fixing the flaky windows test 🥇

Comment on lines +30 to +37
// DisableAzureTelemetry is the value used to turn the Azure CLI's telemetry off in
// every test environment. When telemetry is on, `az` spawns a detached background
// uploader process that outlives the `az` invocation and keeps a handle on its
// working directory / HOME — which for tests is a t.TempDir(). On Windows that open
// handle makes t.TempDir()'s RemoveAll cleanup fail ("being used by another process").
// Disabling it here prevents that flake and stops tests emitting to Azure's telemetry
// backend, mirroring UnreachableAnalyticsEndpoint for our own analytics.
const DisableAzureTelemetry = "false"

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you simplify this explanation?
And potentially remove this variable - I don't see why it's needed when we can use false directly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

docs: skip Pull request does not require documentation changes semver: patch

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants