feat(ci): monitors marker + three-way Better Stack heartbeat split #641

Draft
ari-nz wants to merge 1 commit into main from feat/monitors-marker-split

ari-nz (Collaborator) commented May 12, 2026

Summary

  • Replaces the parameterised `monitors("name")` marker with two boolean markers that pytest `-m` expressions can filter on directly:
    • `platform_api`: the test monitors the Platform API layer (auth, listing, connectivity)
    • `platform_applications`: the test monitors a platform application (he-tme, test-app)
    • Tests with neither marker are SDK-layer health checks
  • Splits the hourly scheduled workflow into three independent runs, each feeding its own Better Stack heartbeat:
    • SDK (`not platform_api and not platform_applications`): token management, service wiring
    • Platform API (`platform_api`): health check, application listing, run listing
    • Platform Applications (`platform_applications`): he-tme and test-app processing tests
  • Extracts heartbeat send logic into `.github/workflows/_betterstack_heartbeat.py` (stdlib urllib, no jq dependency)
  • Heartbeat steps never fail the CI job (errors are logged as warnings)
  • `if: always()` added to the final "Fail job if any tests failed" step
  • JUnit artifact upload uses `reports/junit_*.xml` wildcard (nox controls filenames)
  • New optional secrets: `BETTERSTACK_HEARTBEAT_URL_PLATFORM_API_{STAGING|PRODUCTION}`, `BETTERSTACK_HEARTBEAT_URL_PLATFORM_APPLICATIONS_{STAGING|PRODUCTION}`
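
The three marker expressions above can be read as simple predicates over a test's marker names. The following is a minimal sketch of that reading, not code from this PR; the `RUNS` labels and the `runs_for` helper are hypothetical names used only for illustration.

```python
"""Sketch: which scheduled run(s) select a test, given its marker names."""
from __future__ import annotations

# The three -m expressions from the summary, written as plain predicates.
# The dictionary keys are illustrative labels, not identifiers from the PR.
RUNS = {
    "sdk": lambda m: "platform_api" not in m and "platform_applications" not in m,
    "platform_api": lambda m: "platform_api" in m,
    "platform_applications": lambda m: "platform_applications" in m,
}


def runs_for(markers: set[str]) -> list[str]:
    """Return the label of every run whose expression matches `markers`."""
    return [name for name, selects in RUNS.items() if selects(markers)]


if __name__ == "__main__":
    print(runs_for({"scheduled"}))
    print(runs_for({"scheduled", "platform_api"}))
    print(runs_for({"scheduled", "platform_api", "platform_applications"}))
```

Under this reading, a test carrying both markers would be selected by both platform runs, so the split is only a strict partition if no test is tagged with both.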

Action required

Rename GitHub secret (repository settings → Secrets):

  • BETTERSTACK_HEARTBEAT_URL_HE_TME_STAGING → BETTERSTACK_HEARTBEAT_URL_PLATFORM_APPLICATIONS_STAGING
  • BETTERSTACK_HEARTBEAT_URL_HE_TME_PRODUCTION → BETTERSTACK_HEARTBEAT_URL_PLATFORM_APPLICATIONS_PRODUCTION

Until the secrets are renamed, the Platform Applications heartbeat is skipped: the step logs a warning but does not fail the job.
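
The skip-and-warn behaviour can be sketched with the standard library alone. This is an assumption-laden illustration, not the contents of `_betterstack_heartbeat.py`: the function name `send_heartbeat`, its parameters, and the injectable `opener` are hypothetical. The only detail taken from the workflow snippets in this PR is that the exit code is appended to the heartbeat URL and the metadata is posted as JSON.

```python
"""Sketch of a heartbeat sender that never fails the calling CI step."""
from __future__ import annotations

import json
import urllib.error
import urllib.request
from typing import Callable


def send_heartbeat(
    url: str | None,
    exit_code: int,
    metadata: dict,
    opener: Callable = urllib.request.urlopen,
    timeout: float = 10.0,
) -> bool:
    """POST `metadata` to `<url>/<exit_code>`; return True only on success."""
    if not url:
        # Missing secret: emit a GitHub Actions warning and skip.
        print("::warning::Heartbeat URL not set, skipping heartbeat")
        return False
    try:
        request = urllib.request.Request(
            f"{url.rstrip('/')}/{exit_code}",
            data=json.dumps(metadata).encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        with opener(request, timeout=timeout) as response:
            print(f"INFO: Sent heartbeat, HTTP status {response.status}")
            return True
    except (urllib.error.URLError, OSError, ValueError) as exc:
        # Malformed URLs, network errors and HTTP errors are reported, never raised.
        print(f"::warning::Heartbeat failed: {exc}")
        return False
```

Because the function returns a boolean instead of raising, a wrapper script can always exit with status 0, which keeps heartbeat problems from masking or replacing the test result.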

Test plan

  • `uv run pytest --collect-only -m "platform_api"` → 3 tests
  • `uv run pytest --collect-only -m "platform_applications"` → 6 tests
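  • `uv run pytest --collect-only -m "(scheduled or scheduled_only) and not platform_api and not platform_applications"` → SDK-only scheduled tests (written as a single `-m` expression, because pytest uses only the last `-m` option when several are passed)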
  • Better Stack heartbeat steps each skip gracefully when URL secrets are absent
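
The collection checks in the test plan could also be scripted. The sketch below is a hypothetical helper, not part of this PR; it assumes that `pytest --collect-only -q` prints one node ID per line and that every node ID contains `::`.

```python
"""Sketch: count the tests that a marker expression selects."""
from __future__ import annotations

import subprocess


def count_node_ids(collect_output: str) -> int:
    """Count node IDs (lines containing '::') in collect-only output."""
    return sum(1 for line in collect_output.splitlines() if "::" in line)


def collect_count(marker_expr: str) -> int:
    """Run pytest in collect-only mode for a single marker expression."""
    result = subprocess.run(
        ["uv", "run", "pytest", "--collect-only", "-q", "-m", marker_expr],
        capture_output=True,
        text=True,
        check=False,  # a non-zero exit (e.g. nothing collected) is not an error here
    )
    return count_node_ids(result.stdout)
```

With such a helper, the plan's expectations would read `collect_count("platform_api") == 3` and `collect_count("platform_applications") == 6`.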

Copilot AI review requested due to automatic review settings May 12, 2026 08:36
ari-nz added the skip:test:long_running label (Skip long-running tests, ≥5min) May 12, 2026

Copilot AI left a comment

Pull request overview

This PR introduces a monitors(...) pytest marker intended to tag scheduled E2E tests by the Better Stack monitor they should feed, and updates the hourly scheduled CI workflow to split test execution and emit separate heartbeats (SDK vs. application-specific).

Changes:

  • Added @pytest.mark.monitors("he-tme" | "test-app") to selected platform E2E scheduled tests.
  • Registered the new monitors marker in pyproject.toml pytest configuration.
  • Split the hourly scheduled workflow into multiple pytest invocations and added a dedicated Better Stack heartbeat URL secret for HE-TME (propagated via the staging/production wrapper workflows).

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| tests/aignostics/platform/e2e_test.py | Tags scheduled E2E tests with monitors(...) to support routing results/heartbeats by monitored system. |
| pyproject.toml | Registers the new pytest marker so --strict-markers runs remain valid. |
| .github/workflows/_scheduled-test-hourly.yml | Splits scheduled tests into separate runs and sends separate Better Stack heartbeats (SDK + HE-TME), plus combined status to Sentry. |
| .github/workflows/scheduled-testing-staging-hourly.yml | Passes through the new HE-TME Better Stack heartbeat secret to the reusable workflow. |
| .github/workflows/scheduled-testing-production-hourly.yml | Passes through the new HE-TME Better Stack heartbeat secret to the reusable workflow. |

Comment on lines +123 to +128
# set +e so a test failure does not abort the step — we capture the exit code
# manually and send it to Better Stack regardless of outcome.
set +e
make test_scheduled
EXIT_CODE=$?
XDIST_WORKER_FACTOR=1 uv run --all-extras nox -s test -- \
-m "(scheduled or scheduled_only) and monitors and not stress_only" \
--junit-xml=reports/junit_he_tme.xml
Comment on lines +117 to +129
- name: Test / scheduled / he-tme
id: test_he_tme
env:
BETTERSTACK_HEARTBEAT_URL: "${{ inputs.platform_environment == 'staging' && secrets.BETTERSTACK_HEARTBEAT_URL_STAGING || secrets.BETTERSTACK_HEARTBEAT_URL_PRODUCTION }}"
SENTRY_DSN: ${{ secrets.SENTRY_DSN }}
shell: bash
run: |
# set +e so a test failure does not abort the step — we capture the exit code
# manually and send it to Better Stack regardless of outcome.
set +e
make test_scheduled
EXIT_CODE=$?
XDIST_WORKER_FACTOR=1 uv run --all-extras nox -s test -- \
-m "(scheduled or scheduled_only) and monitors and not stress_only" \
--junit-xml=reports/junit_he_tme.xml
echo "exit_code=$?" >> $GITHUB_OUTPUT
Comment thread pyproject.toml Outdated
"unit: Solitary unit tests - test a layer of a module in isolation with all dependencies mocked, except interaction with shared utils and the systems module. Unit tests must be able to pass offline, i.e. not calls to external services. The timeout should not be bigger than the default 10s, and must be <5 min.",
"integration: Sociable integration tests - test interactions across architectural layers (e.g. CLI/GUI→Service, Service→Utils) or between modules (e.g. Application→Platform), using real SDK collaborators, real file I/O, real subprocesses, and real Docker containers. Integration test must be able to pass offline, i.e. mock external services (Aignostics Platform API, Auth0, S3/GCS buckets, IDC). The timeout should not be bigger than the default 10s, and must be <5 min.",
"e2e: End-to-end tests - test complete workflows with real external network services (Aignostics Platform API, cloud storage, IDC, etc). If the test timeout is >= 5 min and < 60 min, additionally mark as `long_running`, if >= 60min mark as 'very_long_running'.",
"monitors: Tag a scheduled test with the application it monitors, e.g. @pytest.mark.monitors('he-tme'). Tests without this marker are considered SDK-layer health checks. Used to route Better Stack heartbeats to the correct monitor.",
Comment on lines +288 to +289
reports/junit_sdk.xml
reports/junit_he_tme.xml
ari-nz changed the title from "feat(ci): split hourly heartbeat by concern and add monitors marker" to "feat(ci): monitors marker + three-way Better Stack heartbeat split" May 12, 2026

codecov Bot commented May 12, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ All tests successful. No failed tests found.
see 2 files with indirect coverage changes

ari-nz force-pushed the feat/monitors-marker-split branch from 129c9aa to 0683245 May 12, 2026 13:41
ari-nz removed the skip:test:long_running label (Skip long-running tests, ≥5min) May 19, 2026
ari-nz marked this pull request as draft May 19, 2026 11:07
Copilot AI review requested due to automatic review settings May 19, 2026 11:15

Copilot AI left a comment

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 9 comments.

Comment on lines +139 to +150
- name: Test / scheduled / he-tme
id: test_he_tme
env:
SENTRY_DSN: ${{ secrets.SENTRY_DSN }}
shell: bash
run: |
# set +e so a test failure does not abort the step — we capture the exit code
# manually and send it to Better Stack regardless of outcome.
set +e
XDIST_WORKER_FACTOR=1 uv run --all-extras nox -s test -- \
-m "(scheduled or scheduled_only) and monitors and not monitors_platform_api and not stress_only" \
--junit-xml=reports/junit_he_tme.xml
--request POST \
--header "Content-Type: application/json" \
--data-binary "${BETTERSTACK_METADATA_PAYLOAD}" \
"${BETTERSTACK_HEARTBEAT_URL}/${SDK_EXIT}"
Comment on lines +297 to +304
curl \
--fail-with-body \
--silent \
--request POST \
--header "Content-Type: application/json" \
--data-binary "${BETTERSTACK_METADATA_PAYLOAD}" \
"${BETTERSTACK_HEARTBEAT_URL_PLATFORM_API}/${PLATFORM_API_EXIT}"
echo "INFO: Sent Platform API heartbeat to BetterStack with exit code '${PLATFORM_API_EXIT}'"
Comment on lines +347 to +354
curl \
--fail-with-body \
--silent \
--request POST \
--header "Content-Type: application/json" \
--data-binary "${BETTERSTACK_METADATA_PAYLOAD}" \
"${BETTERSTACK_HEARTBEAT_URL_HE_TME}/${HE_TME_EXIT}"
echo "INFO: Sent HE-TME heartbeat to BetterStack with exit code '${HE_TME_EXIT}'"
retention-days: 7

- name: Fail job if any tests failed
shell: bash
Comment on lines +120 to +123
XDIST_WORKER_FACTOR=1 uv run --all-extras nox -s test -- \
-m "(scheduled or scheduled_only) and not monitors and not monitors_platform_api and not stress_only" \
--junit-xml=reports/junit_sdk.xml
echo "exit_code=$?" >> $GITHUB_OUTPUT
Comment on lines +134 to +137
XDIST_WORKER_FACTOR=1 uv run --all-extras nox -s test -- \
-m "(scheduled or scheduled_only) and monitors_platform_api and not stress_only" \
--junit-xml=reports/junit_platform_api.xml
echo "exit_code=$?" >> $GITHUB_OUTPUT
Comment on lines +148 to +151
XDIST_WORKER_FACTOR=1 uv run --all-extras nox -s test -- \
-m "(scheduled or scheduled_only) and monitors and not monitors_platform_api and not stress_only" \
--junit-xml=reports/junit_he_tme.xml
echo "exit_code=$?" >> $GITHUB_OUTPUT
Comment on lines +362 to +364
reports/junit_sdk.xml
reports/junit_platform_api.xml
reports/junit_he_tme.xml
ari-nz force-pushed the feat/monitors-marker-split branch from eaa4b76 to 15c7e5d May 19, 2026 14:18
Copilot AI review requested due to automatic review settings May 19, 2026 17:29
ari-nz force-pushed the feat/monitors-marker-split branch from 15c7e5d to 0fdfed4 May 19, 2026 17:29

Copilot AI left a comment

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.

Comment thread .github/CLAUDE.md
Comment thread .github/CLAUDE.md Outdated
ari-nz force-pushed the feat/monitors-marker-split branch from 0fdfed4 to 39f8b21 May 19, 2026 17:56
@sonarqubecloud

Quality Gate failed

Failed conditions
E Security Rating on New Code (required ≥ A)

See analysis details on SonarQube Cloud
