[SYSTEMDS-2651] Replace fixed-sleep federated worker startup with Poll #2468

Open
Baunsgaard wants to merge 1 commit into apache:main from Baunsgaard:FederatedWorkerReadyPolling

@Baunsgaard Baunsgaard commented May 15, 2026

WIP

This PR replaces the fixed Thread.sleep with poll-based startup of federated workers in tests. The change helps our test suites avoid timeouts and failures caused by inconsistent launches of federated workers.

@github-project-automation github-project-automation Bot moved this to In Progress in SystemDS PR Queue May 15, 2026
@Baunsgaard Baunsgaard changed the title [SYSTEMDS-2651][] Replace fixed-sleep federated worker startup with Poll [SYSTEMDS-2651] Replace fixed-sleep federated worker startup with Poll May 15, 2026
…rtup

Replace fixed Thread.sleep after each federated worker start with TCP
port polling that returns as soon as the worker accepts a connection.
Add bulk helpers that spawn N workers in parallel and wait once for the
slowest to become ready, instead of summing per-worker waits.

Cuts the federated CI total by ~7 min (-5%) vs main, with the biggest
wins in setup-heavy suites such as transform+fedplanner (-66%) and
codegen (-25%).

Closes apache#2468.
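
For readers unfamiliar with the pattern, the commit message above describes two ideas: polling a TCP port until the worker accepts a connection, and waiting for several workers concurrently so the total wait is bounded by the slowest one. The following is a minimal sketch of both using only the JDK. The class name `PortPolling`, the method names `waitForPort` and `waitForAll`, and the 200 ms connect timeout and 50 ms back-off are assumptions chosen for illustration; they are not taken from this PR's diff and may differ from the actual SystemDS test utilities.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical helper for illustration only; not the PR's actual API.
final class PortPolling {

    // Polls host:port until a TCP connect succeeds or timeoutMs elapses.
    // Returns true as soon as the port accepts a connection.
    static boolean waitForPort(String host, int port, long timeoutMs)
            throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (System.nanoTime() - deadline < 0) {
            try (Socket s = new Socket()) {
                // Assumed 200 ms connect timeout per attempt.
                s.connect(new InetSocketAddress(host, port), 200);
                return true;
            } catch (IOException e) {
                // Not listening yet: back off briefly (assumed 50 ms) and retry.
                Thread.sleep(50);
            }
        }
        return false;
    }

    // Waits for all ports concurrently, so the total wait is roughly that of
    // the slowest port rather than the sum of the individual waits.
    static boolean waitForAll(String host, int[] ports, long timeoutMs)
            throws InterruptedException, ExecutionException {
        if (ports.length == 0) {
            return true;
        }
        ExecutorService pool = Executors.newFixedThreadPool(ports.length);
        try {
            List<Future<Boolean>> pending = new ArrayList<>();
            for (int port : ports) {
                pending.add(pool.submit(() -> waitForPort(host, port, timeoutMs)));
            }
            boolean allReady = true;
            for (Future<Boolean> f : pending) {
                allReady &= f.get();
            }
            return allReady;
        } finally {
            pool.shutdownNow();
        }
    }
}
```

A test would start its worker processes first and then call something like `PortPolling.waitForAll("127.0.0.1", ports, 10_000)`, failing fast if it returns false. Note that a successful TCP connect only shows the port is accepting connections; whether that is a sufficient readiness signal for a given worker is a separate question.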
@Baunsgaard Baunsgaard force-pushed the FederatedWorkerReadyPolling branch from 8804921 to 0c830d4 Compare May 18, 2026 16:01
codecov Bot commented May 18, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 71.38%. Comparing base (3f7b17b) to head (0c830d4).
⚠️ Report is 49 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #2468      +/-   ##
============================================
- Coverage     71.55%   71.38%   -0.17%     
- Complexity    47461    48707    +1246     
============================================
  Files          1539     1570      +31     
  Lines        182631   188757    +6126     
  Branches      35919    37039    +1120     
============================================
+ Hits         130677   134744    +4067     
- Misses        41944    43571    +1627     
- Partials      10010    10442     +432     

