Skip to content

Support multi-layer algorithms that receive a data product from an unfold#571

Open
knoepfel wants to merge 4 commits into
Framework-R-D:mainfrom
knoepfel:multi-layer-failure
Open

Support multi-layer algorithms that receive a data product from an unfold#571
knoepfel wants to merge 4 commits into
Framework-R-D:mainfrom
knoepfel:multi-layer-failure

Conversation

@knoepfel
Copy link
Copy Markdown
Member

@knoepfel knoepfel commented May 6, 2026

Summary

This PR fixes a bug where transforms operating on a hierarchy layer downstream of an unfold could not receive data products produced by that unfold. The root cause was that the index_router had no mechanism to learn — from the unfold itself — how many children it had generated, so it could not emit the correct flush token at the right time.

Solution

The fix introduces two new components and refactors the flush/routing pipeline.

New: data_cell_tracker (phlex/model/)

Tracks the sequence of incoming data_cell_index values from the driver and determines which flush tokens are ready to be emitted when a new index arrives or the job ends. This logic was previously implicit in the source/index-router interaction.

New: flush_counter (phlex/core/)

Per-index counter that tracks how many children have been processed across each child layer. Once the expected count (received from the unfold's flush result) is satisfied, flush_counter fires a callback to emit the flush token for that index.

declared_unfold

Now uses a third TBB output port (index_flush) carrying the child counts alongside the existing message and index-message ports.

index_router

Gains a flush_receiver input and an establish_layers() initializer that records which layers are produced by unfolds vs. consumed as inputs. route() now accepts the closeout flushes from data_cell_tracker rather than computing them internally, and drain() likewise receives the remaining flushes at job end.

framework_graph

  • Introduces a index_receiver_ node that decouples closeout flush emission from the source node.
  • Instantiates a data_cell_tracker (cell_tracker_) and feeds its output into index_router_.route().
  • Calls index_router_.establish_layers() before finalize(), using layer metadata gathered from the declared unfolds.
  • Connects each unfold's flush_sender() to index_router_.flush_receiver().

edge_maker

Refactored to return (provider_input_ports, multilayer_join_index_ports) instead of directly calling index_router_.finalize(), allowing framework_graph to call establish_layers() first.

Tests

  • test/data_cell_tracker_test.cpp — unit tests for data_cell_tracker in isolation.
  • test/flush_counter_propagation_test.cpp — unit-level tests for flush counter propagation using a standalone index_counter node, verifying that committed child counts accumulate correctly through nested unfolds without involving the full framework_graph machinery.
  • test/fold_duplicate_layer_name_test.cpp — re-enables the previously disabled different_hierarchies test, now covering the case where two layers share the same name in a fold scenario.
  • test/unfold.cpp — additional unfold scenarios exercising multi-layer product consumption.

Files changed

Area Files
New model phlex/model/data_cell_tracker.hpp, phlex/model/data_cell_tracker.cpp
New core phlex/core/flush_counter.hpp, phlex/core/flush_counter.cpp
Modified core phlex/core/index_router.hpp, phlex/core/index_router.cpp, phlex/core/framework_graph.hpp, phlex/core/framework_graph.cpp, phlex/core/declared_unfold.hpp, phlex/core/edge_maker.hpp, phlex/model/fixed_hierarchy.hpp, phlex/model/fixed_hierarchy.cpp
New tests test/data_cell_tracker_test.cpp, test/flush_counter_propagation_test.cpp, test/fold_duplicate_layer_name_test.cpp
Modified tests test/unfold.cpp

Summary largely courtesy of Claude Sonnet 4.6

Resolves #359

@knoepfel knoepfel force-pushed the multi-layer-failure branch from 381d7c1 to fa25cb1 Compare May 6, 2026 21:48
@codecov
Copy link
Copy Markdown

codecov Bot commented May 6, 2026

Codecov Report

❌ Patch coverage is 94.11765% with 18 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
phlex/core/index_router.cpp 92.71% 6 Missing and 5 partials ⚠️
phlex/model/data_cell_tracker.cpp 91.11% 0 Missing and 4 partials ⚠️
phlex/model/child_tracker.cpp 93.33% 1 Missing and 1 partial ⚠️
phlex/core/framework_graph.cpp 97.87% 0 Missing and 1 partial ⚠️
@@            Coverage Diff             @@
##             main     #571      +/-   ##
==========================================
- Coverage   85.77%   82.54%   -3.24%     
==========================================
  Files         156      161       +5     
  Lines        5372     5890     +518     
  Branches      646      686      +40     
==========================================
+ Hits         4608     4862     +254     
- Misses        554      810     +256     
- Partials      210      218       +8     
Flag Coverage Δ
scripts 76.12% <ø> (-10.26%) ⬇️
unittests 82.54% <94.11%> (-3.24%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
phlex/core/declared_unfold.cpp 100.00% <100.00%> (+16.66%) ⬆️
phlex/core/declared_unfold.hpp 96.36% <100.00%> (-0.07%) ⬇️
phlex/core/edge_maker.hpp 100.00% <100.00%> (ø)
phlex/core/framework_graph.hpp 100.00% <ø> (ø)
phlex/core/index_router.hpp 100.00% <100.00%> (ø)
phlex/model/child_tracker.hpp 100.00% <100.00%> (ø)
phlex/model/data_cell_tracker.hpp 100.00% <100.00%> (ø)
phlex/model/fixed_hierarchy.cpp 100.00% <100.00%> (ø)
phlex/model/fixed_hierarchy.hpp 100.00% <100.00%> (ø)
phlex/core/framework_graph.cpp 91.96% <97.87%> (+0.21%) ⬆️
... and 3 more

... and 6 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 66a32e3...f4599e5. Read the comment docs.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@knoepfel knoepfel force-pushed the multi-layer-failure branch 11 times, most recently from 2045537 to eb2cf70 Compare May 12, 2026 20:40
@knoepfel knoepfel force-pushed the multi-layer-failure branch from eb2cf70 to f4599e5 Compare May 12, 2026 20:49
@Framework-R-D Framework-R-D deleted a comment from greenc-FNAL May 12, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Change fold-result caching to use multilayer_join_node

1 participant