Pre-filing checklist
Component(s)
Rust OTAP dataflow (rust/otap-dataflow/)
Bug Description
The temporal reaggregation processor's accept_pdata implementation gates on having one inbound and two outbound slots available, which is the maximum required to accommodate a single incoming pdata.
However, this does not account for the additional outbound slot the processor needs when a flush signal comes in. As a result, the following situation can occur:
1. Accept pdata with exactly two outbound slots left
2. Allocate an outbound slot for non-aggregable data to be passed through
3. Overflow the existing aggregating batch (via stream count or ID overflow), thereby triggering a flush and using a second outbound slot
4. Place the incoming data in the pending buffer after flushing
5. Receive a timer tick/wakeup message while the outbound slots are full, which causes the processor to crash
When step (5) happens, we could choose to nack the data, but applying backpressure through the engine's queues is generally a better solution than putting ourselves in a situation where we have to nack.
The solution is to require three available outbound slots before accepting pdata.
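The slot accounting in the scenario above can be sketched as a small standalone model. This is only an illustration under stated assumptions: `OutboundSlots`, `Outcome`, and `worst_case` are hypothetical names invented for this sketch and are not the processor's actual types or API. The model only counts outbound slots and replays the worst-case sequence of steps.

```rust
/// Hypothetical stand-in for the processor's outbound slot accounting.
struct OutboundSlots {
    free: usize,
}

impl OutboundSlots {
    /// The gate checked before accepting pdata.
    fn can_accept(&self, required: usize) -> bool {
        self.free >= required
    }

    /// Tries to reserve one outbound slot; returns false if none is left.
    fn take(&mut self) -> bool {
        if self.free == 0 {
            return false;
        }
        self.free -= 1;
        true
    }
}

/// Possible results of replaying the worst-case sequence (hypothetical).
#[derive(Debug, PartialEq)]
enum Outcome {
    /// The gate refused the pdata, so it stays queued upstream.
    Backpressure,
    /// Every step, including the timer-driven flush, found a slot.
    Flushed,
    /// A step needed an outbound slot and none was available.
    OutOfSlots,
}

/// Replays steps 1 to 5 with `free` outbound slots and a gate of `required`.
fn worst_case(required: usize, free: usize) -> Outcome {
    let mut slots = OutboundSlots { free };

    // Step 1: the accept gate.
    if !slots.can_accept(required) {
        return Outcome::Backpressure;
    }
    // Step 2: pass-through of non-aggregable data.
    if !slots.take() {
        return Outcome::OutOfSlots;
    }
    // Step 3: flush triggered by overflowing the aggregating batch.
    if !slots.take() {
        return Outcome::OutOfSlots;
    }
    // Steps 4 and 5: the pending buffer is flushed on the timer tick.
    if !slots.take() {
        return Outcome::OutOfSlots;
    }
    Outcome::Flushed
}
```

In this model, `worst_case(2, 2)` ends in `Outcome::OutOfSlots`, which corresponds to the crash in step (5). With a gate of three, `worst_case(3, 2)` yields `Outcome::Backpressure` and `worst_case(3, 3)` yields `Outcome::Flushed`.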
Steps to Reproduce
Expected Behavior
Actual Behavior
OTel-Arrow Version
Environment
No response
Configuration
Log Output
Additional Context
No response