Skip to content

fix(608): anchor pop-on->roll-up transition start time to first character, not CR#2290

Open
x15sr71 wants to merge 1 commit into
CCExtractor:masterfrom
x15sr71:fix/608-rollup-transition-start-time
Open

fix(608): anchor pop-on->roll-up transition start time to first character, not CR#2290
x15sr71 wants to merge 1 commit into
CCExtractor:masterfrom
x15sr71:fix/608-rollup-transition-start-time

Conversation

@x15sr71

@x15sr71 x15sr71 commented Jul 2, 2026

Copy link
Copy Markdown
Contributor

In raising this pull request, I confirm the following (please check boxes):

Reason for this PR:

  • This PR adds new functionality.
  • This PR fixes a bug that I have personally experienced or that a real user has reported and for which a sample exists.
  • This PR is porting code from C to Rust.

Sanity check:

  • I have read and understood the contributors guide.
  • I have checked that another pull request for this purpose does not exist.
  • If the PR adds new functionality, I've added it to the changelog. If it's just a bug fix, I have NOT added it to the changelog.
  • I am NOT adding new C code unless it's to fix an existing, reproducible bug.

Repro instructions:

This PR fixes Regression Test 84 which is currently failing on both linux and windows. Reference Test Run - Windows, Linux.


Problem

Regression test 84 (sample 79, f23a544b..., --out=srt --latin1) fails exact SHA-256 comparison against reference e5c9d9ec... Confirmed via CI history (sample-platform DB) failing continuously since test_id 6698 (Dec 2025, shortly after 54df50f merged), including at the v0.96.6 release build (test 8446/8447).

8 caption entries in an 11,000-line file show incorrect start times when a 608 stream transitions from pop-on to roll-up mode mid-broadcast:

Entry Reference (correct) Current output
20 00:01:06,499 00:01:09,102
2164 00:19:03,299 00:19:05,768
(6 more, same pattern)

End times match exactly in all 8 cases; only start times diverge, by 2-4.5s.

Root cause

ccx_decoders_608.c, COM_CARRIAGERETURN handler. Introduced by 54df50f (2025-12-13, "preserve CR time during pop-on to roll-up transition"):

if (context->rollup_from_popon && !changes)
    context->ts_start_of_current_line = get_fts(...);

When a stream transitions pop-on -> roll-up, CRs with changes==0 (roll-up window not yet full) fire repeatedly while lines accumulate. This line unconditionally reassigns on every such CR, not just the first. Confirmed
via --608 debug trace on sample 79: CR#1/CR#2/CR#3 each overwrite the value; the eventual changes==1 CR (which emits the cue) inherits CR#3's time instead of the transition's actual start.

Fixing only the overwrite closed ~98% of the gap (2.6s -> 66ms residual, uniform across all 8 entries, ~2 field-durations at 29.97fps). Runtime instrumentation (fts logged at CR, PAC, mid-row text-attribute, and first character) showed the reference anchors to the fts of the first character written after the transition, not the CR:

entry 20: CR fts=66433, first-char fts=66499, reference=66499 (exact)
(7 more entries, 0ms deviation each)

This matches pre-existing behavior in write_char(): if (ts_start_of_current_line == -1) ts_start_of_current_line = get_fts(...) — confirmed via git log -S, this guard is unmodified since commit 617d2d3 (2014-10-07), 11+ years before 54df50f. The fix removes 54df50f's override on the changes==0 path and lets that guard govern again.

Cross-check against 54df50f's own stated verification: its PR (#1808) reports, for sample 725a49f871 (also used below): "Before fix: 1,501ms... After Fix 3: 2,118ms (133ms late)... After Fix 4: 1,985ms (0ms offset, shipped)" measured against ffmpeg. Our data for this same file: the sample-platform's actual stored reference is 2,118ms for this caption —
matching "after Fix 3," not the shipped "after Fix 4" value. This suggests Fix 3 alone was already correct here, and Fix 4's ffmpeg-based verification was itself mismeasured for this transition (consistent with ffmpeg's eia608 decoder failing outright on sample 79 — not a reliable oracle forthis corpus).

Fix

Two changes, both required:

  1. In the COM_CARRIAGERETURN handler: don't touch ts_start_of_current_line on changes==0 CRs during a transition. If still -1, the first subsequent write_char() call sets it (per the 2014 guard above). If already set, preserve it.

  2. In write_cc_line()'s caller (CCX_OF_TRANSCRIPT output path): reset ts_start_of_current_line = -1 immediately after each call. Transcript mode treats every CR as a new line boundary and expects this reset; without it, fix (1) causes a value to leak into the next line's start time. Found via regression test 98 (sample 725a49f871...mpg): applying fix (1) alone regressed one line (133ms -> 617ms error vs reference); adding this reset restored an exact match.

Verification

  • sha256sum of full ccextractor output on sample 79 matches reference e5c9d9ec... exactly. Diff before fix: 8 lines. After: 0, full file.
  • Regression test 98 (725a49f871.mpg) and 156 (c83f765c.ts), 54df50f's own motivating samples: neither currently hash-matches its stored reference, and this predates this PR (confirmed via CI history, both platforms). This PR does not regress either: test 98's affected line is restored to exact match; test 156's largest timing error improves from 1301ms to 67ms. Remaining deltas in both (an XDS-path timing offset, a # vs music-note character substitution from 300f8ca, and an unexplained residual ~67ms/2-field offset) are unrelated to this fix and out of scope for this PR.
  • Not a fix for the WTV 751ms global-offset limitation noted in fix(timing): correct caption start/end times to match video frame PTS #1808 — that's a uniform epoch mismatch across a whole file; this PR's target file hash-matches its reference in full, ruling out a uniform offset here. Different mechanism, likely addressed separately by 300f8ca.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant