Add postponed CVE issues reprocessing research #235
Document mechanism for periodically reprocessing postponed CVE issues in Ymir. Includes 3-phase implementation approach: basic sweeps, backoff optimization with Redis, and follow-up improvements.
lbarcziova left a comment
Thanks for the comprehensive research, looking forward to starting work on this!
| - `ymir_postponed_no_patch` — no upstream patch found yet
| - `ymir_postponed_pr_pending` — patch identified but not yet merged
I'm wondering if we should treat these the same, in case the pending PR gets closed and another approach is chosen.
agreed, sounds like we should do a full triage once we are unblocked
I agree the unblock action will be the same. However, I would keep them separate: `ymir_postponed_no_patch` requires a full re-triage every time we check it, so I would re-run it less frequently.
`ymir_postponed_pr_pending`, by contrast, only needs a GitHub/GitLab check to verify whether the PR (which the triage agent will link in the issue) has been approved, so we can run it more frequently. If the PR has been merged, we can invoke the triage agent again.
And yes, I agree that if the PR is closed we should consider falling back to triage.
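To make the distinction above concrete, here is a minimal sketch of how a sweep might decide what to do with each postponed category. The `PRState` and `Action` enums and the `next_action` function are hypothetical names used for illustration only, not part of Ymir. The sketch assumes that a merged or closed PR both lead to re-triage, as discussed in this thread.

```python
from enum import Enum
from typing import Optional


class PRState(Enum):
    """Hypothetical PR states as reported by a GitHub/GitLab check."""
    OPEN = "open"
    MERGED = "merged"
    CLOSED = "closed"


class Action(Enum):
    """Hypothetical outcomes of a single sweep over one issue."""
    KEEP_POSTPONED = "keep_postponed"
    RETRIAGE = "retriage"


def next_action(label: str, pr_state: Optional[PRState] = None) -> Action:
    """Decide what a sweep should do with one postponed issue."""
    if label == "ymir_postponed_no_patch":
        # No cheap programmatic check exists: every visit is a full re-triage,
        # which is why this category would be swept less frequently.
        return Action.RETRIAGE
    if label == "ymir_postponed_pr_pending":
        if pr_state is None:
            raise ValueError("pr_pending issues need the linked PR's state")
        if pr_state is PRState.OPEN:
            # Cheap check, still blocked: safe to repeat frequently.
            return Action.KEEP_POSTPONED
        # Merged: unblocked, so re-triage. Closed: fall back to triage from
        # scratch, because another approach may have been chosen.
        return Action.RETRIAGE
    raise ValueError(f"unknown postponed label: {label}")
```

With this shape, the two categories share the same unblock action but can still run on separate schedules, since only the `pr_pending` path has a cheap pre-check.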
| ## Sweep Types and Timing
| ### 1. Dependency CVE Sweep
For this one, we actually added a check for whether the new build is present in the compose. I think we should also run that check now (if fixed in build is set); could you please reconsider it with that in mind? It should just be a different programmatic check.
| ## Backoff Optimization with Redis (Optional - Phase 2)
| **Why backoff is needed:** Without backoff, all postponed issues are checked on every sweep, even if recently verified as still blocked. This wastes API calls and agent tokens, especially for issues blocked for weeks.
I'm not fully convinced the backoff is necessary or wanted here; with most of these there is not much predictability about when we get unblocked. For the expensive case (rerunning the triage agent), a longer fixed interval would be simpler. But open to discussion. I would definitely start with phase 1 only, as you propose!
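For comparison, the two scheduling options under discussion can be sketched as follows. The function names, the 7-day fixed interval, and the 1-day base and 14-day cap for backoff are illustrative assumptions, not values from the research document.

```python
from datetime import datetime, timedelta


def fixed_next_check(last_check: datetime,
                     interval: timedelta = timedelta(days=7)) -> datetime:
    """Fixed interval: the next check never depends on past attempts."""
    return last_check + interval


def backoff_next_check(last_check: datetime,
                       attempts: int,
                       base: timedelta = timedelta(days=1),
                       cap: timedelta = timedelta(days=14)) -> datetime:
    """Exponential backoff: the delay doubles per failed attempt, up to a cap."""
    delay = min(base, cap)
    for _ in range(attempts):
        delay = min(delay * 2, cap)
        if delay == cap:
            break  # further doubling cannot change the result
    return last_check + delay
```

The fixed variant needs no per-issue state, whereas the backoff variant has to persist an attempt count somewhere (Redis, in the Phase 2 proposal), which is the extra complexity being weighed here.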
| - Time-to-unblock (how long issues stay postponed)
| - Backoff behavior (attempt count distribution)
| ### Redis Metrics (Optional - Phase 2 only)
Really love this part: we'll be able to measure these issues and act based on real data.
- Add note that PR pending issues should re-triage from scratch if PR gets closed
- Update Y-stream sweep to check build in buildroot using check_build_in_buildroot()
- Clarify all postponed categories trigger full re-triage when unblocked (not just push to queue)
- Acknowledge backoff is optional, fixed intervals may be simpler given unpredictability

Assisted-by: Claude Sonnet 4.5 (1M context) <noreply@anthropic.com>
for more information, see https://pre-commit.ci