Stabilize Linux KVM CI on shared runners #284
Open
hiroTamada wants to merge 3 commits into
Limit host-level contention in CI and clean up VM helpers that survive timed-out tests, so Firecracker/QEMU integration runs do not leave pressure on deft-kernel-dev. Co-authored-by: Cursor <cursoragent@cursor.com>
Keep the run-scoped cleanup root short enough for Firecracker and Cloud Hypervisor Unix socket path limits. Co-authored-by: Cursor <cursoragent@cursor.com>
Keep the test temp root short for VMM socket limits while placing it under /tmp so the runner can create it. Co-authored-by: Cursor <cursoragent@cursor.com>
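The socket-path constraint in the two commits above can be illustrated with a small check. This is a minimal sketch, not code from this PR: it assumes the Linux `sockaddr_un.sun_path` size of 108 bytes (107 usable plus a trailing NUL), and the helper names and the per-test suffix `guests/abc123/fc.sock` are hypothetical.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// maxSunPath is the size of sockaddr_un.sun_path on Linux,
// including the trailing NUL byte.
const maxSunPath = 108

// socketPathFits reports whether path can be bound as a Unix socket
// address on Linux without exceeding sun_path.
func socketPathFits(path string) bool {
	return len(path) < maxSunPath
}

// headroom returns how many bytes remain under root for per-test
// subdirectories and the socket file name, after one path separator.
func headroom(root string) int {
	return maxSunPath - 1 - len(root) - 1
}

func main() {
	// Hypothetical layout: the suffix below is an illustration only.
	root := "/tmp/hci1"
	sock := filepath.Join(root, "guests/abc123/fc.sock")
	fmt.Println("socket path:", sock)
	fmt.Println("fits:", socketPathFits(sock))
	fmt.Println("bytes left under root:", headroom(root))
}
```

A short root such as `/tmp/hci{run_attempt}` leaves most of the budget for nested per-test directories, which is the motivation the commit messages give.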
Summary
Test plan
go test ./lib/instances -run TestWaitForProcessExit -count=1
.github/workflows/test.yml locally to verify the new concurrency and temp-dir env blocks.
27217664557 / job 80411086183; it did not include these changes and failed on TestFCUFFDOneShotLifecycle exec-agent readiness timeout.
Made with Cursor
Note
Low Risk
Changes are limited to CI workflow, Makefile test invocation, and integration-test cleanup/timeouts; no production auth or API behavior changes.
Overview
Serializes Linux KVM CI on self-hosted runners via a shared linux-kvm-ci-test concurrency group (cancel-in-progress: false) so overlapping integration suites do not pile onto one host.
CI test isolation uses run-scoped TMPDIR (/tmp/hci{run_attempt}), kills orphaned firecracker/cloud-hypervisor/hypeman-uffd-pager processes tied to that path before and after tests, and runs make test with GO_TEST_PARALLELISM=4. The Makefile wires optional -parallel, and forwards TMPDIR/HYPEMAN_TEST_NETWORK_TMPDIR into the sudo go test environment.
Test harness hardening: integration cleanup now scans /proc for hypervisor helpers whose cmdline references the test data dir (used from manager and QEMU setups). Firecracker requireRunningSleepInstance readiness polling is extended from 30s to 90s under load.
Reviewed by Cursor Bugbot for commit 483a064. Bugbot is set up for automated code reviews on this repo.