Headless server leaks KQUEUE file descriptors on macOS (~1.6 per tool call) #2707

@PureWeen

Problem

The copilot headless server (copilot --headless --port <port>) leaks KQUEUE file descriptors on macOS. Each tool call (bash, grep, etc.) leaks ~1.6 kqueue handles that are never released, eventually exhausting resources and breaking all tool execution with EBADF errors.

Standalone Repro

kqueue-leak-repro.zip

A minimal .NET console app reproduces the leak using only the GitHub.Copilot.SDK:

// 1. Start a copilot headless server:  copilot --headless --port 4321
// 2. Run this repro:                   dotnet run
// 3. Watch kqueue count climb:         lsof -p <server-pid> | grep KQUEUE | wc -l

using GitHub.Copilot.SDK;

var client = new CopilotClient(new CopilotClientOptions
{
    CliUrl = "http://localhost:4321",
    UseStdio = false,
    AutoStart = false,
});
await client.StartAsync();

var session = await client.CreateSessionAsync(new SessionConfig
{
    OnPermissionRequest = PermissionHandler.ApproveAll,
});

for (int i = 1; i <= 50; i++)
{
    var response = await session.SendAndWaitAsync(
        new MessageOptions { Prompt = $"Run this exact bash command and report the output: echo 'hello from round {i}'" },
        timeout: TimeSpan.FromSeconds(30));
    
    // After each round, check: lsof -p <server-pid> | grep KQUEUE | wc -l
    // Count grows monotonically and never decreases.
}
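
The per-round check described in the comments above can be scripted. The following is a minimal sketch and is not part of the attached repro: the function names `count_kqueue_from_lsof` and `count_kqueue` are made up for illustration, and it assumes the TYPE column is the fifth field of `lsof` output.

```shell
#!/bin/sh
# Hypothetical helpers; names are illustrative, not part of the SDK or CLI.

# Read `lsof -p <pid>` output on stdin and print how many rows have
# KQUEUE in the TYPE column (assumed to be field 5).
count_kqueue_from_lsof() {
    awk '$5 == "KQUEUE" { n++ } END { print n + 0 }'
}

# Print the current KQUEUE handle count for the process with the given PID.
count_kqueue() {
    lsof -p "$1" 2>/dev/null | count_kqueue_from_lsof
}
```

Calling `count_kqueue <server-pid>` once before the loop and once after each round gives the baseline and the per-round counts. Matching on the TYPE column rather than grepping the whole line avoids counting rows that merely mention KQUEUE elsewhere.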

Repro Results (50 tool calls)

Baseline: 112 kqueue handles

Round   1/50: kqueue=114 (+2 leaked)
Round  10/50: kqueue=127 (+15 leaked)
Round  25/50: kqueue=149 (+37 leaked)
Round  50/50: kqueue=191 (+79 leaked)

Baseline kqueue:  112
Final kqueue:     191
Leaked:           79
Leak rate:        ~1.6 per tool call

❌ KQUEUE LEAK CONFIRMED — handles grow monotonically and are never released.
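
The leak rate follows directly from the figures above: (191 - 112) / 50 = 1.58, which is reported as ~1.6 per tool call. A small sketch of that arithmetic, where `leak_rate` is an illustrative helper name and not something from the repro:

```shell
#!/bin/sh
# Hypothetical helper: average number of handles leaked per tool call.
# Usage: leak_rate BASELINE FINAL CALLS
leak_rate() {
    awk -v base="$1" -v final="$2" -v calls="$3" \
        'BEGIN { printf "%.2f\n", (final - base) / calls }'
}

leak_rate 112 191 50   # prints 1.58
```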

Passive Monitoring (real-world usage over 45 minutes)

13:48  total=30   kqueue=3     (baseline, fresh server)
13:58  total=43   kqueue=3
14:03  total=62   kqueue=15
14:18  total=98   kqueue=52
14:33  total=149  kqueue=103   (+100 kqueue in 45 min)
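
A log like the one above can be collected with a short sampling loop. This is a sketch under assumptions: `total` is taken to mean the number of `lsof` rows excluding the header, the helper names and the 300-second default interval are illustrative, and the TYPE column is assumed to be field 5. The `kill -0` liveness check only works if the caller is allowed to signal the server process.

```shell
#!/bin/sh
# Hypothetical helpers; names and output format are illustrative.

# Read `lsof -p <pid>` output on stdin and print "total=N kqueue=M".
# total counts every row after the header; kqueue counts rows whose
# TYPE column (assumed field 5) is KQUEUE.
fd_summary_from_lsof() {
    awk 'NR > 1 { total++; if ($5 == "KQUEUE") kq++ }
         END { printf "total=%d kqueue=%d\n", total, kq }'
}

# Print a timestamped summary for PID every INTERVAL seconds (default 300)
# until the process can no longer be signalled.
# Usage: monitor_fds PID [INTERVAL]
monitor_fds() {
    pid=$1
    interval=${2:-300}
    while kill -0 "$pid" 2>/dev/null; do
        printf '%s  %s\n' "$(date +%H:%M)" \
            "$(lsof -p "$pid" 2>/dev/null | fd_summary_from_lsof)"
        sleep "$interval"
    done
}
```

Running `monitor_fds <server-pid>` in a separate terminal while using the server normally produces one line per sample, which makes it easy to see whether the kqueue count stays flat while idle and climbs during tool use.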

Long-term Impact (9-day server uptime)

After 9 days of continuous operation, the server accumulated 9,594 leaked KQUEUE handles (10,321 total FDs):

$ lsof -p <pid> | awk '{print $5}' | sort | uniq -c | sort -rn | head -3
9594 KQUEUE
 436 unix
 242 CHR

All sessions lost bash/shell tool access simultaneously. Sessions reported EBADF or "Failed to start bash process". The only recovery was to kill and restart the headless server.

Analysis

The copilot headless server is a compiled Node.js binary. Node.js uses libuv, which creates kqueue event watchers on macOS for child process management. Each tool subprocess (bash, grep, etc.) gets a kqueue watcher that is never closed after the subprocess exits.

  • The leak is in the CLI server process, not in the SDK client
  • The SDK client connects via TCP in persistent/headless mode and does not spawn subprocesses
  • The leak is proportional to tool call volume, not time
  • Kqueue handles are never released during normal operation — only a server restart clears them

Environment

  • macOS ARM64
  • copilot CLI bundled with GitHub.Copilot.SDK 0.2.1
  • Tested on both a long-running server (9 days) and a fresh server (repro completes in minutes)
