MS legal/monetary liability for rate limit behavior

### Describe the bug

As currently implemented, the rate limit system does two significant things:

1. Causes rate-limit events without user action; it is trivial for the software and service to rate-limit itself. /fleet, background agents, or even an individual agent running multiple commands  too quickly,
2. Effectively obliterates work the user has paid for where a sub-agent was doing non-trivial or highly destructive work,

When a non-trivial sub-agent encounters a rate limit, the software appears to simply fail the agent/context, causing the *parent* agent to kick in, and often fail too.

In one case, a sub-agent had worked for nearly 2 hours on a single large file only to corrupt it by making a one-line change with no newline, then decided to rewrite the file. It tried to do it with a here-document command but messed up the escaping, so then it tried to do it again using the write-file tool but checking on the file after every few lines.

Right about here I asked the idling parent agent how long it would take. It told me the sub-agent was preparing the final commit, and that's when we hit the rate limit. I waited the minute it suggested and asked the agent to resume, but the background task was gone, and the parent agent couldn't tell what the status was. So it began doing its own evaluation of where the worker had gotten to, spending nearly 30 minutes re-reading code, logs, etc, and hitting two compaction events.

At that point my only option was a rewind, but that undid everything the sub-agent had spent 2 hours working on, there was no checkpoint at/near the rate limit event.

While I recognize that MS Legal will wish to argue that users are *solely* paying for the generation of random tokens, your consolidation of all things AI into the Copilot *branding* means that ship has sailed, sunk, been discovered by Bob Ballard, looted, raised, sank on the way back to dock, and the Mk II is under construction by an Australian billionaire.

I can't even tell for sure if the agents *see* the rate limit message. In previous versions it seemed like they did, but the wording ([emphasis mine] "Sorry, you've hit a rate limit that restricts the number of Copilot model requests you can make within a specific time period. Please try again in 1 minute. _Please review our Terms of Service_ (...)") actively seems to make GPT think the user is doing something nefarious.

### Affected version

GitHub Copilot CLI 1.0.25.

### Steps to reproduce the behavior

Use /fleet,
Give agent a task that may run multiple parallel sub-agents,
Give agent a task where the model predicts many micro-operations instead of individual larger ones (GPT 5.4 reading 200 small files individually with shell commands rather than using the read-file tool),
Agent triggers multiple sub-agents in parallel rather than serially,



### Expected behavior

- Internal self-rate limiting to avoid hitting the limits in the first place,
- Automatic back-off/retry in the client with user-facing surfacing of the issue,
- sub-agent resume/checkpoint,
- 

### Additional context

_No response_

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

MS legal/monetary liability for rate limit behavior #2712

Describe the bug

Affected version

Steps to reproduce the behavior

Expected behavior

Additional context

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

MS legal/monetary liability for rate limit behavior #2712

Description

Describe the bug

Affected version

Steps to reproduce the behavior

Expected behavior

Additional context

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions