
feat(cloud-task): Durable run streaming through agent-proxy #2519

Draft
charlesvien wants to merge 1 commit into main from 06-06-durable_streaming

Conversation

charlesvien (Member) commented Jun 8, 2026

Problem

Moves event streaming to agent-proxy

Changes

  1. Route the event-ingest POST to the standalone agent-proxy via new POSTHOG_TASK_RUN_EVENT_INGEST_URL
  2. Resolve the read leg once via stream_token, then read from the proxy with a run-scoped Bearer token (falls back to Django)
  3. Add a stream-end sentinel as the authoritative end-of-stream signal
  4. Make the reconnect loop status-unaware: stop only on stream-end or budget exhaustion, never on polled status
  5. Speed up reconnect backoff (flat 500ms for 3 attempts, then exponential to 30s; ~1.5s recovery vs ~16s)
  6. Cover ingest routing, read-leg routing and stream-end in tests
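
For items 4 and 5 above, a minimal sketch of the backoff schedule and the stop condition might look like the following. The constant and function names are hypothetical, and the doubling rule after the flat phase is an assumption; the description only gives the flat 500ms phase, the 30s cap, and the two stop conditions.

```typescript
// Hypothetical constants mirroring the numbers in the PR description.
const FLAT_DELAY_MS = 500;
const FLAT_ATTEMPTS = 3;
const MAX_DELAY_MS = 30_000;

/** Delay before reconnect attempt `attempt` (1-based). */
function reconnectDelayMs(attempt: number): number {
  if (attempt <= FLAT_ATTEMPTS) return FLAT_DELAY_MS;
  // Assumption: doubling starts from the flat delay once the flat phase ends.
  const exponent = attempt - FLAT_ATTEMPTS;
  return Math.min(FLAT_DELAY_MS * 2 ** exponent, MAX_DELAY_MS);
}

interface ReconnectState {
  streamEnded: boolean; // set when the stream-end sentinel has been seen
  attempts: number; // reconnect attempts made so far
  maxAttempts: number; // reconnect budget (hypothetical name)
  polledStatus?: string; // deliberately ignored by the decision below
}

/** Status-unaware: stop only on stream-end or budget exhaustion. */
function shouldReconnect(state: ReconnectState): boolean {
  if (state.streamEnded) return false;
  return state.attempts < state.maxAttempts;
}
```

Under these assumptions the first three attempts add up to 1.5s of waiting, which matches the "~1.5s recovery" figure, and a polled status of any value does not stop the loop.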

How did you test this?

Manually

Automatic notifications

  • Publish to changelog?
  • Alert Sales and Marketing teams?

This stack of pull requests is managed by Graphite. Learn more about stacking.

charlesvien changed the title from "Implement agent proxy" to "feat(cloud-task): Durable run streaming through agent-proxy" on Jun 8, 2026

greptile-apps (Bot, Contributor) commented Jun 8, 2026

T-Rex Logs

What T-Rex did

  • Ran the existing cloud task service test file and all 24 tests passed.
  • Added a focused proxy stream edge-case test file with three tests for proxy URL encoding, stream_token fallback, and stream-end during bootstrap.
  • Verified that a proxy stream URL with URL-special characters in runId is parsed into the wrong path and query shape when the runId is not encoded.
  • Observed that the proxy stream URL test timed out due to the malformed URL not matching the mock URL pattern, confirming the URL handling issue.
  • Verified that stream-end during bootstrap deletes the watcher before historical logs finish loading, so the snapshot is not emitted.
  • Verified that transient stream_token resolution failures retry resolution on reconnect.
  • Verified that event-stream-sender.ts encodes runId with encodeURIComponent, preventing read-leg omissions.
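
The stream_token fallback and retry-on-reconnect behavior described in the list above can be sketched roughly as follows. Every name here (`StreamToken`, `ReadLegResolver`, `proxyBaseUrl`) is hypothetical, and the token fetch is shown as a synchronous callback purely to keep the example short; a real client would presumably be async. Only the `/v1/runs/{runId}/stream` path and the Bearer scheme come from the PR.

```typescript
// Hypothetical shape of a resolved stream token.
interface StreamToken {
  proxyBaseUrl: string;
  token: string;
}

interface ReadLeg {
  usingProxy: boolean;
  url: string;
  headers: Record<string, string>;
}

class ReadLegResolver {
  private cached: StreamToken | null = null;
  private readonly fetchToken: () => StreamToken | null;

  // `fetchToken` returns null when stream_token resolution fails.
  constructor(fetchToken: () => StreamToken | null) {
    this.fetchToken = fetchToken;
  }

  resolve(runId: string, djangoStreamUrl: string): ReadLeg {
    // A failure leaves the cache empty, so the next reconnect retries.
    if (this.cached === null) {
      this.cached = this.fetchToken();
    }
    if (this.cached === null) {
      // Fallback leg; the existing Django auth headers are omitted here.
      return { usingProxy: false, url: djangoStreamUrl, headers: {} };
    }
    return {
      usingProxy: true,
      url: `${this.cached.proxyBaseUrl}/v1/runs/${encodeURIComponent(runId)}/stream`,
      headers: { Authorization: `Bearer ${this.cached.token}` },
    };
  }
}
```

Caching only successful resolutions is one way to get both "resolve once" and "retry after a transient failure" from the same code path.
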
Artifacts

Focused proxy stream edge-case test output

  • Shows the edge-case test results for proxy stream behavior to inspect test outcomes.

URL parsing proof for unencoded proxy run IDs

  • Demonstrates how unencoded run IDs affect URL path and query shaping.

Generated focused proxy stream edge-case tests

  • Contains the TypeScript source for the focused proxy stream edge-case tests.

Vitest output: 1 failed (URL encoding bug), 2 passed, 1 unhandled rejection (bootstrap race)

  • Summarizes the test run results for the proxy edge-case tests, including a failure and a bootstrap race.

Node.js proof: proxy stream URL is malformed when runId contains special chars; ingest URL is correct

  • Demonstrates how a runId with special characters breaks the constructed URL while the ingestion URL remains correct.

Generated proxy-stream-edge-cases TypeScript tests

  • Contains the TypeScript tests used for the proxy-stream-edge-cases run.

Ran code and verified through T-Rex

Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
apps/code/src/main/services/cloud-task/service.ts:654-658
**Encode proxy run IDs**

The proxy stream path inserts `watcher.runId` directly into the URL. If a run ID contains URL-special characters such as `/`, `?`, or `&`, `new URL()` treats those characters as path or query delimiters and the request goes to the wrong proxy route. The ingest side already encodes `runId`, so the read side should do the same.

```suggestion
    const url = new URL(
      usingProxy
        ? `${base}/v1/runs/${encodeURIComponent(watcher.runId)}/stream`
        : `${base}/api/projects/${watcher.teamId}/tasks/${watcher.taskId}/runs/${watcher.runId}/stream/`,
    );
```

### Issue 2 of 2
apps/code/src/main/services/cloud-task/service.ts:1188-1191
**Preserve bootstrap snapshot**

When `stream-end` arrives while bootstrap is still fetching historical logs, this branch stops and deletes the watcher before the bootstrap path can emit its snapshot. The pending `bootstrapWatcher()` then sees that the watcher is gone and returns, so subscribers never receive the existing run history. The `streamEnded && isBootstrapping` case needs to defer stopping until after bootstrap finishes.

Reviews (1): Last reviewed commit: "Implement agent proxy"

Comment on lines 654 to 658
  const url = new URL(
-   `${watcher.apiHost}/api/projects/${watcher.teamId}/tasks/${watcher.taskId}/runs/${watcher.runId}/stream/`,
+   usingProxy
+     ? `${base}/v1/runs/${watcher.runId}/stream`
+     : `${base}/api/projects/${watcher.teamId}/tasks/${watcher.taskId}/runs/${watcher.runId}/stream/`,
  );

P1 Encode proxy run IDs

The proxy stream path inserts watcher.runId directly into the URL. If a run ID contains URL-special characters such as /, ?, or &, new URL() treats those characters as path or query delimiters and the request goes to the wrong proxy route. The ingest side already encodes runId, so the read side should do the same.

Suggested change
- const url = new URL(
- `${watcher.apiHost}/api/projects/${watcher.teamId}/tasks/${watcher.taskId}/runs/${watcher.runId}/stream/`,
- usingProxy
- ? `${base}/v1/runs/${watcher.runId}/stream`
- : `${base}/api/projects/${watcher.teamId}/tasks/${watcher.taskId}/runs/${watcher.runId}/stream/`,
- );
+ const url = new URL(
+   usingProxy
+     ? `${base}/v1/runs/${encodeURIComponent(watcher.runId)}/stream`
+     : `${base}/api/projects/${watcher.teamId}/tasks/${watcher.taskId}/runs/${watcher.runId}/stream/`,
+ );
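
To make the failure mode concrete, here is a small standalone check of how `new URL()` parses the proxy path with and without encoding. The host and the run ID are made up for illustration; real run IDs may never contain these characters.

```typescript
const base = "https://proxy.example"; // hypothetical proxy host
const runId = "a/b?c=1"; // hypothetical run ID with URL-special characters

// Unencoded: "/" splits the path and "?" starts the query string.
const unencoded = new URL(`${base}/v1/runs/${runId}/stream`);

// Encoded: the run ID stays a single path segment.
const encoded = new URL(`${base}/v1/runs/${encodeURIComponent(runId)}/stream`);

console.log(unencoded.pathname); // "/v1/runs/a/b"
console.log(unencoded.search); // "?c=1/stream"
console.log(encoded.pathname); // "/v1/runs/a%2Fb%3Fc%3D1/stream"
console.log(encoded.search); // ""
```

In the unencoded case the `/stream` suffix ends up inside the query string, so the request no longer targets the stream route at all.
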
Artifacts

Malformed proxy stream URL proof

  • Keeps the command output available without making the summary code-heavy.

Focused proxy stream edge-case test output

  • Keeps the command output available without making the summary code-heavy.

Ran code and verified through T-Rex

Comment on lines +1188 to +1191
  if (watcher.streamEnded) {
    this.emitStatusUpdate(watcher);
    this.stopWatcher(key);
    return;

P1 Preserve bootstrap snapshot

When stream-end arrives while bootstrap is still fetching historical logs, this branch stops and deletes the watcher before the bootstrap path can emit its snapshot. The pending bootstrapWatcher() then sees that the watcher is gone and returns, so subscribers never receive the existing run history. The streamEnded && isBootstrapping case needs to defer stopping until after bootstrap finishes.
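
One possible shape for the deferral, as a simplified model and not the actual service code: record that stream-end arrived, and let whichever of the two events happens last do the teardown. `WatcherRegistry`, `onStreamEnd`, `onBootstrapDone`, and the `emitted` log are hypothetical stand-ins; only `streamEnded` and `isBootstrapping` are taken from the comment above.

```typescript
interface Watcher {
  isBootstrapping: boolean;
  streamEnded: boolean;
}

class WatcherRegistry {
  private watchers = new Map<string, Watcher>();
  // Records emissions in order so the behavior can be inspected.
  readonly emitted: string[] = [];

  start(key: string): void {
    this.watchers.set(key, { isBootstrapping: true, streamEnded: false });
  }

  onStreamEnd(key: string): void {
    const watcher = this.watchers.get(key);
    if (!watcher) return;
    watcher.streamEnded = true;
    // Defer teardown while bootstrap is still loading history.
    if (watcher.isBootstrapping) return;
    this.finish(key);
  }

  onBootstrapDone(key: string): void {
    const watcher = this.watchers.get(key);
    if (!watcher) return;
    watcher.isBootstrapping = false;
    this.emitted.push(`snapshot:${key}`);
    // Stream-end arrived earlier, so stop now that the snapshot is out.
    if (watcher.streamEnded) this.finish(key);
  }

  has(key: string): boolean {
    return this.watchers.has(key);
  }

  private finish(key: string): void {
    this.emitted.push(`status:${key}`);
    this.watchers.delete(key);
  }
}
```

With this ordering, a stream-end that lands mid-bootstrap still results in the snapshot being emitted before the final status update and watcher removal.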

Artifacts

Focused stream-end during bootstrap repro output

  • Keeps the command output available without making the summary code-heavy.

Ran code and verified through T-Rex
