fix: support gemini transcript parsing in summarize hook #1804

Closed
themactep wants to merge 2 commits into thedotmack:main from themactep:fix/gemini-transcript-parsing

Conversation

@themactep

This PR adds support for Gemini transcript parsing in the summarize hook: it detects the platform source and handles both JSON and JSONL transcript formats with cross-platform role mappings.

- detect platform source from hook metadata and include platform fields in summarize/complete requests
- parse both JSON and JSONL transcript formats with cross-platform role mappings
- rebuild generated plugin artifacts

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@coderabbitai
Contributor

coderabbitai bot commented Apr 14, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: df9d720c-070f-49f3-83cf-ea2b18acfd13

📥 Commits

Reviewing files that changed from the base of the PR and between 189a19d and 0816fad.

📒 Files selected for processing (2)
  • plugin/scripts/worker-service.cjs
  • src/shared/transcript-parser.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/shared/transcript-parser.ts

Summary by CodeRabbit

  • New Features

    • Support for both JSON and line-delimited JSON transcript formats
  • Improvements

    • Include platform source indicator in summary requests and completions
    • More robust message extraction across platform transcript shapes and role/type variants
    • Skip empty/missing message content and concatenate multi-part text reliably
    • Improved error logging and parsing fallback behavior

Walkthrough

The summarize handler now derives and includes a platform source ('gemini-cli' or 'claude') from hook metadata in its API payloads. The transcript parser now accepts JSON objects/arrays or JSONL, improves role/type matching, and strengthens content extraction and error handling.
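
The platform-source derivation described in the walkthrough could look roughly like the sketch below. Only `metadata?.hook_event_name`, `contentSessionId`, `platform_source`, and `platformSource` come from the review summary; the `HookMetadata` interface, the helper names, and the choice to send both field spellings in one payload are assumptions made for illustration, not the PR's actual code.

```typescript
// Hypothetical metadata shape: only hook_event_name is named in the walkthrough.
interface HookMetadata {
  hook_event_name?: string;
}

type PlatformSource = 'gemini-cli' | 'claude';

// 'gemini-cli' when a hook event name is present, otherwise 'claude'.
function derivePlatformSource(metadata?: HookMetadata): PlatformSource {
  return metadata?.hook_event_name ? 'gemini-cli' : 'claude';
}

// Hypothetical payload builder for the session completion call.
// Sending both snake_case and camelCase keys is an assumption.
function buildCompletePayload(contentSessionId: string, metadata?: HookMetadata) {
  const platformSource = derivePlatformSource(metadata);
  return {
    contentSessionId,
    platform_source: platformSource,
    platformSource,
  };
}
```

Keeping the derivation in one small function means the summarize and complete requests cannot disagree about which platform produced the session.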

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Platform Source Tracking: src/cli/handlers/summarize.ts | Derives platformSource from metadata?.hook_event_name ('gemini-cli' if present, else 'claude') and includes it in /api/sessions/summarize and /api/sessions/complete payloads; session completion call now sends contentSessionId plus platform_source/platformSource. |
| Transcript Parser Format & Robustness: src/shared/transcript-parser.ts | Refactored extractLastMessage to parse either JSON (object/array with messages) or JSONL (fallback), improved role matching to platform-specific type values (`assistant |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 I hopped through lines both new and old,

Found JSON, JSONL, and stories told.
Gemini, Claude—now both in view,
Payloads carry the platform true.
Tiny paws applaud the clever glue.

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately reflects the main change: adding support for Gemini transcript parsing in the summarize hook, which is the primary objective of this PR. |
| Description check | ✅ Passed | The description is directly related to the changeset, covering platform source detection, JSON/JSONL format handling, and cross-platform role mappings mentioned in the file changes. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |


Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
src/shared/transcript-parser.ts (1)

30-42: Avoid exception-driven format detection on the hot path.

Claude JSONL still pays for a full JSON.parse(...) failure before the real parse. A cheap first-character check would keep the common path out of exception handling.

♻️ Possible simplification
-  // Try parsing as standard JSON (Gemini CLI format)
-  try {
-    const data = JSON.parse(rawContent);
-    if (Array.isArray(data.messages)) {
-      messages = data.messages;
-    } else if (Array.isArray(data)) {
-      messages = data;
-    } else {
-      isJSONL = true;
-    }
-  } catch {
-    isJSONL = true;
-  }
+  // Try parsing as standard JSON (Gemini CLI format)
+  const firstChar = rawContent[0];
+  if (firstChar === '{' || firstChar === '[') {
+    try {
+      const data = JSON.parse(rawContent);
+      if (Array.isArray(data.messages)) {
+        messages = data.messages;
+      } else if (Array.isArray(data)) {
+        messages = data;
+      } else {
+        isJSONL = true;
+      }
+    } catch {
+      isJSONL = true;
+    }
+  } else {
+    isJSONL = true;
+  }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/shared/transcript-parser.ts` around lines 30 - 42, The current code
always calls JSON.parse(rawContent) inside a try/catch which is expensive on the
hot path; replace that with a cheap first-character check on rawContent (e.g.,
const first = rawContent.trimStart()[0]) and only attempt JSON.parse when first
=== '[' or first === '{' (then run the existing JSON.parse block to populate
messages or mark isJSONL accordingly); otherwise set isJSONL = true immediately.
Update the logic around the JSON.parse call and variables rawContent, isJSONL,
and messages so you avoid throwing/handling exceptions for non-JSON transcripts.
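
The prompt above suggests `trimStart()` before the first-character check, whereas the diff indexes `rawContent[0]` directly. A self-contained sketch of the `trimStart()` variant follows. The `parseTranscript` name, the `ParsedTranscript` return shape, and the line-by-line JSONL fallback are assumptions for illustration, not the code in `src/shared/transcript-parser.ts`.

```typescript
// Hypothetical return shape; the PR tracks messages and isJSONL as local variables.
interface ParsedTranscript {
  messages: unknown[];
  isJSONL: boolean;
}

function parseTranscript(rawContent: string): ParsedTranscript {
  // Cheap check: only attempt a whole-document parse when the input
  // could plausibly be a JSON object or array.
  const first = rawContent.trimStart()[0];
  if (first === '{' || first === '[') {
    try {
      const data: unknown = JSON.parse(rawContent);
      if (Array.isArray(data)) {
        return { messages: data, isJSONL: false };
      }
      if (typeof data === 'object' && data !== null) {
        const inner = (data as { messages?: unknown }).messages;
        if (Array.isArray(inner)) {
          return { messages: inner, isJSONL: false };
        }
      }
    } catch {
      // Not a single JSON document; fall through to the JSONL path.
    }
  }

  // JSONL fallback (assumed behavior): parse each non-empty line, skip malformed ones.
  const messages: unknown[] = [];
  for (const line of rawContent.split('\n')) {
    const trimmed = line.trim();
    if (!trimmed) continue;
    try {
      messages.push(JSON.parse(trimmed));
    } catch {
      // Ignore lines that are not valid JSON.
    }
  }
  return { messages, isJSONL: true };
}
```

One caveat: a JSONL transcript whose first line is an object also begins with `{`, so for that input the whole-document `JSON.parse` is still attempted and still throws. The check only short-circuits input that starts with some other character.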
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/shared/transcript-parser.ts`:
- Around line 68-75: The loop over messages should skip non-object/nullable
entries before accessing msg.type; inside the for loop that iterates messages
(the block using foundMatchingRole and calling targetTypes.includes(msg.type)),
add a guard like if (typeof msg !== 'object' || msg === null) continue so
primitive/null entries are skipped, then proceed to use
targetTypes.includes(msg.type) and extract msg.message?.content or msg.content
safely (refer to variables/functions: messages, targetTypes, foundMatchingRole,
msgContent, msg.message?.content).
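
A minimal sketch of the guard described in this comment, combined with the "skip empty content, concatenate multi-part text" behavior from the summary. The function names, the backward scan, the `{ text: string }` part shape, and the newline separator are all assumptions; only `messages`, `targetTypes`, `msg.type`, `msg.message?.content`, and `msg.content` come from the review text.

```typescript
// Hypothetical helper: turn string or multi-part content into plain text.
function contentToText(content: unknown): string {
  if (typeof content === 'string') return content;
  if (Array.isArray(content)) {
    return content
      .filter((part): part is { text: string } =>
        typeof part === 'object' &&
        part !== null &&
        typeof (part as { text?: unknown }).text === 'string')
      .map((part) => part.text)
      .join('\n'); // separator is an assumption
  }
  return '';
}

// Hypothetical helper: find the last non-empty message of a matching type.
function findLastContent(messages: unknown[], targetTypes: string[]): string {
  for (let i = messages.length - 1; i >= 0; i--) {
    const msg = messages[i];
    // Guard: skip null and primitive entries before touching msg.type.
    if (typeof msg !== 'object' || msg === null) continue;

    const entry = msg as {
      type?: unknown;
      content?: unknown;
      message?: { content?: unknown };
    };
    if (typeof entry.type !== 'string' || !targetTypes.includes(entry.type)) continue;

    const msgContent = entry.message?.content ?? entry.content;
    const text = contentToText(msgContent);
    if (text) return text; // skip empty or missing content
  }
  return '';
}
```

With this guard in place, a transcript array containing stray `null`, number, or string entries is scanned without throwing, and such entries are simply ignored.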


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b522b3c7-2340-4d47-91ff-b77069ae4ee4

📥 Commits

Reviewing files that changed from the base of the PR and between cde4faa and 189a19d.

📒 Files selected for processing (5)
  • plugin/scripts/mcp-server.cjs
  • plugin/scripts/worker-service.cjs
  • plugin/ui/viewer-bundle.js
  • src/cli/handlers/summarize.ts
  • src/shared/transcript-parser.ts

Comment thread src/shared/transcript-parser.ts
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@thedotmack
Owner

Closed during the April 2026 backlog cleanup. The underlying bug is now tracked in #1909, which is the single canonical issue being addressed by the maintainer. Thanks for taking the time to report — your symptoms and repro are captured in the consolidated ticket.

@thedotmack thedotmack closed this Apr 15, 2026