Skip to content

fix: coerce corpus route filters#1776

Merged
thedotmack merged 2 commits intothedotmack:mainfrom
suyua9:fix/corpus-routes-filter-coercion
Apr 15, 2026
Merged

fix: coerce corpus route filters#1776
thedotmack merged 2 commits intothedotmack:mainfrom
suyua9:fix/corpus-routes-filter-coercion

Conversation

@suyua9
Copy link
Copy Markdown
Contributor

@suyua9 suyua9 commented Apr 13, 2026

Summary

  • coerce types, concepts, and files in POST /api/corpus when clients send JSON-encoded or comma-separated strings
  • coerce string limit values to positive integers before passing filters into CorpusBuilder
  • add focused CorpusRoutes regression tests for native, JSON string, comma-separated, and invalid payload shapes

Why

CorpusBuilder assumes corpus filter arrays and calls .join(',') on types, concepts, and files. If an MCP/HTTP client sends string-encoded arrays, the route currently passes those strings through and the builder can throw at runtime. DataRoutes already handles the same MCP-client coercion shape for batch endpoints, so this keeps the boundary handling consistent.

Tests

  • bun test tests/worker/http/routes/corpus-routes-coercion.test.ts
  • bun test tests/worker/http/routes/data-routes-coercion.test.ts tests/worker/http/routes/corpus-routes-coercion.test.ts

tsc --noEmit was not runnable in this clean worktree because node_modules/@types/node is not installed locally (Cannot find type definition file for 'node').

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Apr 13, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c07606c6-f032-4a50-a962-fa75d939787b

📥 Commits

Reviewing files that changed from the base of the PR and between 8951f57 and 730f81e.

📒 Files selected for processing (1)
  • tests/worker/http/routes/corpus-routes-coercion.test.ts
✅ Files skipped from review due to trivial changes (1)
  • tests/worker/http/routes/corpus-routes-coercion.test.ts

Summary by CodeRabbit

  • Bug Fixes

    • Strengthened corpus API input validation for types, concepts, files, and limit.
    • Endpoint now accepts arrays, JSON-encoded strings, and comma-separated values; invalid or unsupported values result in HTTP 400.
  • Tests

    • Added tests covering request-body coercion, parsing variants, validation failures, and successful coerced payload handling.

Walkthrough

Adds request-body coercion and validation to the POST /api/corpus handler: introduces ALLOWED_CORPUS_TYPES, coerceStringArray(), and coercePositiveInteger(); normalizes types, concepts, files, and limit; rejects invalid inputs with HTTP 400 and only passes validated values to build().

Changes

Cohort / File(s) Summary
Core Implementation
src/services/worker/http/routes/CorpusRoutes.ts
Added ALLOWED_CORPUS_TYPES; implemented coerceStringArray() and coercePositiveInteger(); updated POST /api/corpus to coerce/validate types, concepts, files, and limit, set filter fields only when non-empty, and return 400 on invalid input.
Test Suite
tests/worker/http/routes/corpus-routes-coercion.test.ts
New tests exercising coercion: native arrays/numbers, JSON-encoded strings, comma-separated strings; asserts 400 on invalid array items, unsupported types, or malformed limit; verifies build() called with coerced payload on success.

Sequence Diagram(s)

sequenceDiagram
  participant Client as Client
  participant Route as CorpusRoutes
  participant Builder as CorpusBuilder
  participant Res as Response

  Client->>Route: POST /api/corpus { types, concepts, files, limit }
  Route->>Route: coerceStringArray(types/concepts/files)
  Route->>Route: coercePositiveInteger(limit)
  alt invalid input
    Route->>Res: res.status(400).json(error)
  else valid input
    Route->>Builder: build({ filter, limit })
    Builder->>Route: build result
    Route->>Res: res.json(result)
  end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 I nibble strings and count the hops,

I split the commas and trim the props,
I parse the JSON, make numbers bright,
I thump my paw when inputs aren't right,
A 400 thump keeps the corpus tight.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title 'fix: coerce corpus route filters' accurately and concisely summarizes the main change—adding request-body coercion and validation for filter parameters in the corpus route endpoint.
Description check ✅ Passed The description is directly related to the changeset, explaining the coercion of filter parameters, the rationale (consistency with DataRoutes), and testing approach.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
src/services/worker/http/routes/CorpusRoutes.ts (1)

15-15: Avoid duplicating the corpus-type allowlist.

This set now mirrors the literal union in src/services/worker/knowledge/types.ts. If a new corpus type is added there later, this route will start rejecting valid requests until both places are updated. Please move the runtime list into a shared exported constant and consume it here as the single source of truth.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/services/worker/http/routes/CorpusRoutes.ts` at line 15, The route
currently defines a local ALLOWED_CORPUS_TYPES Set that duplicates the union
from the knowledge types module; remove this local definition in CorpusRoutes.ts
and import a single exported constant from the shared types module (e.g., export
a CORPUS_TYPES or ALLOWED_CORPUS_TYPES array/Set from the knowledge types file
where the union is declared), then use that imported constant in place of the
local Set (convert to a Set if necessary when doing membership checks) so the
route always reads the single source of truth.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/worker/http/routes/corpus-routes-coercion.test.ts`:
- Around line 136-160: Add a test that exercises the new allowlist branch for
types by creating a request with types: ['typo'] (e.g., name: 'bad-types',
types: ['typo']), invoking handler(req, res), awaiting flushPromises(), and
asserting statusSpy was called with 400 and mockBuild was not called; this
targets the CorpusRoutes types allowlist branch and uses the existing helpers
createMockReqRes, handler, flushPromises, statusSpy and mockBuild to locate and
verify the behavior.

---

Nitpick comments:
In `@src/services/worker/http/routes/CorpusRoutes.ts`:
- Line 15: The route currently defines a local ALLOWED_CORPUS_TYPES Set that
duplicates the union from the knowledge types module; remove this local
definition in CorpusRoutes.ts and import a single exported constant from the
shared types module (e.g., export a CORPUS_TYPES or ALLOWED_CORPUS_TYPES
array/Set from the knowledge types file where the union is declared), then use
that imported constant in place of the local Set (convert to a Set if necessary
when doing membership checks) so the route always reads the single source of
truth.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 5eb64930-32e2-4e27-8024-d8cb52726c45

📥 Commits

Reviewing files that changed from the base of the PR and between cde4faa and 8951f57.

📒 Files selected for processing (2)
  • src/services/worker/http/routes/CorpusRoutes.ts
  • tests/worker/http/routes/corpus-routes-coercion.test.ts

Comment thread tests/worker/http/routes/corpus-routes-coercion.test.ts
@cuyua9
Copy link
Copy Markdown

cuyua9 commented Apr 14, 2026

Follow-up update: I addressed the CodeRabbit regression-test suggestion in the latest push (730f81e). The new test covers an unsupported corpus type value (types: ['typo']) and asserts that the route returns 400 before calling CorpusBuilder.\n\nVerification: bun test tests/worker/http/routes/corpus-routes-coercion.test.ts passes locally.

@thedotmack thedotmack merged commit eeb6841 into thedotmack:main Apr 15, 2026
1 of 2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants