
fix(pdf-reader): prevent local Path casting for remote s3 URIs to fix crash #21353

Open
spideyashith wants to merge 1 commit into run-llama:main from spideyashith:fix/pdfreader-s3-path

@spideyashith

Description

This PR addresses an issue where PDFReader unconditionally casts remote cloud URIs (such as s3://) to local Path objects. The cast mangles the URI (pathlib collapses the // after the scheme), which crashes the reader.

Changes:

  • Added a check to skip local Path casting if the file variable is a remote URI string.
  • Updated the metadata extraction block to safely grab the file name regardless of whether the file is a Path object or a string.
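
A minimal sketch of the two changes listed above, not the actual diff. The helper names (is_remote_uri, normalize_file, get_file_name) are hypothetical and used only for illustration, and treating any string containing "://" as remote is an assumption about how the check might be written:

```python
from pathlib import Path, PurePosixPath
from typing import Union


def is_remote_uri(file: Union[str, Path]) -> bool:
    """Hypothetical check: treat strings with a scheme separator as remote."""
    return isinstance(file, str) and "://" in file


def normalize_file(file: Union[str, Path]) -> Union[str, Path]:
    """Cast to Path only for local files; leave remote URI strings untouched."""
    return file if is_remote_uri(file) else Path(file)


def get_file_name(file: Union[str, Path]) -> str:
    """Return the final path component for either a Path or a URI string."""
    if isinstance(file, Path):
        return file.name
    return file.rstrip("/").rsplit("/", 1)[-1]


# Why the cast is a problem: pathlib collapses the "//" after the scheme.
mangled = str(PurePosixPath("s3://bucket/docs/report.pdf"))
# mangled == "s3:/bucket/docs/report.pdf"
```

With this shape, a remote string such as "s3://bucket/docs/report.pdf" passes through unchanged and still yields "report.pdf" for the metadata, while local inputs keep the existing Path behavior.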

Fixes #15406

New Package?

  • Yes
  • No

Version Bump?

  • Yes
  • No

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

  • I added new unit tests to cover this change
  • I believe this change is already covered by existing unit tests

Suggested Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added Google Colab support for the newly added notebooks.
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I ran uv run make format; uv run make lint to appease the lint gods

@dosubot (bot) added the size:S label (this PR changes 10-29 lines, ignoring generated files) on Apr 9, 2026
@spideyashith
Author

Hi @logan-markewich, I've implemented the fix for the S3 Path casting crash. Could you please approve the CI workflows to run? I've verified the fix locally on Windows.


Development

Successfully merging this pull request may close these issues.

[Bug]: impossible to use PDfReader with an S3 file because of Path() casting
