feat(python): expose VFS to custom builtins via BuiltinContext.fs#2010
feat(python): expose VFS to custom builtins via BuiltinContext.fs#2010dedeswim wants to merge 9 commits into
Conversation
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Expose a live virtual filesystem handle (ctx.fs) to Python custom_builtins, making filesystem access safe inside callbacks even when the interpreter is already running under a Tokio runtime (e.g., execute_sync / Jupyter and async servers).
Changes:
- Add
BuiltinContext.fsto the Python API + type stubs and expand documentation to define semantics and re-entrancy constraints. - Implement a runtime-safe dispatch path for
PyFileSystem::with_fsthat avoids nested-Tokio-runtime panics by offloading static-handle ops to a worker thread. - Add test coverage across “normal”, Jupyter/live-loop, and FastAPI concurrency scenarios for
ctx.fs.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| specs/python-package.md | Documents BuiltinContext.fs semantics, runtime behavior, and expected performance characteristics. |
| crates/bashkit-python/src/lib.rs | Implements ctx.fs plumbing and adds worker-thread dispatch in PyFileSystem::with_fs to avoid nested runtime panics. |
| crates/bashkit-python/bashkit/_bashkit.pyi | Updates typing docs/stubs to include BuiltinContext.fs. |
| crates/bashkit-python/tests/test_custom_builtin_fs.py | New comprehensive tests validating ctx.fs behavior, error propagation, readonly behavior, and re-entrancy scenarios. |
| crates/bashkit-python/tests/test_jupyter_compat.py | Adds tests confirming ctx.fs works under asyncio.run and while an event loop is already running. |
| crates/bashkit-python/tests/test_fastapi_integration.py | Adds concurrent-request test ensuring ctx.fs works in async endpoints without deadlocking. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| fn with_fs<T, F, Fut>(&self, f: F) -> PyResult<T> | ||
| where | ||
| F: FnOnce(Arc<dyn FileSystem>) -> Fut, | ||
| Fut: Future<Output = PyResult<T>>, | ||
| F: FnOnce(Arc<dyn FileSystem>) -> Fut + Send, | ||
| Fut: Future<Output = PyResult<T>> + Send, | ||
| T: Send, |
| // A `Static` handle (e.g. a custom builtin's `ctx.fs`) operates on a | ||
| // cloned `Arc<dyn FileSystem>` with no interpreter lock. When such a | ||
| // handle is used while a tokio runtime is already active on this thread | ||
| // — `execute_sync` drives the interpreter via `self.rt.block_on` and the | ||
| // custom builtin callback runs within it; the same holds on a | ||
| // multi-thread runtime worker when `await execute()` drives a *sync* | ||
| // builtin — calling `block_on` again here panics ("Cannot start a | ||
| // runtime from within a runtime"). Run the op on a throwaway thread with | ||
| // its own runtime instead. | ||
| // | ||
| // Cost: this spawns one OS thread and builds one current-thread runtime | ||
| // per op, so a callback doing many `ctx.fs` ops in a tight loop pays | ||
| // that scaffolding each time. Fine for occasional `ctx.fs` use; if it | ||
| // ever gets hot, amortize with a long-lived worker thread + runtime | ||
| // reused across ops via a channel (note: `block_in_place` is not an | ||
| // option while `make_runtime` builds a current-thread runtime). |
| // Drop the panic payload deliberately: a `Box<dyn Any>` can't | ||
| // be formatted without downcasting and may carry internal | ||
| // Debug shapes / host paths (TM-INF-022). The default panic | ||
| // hook still prints the full trace to stderr for debugging. | ||
| .map_err(|_| PyRuntimeError::new_err("filesystem worker thread panicked"))? |
| fs: Live handle to the interpreter's virtual filesystem. Reads see | ||
| files created by earlier commands and writes are visible to later | ||
| ones. Same API as ``Bash.fs()`` — e.g. ``ctx.fs.read_file(path)`` | ||
| returns ``bytes``. Operates directly on the interpreter's VFS, so | ||
| it is safe to use from inside the callback. |
chaliy
left a comment
There was a problem hiding this comment.
Thanks for the deep, well-tested contribution — and for the candid disclosure. The scrutiny it invited paid off in exactly one place, and that place is not your fault.
First, apologies on our side: #2009 (the deterministic-teardown rewrite) merged to main the same day you opened this, after your base commit, and it changed PyRuntime::drop semantics underneath this PR. Your branch merges cleanly with main textually, but the combination introduces a real panic that neither your suite nor ours catches.
The issue
PyCustomBuiltinAdapter now holds PyRuntime clones. If the Bash object is dropped while an await execute() is in flight, the in-flight future holds the last Arc<Mutex<Bash>>; when it completes on a pyo3-async-runtimes worker thread, Bash drops there → registry → adapters → the last PyRuntime clone drops inside a runtime context. On current main, PyRuntime::drop does a blocking join (join_without_gil(move || drop(rt))), and tokio panics — Cannot drop a runtime in a context where blocking is not allowed — surfacing as pyo3_async_runtimes.RustPanic to the awaiting caller.
Verified repro on this PR merged with main (the same script passes on pure main, and the full 740-test suite stays green on the merge, so CI will not catch this):
async def slow(ctx):
await asyncio.sleep(0.3); return "done"
b = Bash(custom_builtins={"slow": slow})
fut = asyncio.ensure_future(b.execute("slow"))
await asyncio.sleep(0.05)
del b; gc.collect()
await fut # RustPanic on PR+main; exit 0 on pure mainRequested before merge
- Rebase onto current
main(the "Rebased on main" checklist item was true when written — main moved under you). - Fix the drop path — e.g. in
PyRuntime::drop, treat "inside a tokio context" like theinterpreter_at_exit()case:if Handle::try_current().is_ok() { rt.shutdown_background() } else { join_without_gil(...) }. That preserves #2009's determinism on the normal path, where the last clone drops on a Python thread. - Add the del-while-in-flight scenario above as a regression test.
Everything else checks out under deep review: I verified all four execute × callback dispatch modes, readonly enforcement, error/leak sanitization (TM-INF-022), and measured the worker-thread path at ~80µs/op vs ~1.7µs on the fast path with no fd churn — acceptable for the documented "occasional use" tradeoff. Two smaller inline comments accompany this review.
Generated by Claude Code
| // the `external_handler` reentry guard (which only fires inside an | ||
| // external_handler, not a custom builtin); the loud panic is preferable | ||
| // to silently deadlocking or corrupting interpreter state. | ||
| if inner.is_static() && Handle::try_current().is_ok() { |
There was a problem hiding this comment.
The comment above documents two paths — the worker-thread branch (Static handle inside a runtime) and Live re-entrancy — but there is a third path worth a sentence: a Static handle used from a thread with no tokio context at all. That is exactly what async callbacks hit, since their body runs on an asyncio loop thread (the caller's loop under await execute(), the private loop under execute_sync()). Those fall through to self.rt.block_on below. Under execute_sync() that is a second-thread block_on while the main thread holds the current-thread scheduler core — it only completes because tokio's parker fallback polls the future without owning the core. That's fine for VFS ops (they never need the runtime's timer/IO drivers), but worth stating explicitly so nobody later adds a timer-dependent fs future and stalls this path.
I verified the execute_sync + async-callback + ctx.fs combination works (including inside asyncio.run), but it's the one dispatch mode the new tests don't cover. Could you add it to test_custom_builtin_fs.py (or the jupyter-compat block) alongside extending this comment?
Generated by Claude Code
There was a problem hiding this comment.
Custom-builtin callbacks now receive a live `fs` handle on `BuiltinContext`,
wrapping the same `Arc<dyn FileSystem>` the interpreter runs on (mirroring how
the embedded `python3`/Monty builtin receives `ctx.fs`). Reads see files
created by earlier commands; writes are visible to later ones.
A callback runs inside `execute_sync`'s current-thread `block_on`, so a naive
`fs` op would `block_on` again and panic ("runtime within runtime"). For
`Static` handles `PyFileSystem::with_fs` now detects an active runtime and runs
the op on a throwaway worker thread; `Live` handles keep the fast path.
Extract PyCustomBuiltinAdapter::from_entry (used by the registry builder and both add_builtin sites), and reuse the existing make_runtime() helper for the off-thread ctx.fs dispatch instead of an inline runtime builder.
Adds tests for the new BuiltinContext.fs surface across the contexts where it is actually used. Async schedulers (per maintainer request), placed in the existing scheduler suites so they run in CI: - test_jupyter_compat.py: ctx.fs from a custom builtin via execute_sync inside asyncio.run(), execute_sync under a live loop, and await execute on the caller loop (Bash + BashTool). - test_fastapi_integration.py: concurrent async requests over one shared Bash, each round-tripping the VFS through ctx.fs (ASGITransport + gather). Feature edge cases in test_custom_builtin_fs.py: missing-file clean error (no host-path leak), write-then-read within one callback, readonly_filesystem denial, lazy-provider read via ctx.fs, and cross-handle consistency.
- _bashkit.pyi: drop the "(unlike calling back into Bash.fs()/read_file())" parenthetical from the BuiltinContext.fs docstring. - test_custom_builtin_fs.py: assert errors with `with pytest.raises(RuntimeError)` (exact type, no bare except) for the missing-file and readonly cases; assert the VFS state after the callback returns (common-ops, readonly); add an async-callback test that writes through ctx.fs. - test_fastapi_integration.py: assert each request's file content on the shared VFS after the concurrent run. - test_jupyter_compat.py: assert the await-execute ctx.fs builtin runs on the caller's event loop (matching the existing caller-loop test).
Per review: keep the in-callback `with pytest.raises(RuntimeError)` that pins the exact exception at the ctx.fs boundary, and add an outer check at the call site. A callback exception does not propagate out of execute_sync (it becomes a non-zero exit_code), so the call-site assertion uses execute_sync_or_throw and expects a sanitized BashError.
The with_fs comment and spec said re-entrant Live fs 'must not silently deadlock' / was 'rejected/unsupported re-entrancy'. The actual behavior is a tokio nested-runtime panic, and the external_handler reentry guard does not fire for custom builtins. Reword both to name the real failure mode. Also document that each ctx.fs op on the worker-thread path spawns a short-lived thread + runtime (batch fs work in callbacks), note the deliberate panic-payload drop (TM-INF-022), and that block_in_place is not an option while make_runtime builds a current-thread runtime.
- test_ctx_fs_supports_more_ops: append_file/stat/rename/copy/chmod/ symlink/read_link/remove round-trip through ctx.fs on the live VFS. - test_ctx_fs_nested_reentry_via_lazy_provider: a lazy provider triggered via ctx.fs.read_file re-enters ctx.fs through a captured handle while already on the worker-thread runtime, exercising recursive worker-thread dispatch without deadlocking.
Rebasing onto main pulled in the async-callback teardown fix (everruns#2007/everruns#2008), which wraps the interpreter runtime in PyRuntime (Option<Arc<Runtime>> with a Drop that calls shutdown_background only when the last clone is released via Arc::into_inner). PyFileSystem::from_static now takes PyRuntime, so thread PyRuntime through the ctx.fs plumbing (make_py_builtin_context, PyCustomBuiltinAdapter, build_runtime_custom_builtin_impls, populate_registry_from_entries) instead of the raw Arc<Runtime>. Cloning a PyRuntime into each invocation's ctx.fs handle is a refcount bump on the same Arc; deterministic teardown still fires only when the last clone drops.
…o context Addresses review feedback on the ctx.fs PR after everruns#2009 (deterministic teardown) landed and changed PyRuntime::drop underneath it. everruns#2009's PyRuntime::drop does a blocking runtime join (join_without_gil) while the interpreter is alive. The ctx.fs adapters now hold PyRuntime clones, so when a Bash is dropped while `await execute()` is in flight, the in-flight future drops the last Arc<Mutex<Bash>> — and thus the last PyRuntime clone — on a pyo3-async-runtimes worker thread. A blocking runtime drop inside a tokio context panics ("Cannot drop a runtime in a context where blocking is not allowed"), surfacing as RustPanic to the awaiting caller. Neither suite caught it because the merge stays green. - PyRuntime::drop: also use shutdown_background() when Handle::try_current().is_ok(), keeping everruns#2009's deterministic join_without_gil only on the normal Python-thread path. - Regression test: drop a Bash while an async-callback execute() is in flight, then await it (test_teardown_determinism.py). Verified it panics without the fix and passes with it. - Document and cover the third with_fs dispatch path: a Static handle used from a thread with no tokio context (async callbacks under execute_sync's private loop / await execute's caller loop) falls through to self.rt.block_on. Adds execute_sync + async-callback + ctx.fs tests (test_jupyter_compat.py). - Docs: note that a stashed ctx.fs handle outlives the Bash and keeps the VFS + runtime alive (specs + .pyi), and the in-tokio-context drop handoff in the teardown-determinism section.
|
Hi @chaliy thank you so much for the quick review and the detailed instructions! I hope the latest changes address your comments. Let me know if there is more I can do! |
Disclosure
I'm not an experienced Rust developer: the Rust here was generated with Claude. I did review the Python/pytest side myself: I went through the tests, asked for additional targeted cases (error/readonly/lazy-provider, nested re-entry, async schedulers), and to my judgment they're correct and exercise the behavior that matters. Scrutiny of the Rust especially welcome.
What
Adds an
fsattribute toBuiltinContext, giving Pythoncustom_builtinscallbacks read/write access to the interpreter's live VFS — completing the follow-up noted in thecustom_builtinsdecision comment inlib.rs.ctx.fsis the existingFileSystemhandle (same API asBash.fs()), wrapping the sameArc<dyn FileSystem>the interpreter runs on, like the embeddedpython3/Monty builtin. So it's a live view (reads see earlier writes) and inheritsreadonly_filesystem/ mounts.Closes #2004.
Note on dispatch
A callback runs inside
execute_sync'sblock_on, so a naivefsop would re-enter the runtime and panic.PyFileSystem::with_fsdetects the active runtime and runsctx.fsops on a throwaway worker thread (GIL released first, so lazy file providers can re-enter Python);Bash.fs()keeps the fast path.Test plan
cargo fmt --check,cargo clippy -p bashkit-python,ruff check/ruff format --check— clean.test_custom_builtin_fs.pyplusctx.fs-under-async-scheduler cases intest_{jupyter_compat,fastapi_integration}.py; fullbashkit-pythonsuite green on CI (3.9 / 3.12 / 3.13 / 3.14).Checklist
just pre-prpassesspecs/python-package.md)