
Add ModelRun.total_cost and total_data_rows#2057

Merged
apollonin merged 4 commits into develop from dapollonin/model-run-cost-and-data-rows
Jun 10, 2026

@apollonin apollonin commented Jun 2, 2026

Contributor

What

Adds two lazy properties to ModelRun:

model_run = client.get_model_run(model_run_id)
model_run.total_cost       # float | None  (USD)
model_run.total_data_rows  # int | None    (data rows processed)
model_run.refresh_cost_and_usage()  # bust the cache

Why

Model Foundry users (DoubleVerify) currently log cost and dataset size by hand. This lets them pull both straight off the ModelRun they already fetch — matching the workflow in their existing scripts (client.get_model_run(...)).

How

On first access, a single modelFoundryModelRunInfo query is issued (which proxies to model_service's already-computed per-job cost + data-row count) and cached on the instance. Nothing is persisted on the run — it's real-time rehydration. Runs not backed by a Foundry model job return None (the LabelboxError is swallowed).

The get_model_run query itself is unchanged, so existing fetches stay cheap; the extra round trip only happens when cost/usage is actually read. (A server-side ModelRun.totalCost field resolver was rejected because DbObject selects all fields on every fetch, which would hit model_service on every get_model_run.)
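The lazy fetch and instance-level cache described above can be sketched roughly as follows. This is a minimal illustration, not the SDK's implementation: `ModelRunSketch`, the injected `fetch_info` callable, and the `totalCost` / `totalDataRows` payload keys are assumptions made for the example.

```python
from typing import Any, Callable, Dict, Optional

# Sentinel that distinguishes "never fetched" from "fetched, nothing found".
_UNSET = object()


class ModelRunSketch:
    """Hypothetical stand-in for ModelRun; not the SDK class."""

    def __init__(
        self,
        uid: str,
        fetch_info: Callable[[str], Optional[Dict[str, Any]]],
    ):
        self.uid = uid
        # Stands in for the single modelFoundryModelRunInfo round trip.
        self._fetch_info = fetch_info
        self._cost_and_usage: Any = _UNSET

    def _load(self) -> Dict[str, Any]:
        # Fetch only on first access; later reads reuse the cached dict.
        if self._cost_and_usage is _UNSET:
            self._cost_and_usage = self._fetch_info(self.uid) or {}
        return self._cost_and_usage

    @property
    def total_cost(self) -> Optional[float]:
        return self._load().get("totalCost")  # assumed payload key

    @property
    def total_data_rows(self) -> Optional[int]:
        return self._load().get("totalDataRows")  # assumed payload key

    def refresh_cost_and_usage(self) -> None:
        # Drop the cached value so the next property read re-fetches.
        self._cost_and_usage = _UNSET


calls = []


def fake_fetch(uid: str) -> Dict[str, Any]:
    # Fake backend that records how many round trips were made.
    calls.append(uid)
    return {"totalCost": 1.25, "totalDataRows": 40}


run = ModelRunSketch("run-1", fake_fetch)
print(run.total_cost, run.total_data_rows, len(calls))  # both reads share one fetch
run.refresh_cost_and_usage()
print(run.total_cost, len(calls))  # a second fetch happens after the refresh
```

Constructing the object issues no query at all, which mirrors the point that get_model_run stays cheap and the extra round trip is paid only when cost or usage is read.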

Part of a 3-PR stack (deploy in order)

  1. python-monorepo model_service — populates total_data_rows (Labelbox/python-monorepo#2502)
  2. intelligence — exposes totalDataRows on GraphQL (Labelbox/intelligence#29662)
  3. this PR — SDK properties

Test

tests/unit/test_unit_model_run.py — fetch + cache, refresh re-fetch, and None for non-Foundry runs. 3 tests passing (pytest tests/unit/test_unit_model_run.py).


Note

Low Risk
Additive read-only SDK surface and optional GraphQL round-trip; no changes to auth, persistence, or existing get_model_run behavior.

Overview
Adds lazy, cached Model Foundry cost/usage on ModelRun: total_cost (USD), total_data_rows, and refresh_cost_and_usage() to invalidate the cache.

On first property access, the SDK issues one modelFoundryModelRunInfo GraphQL query (not on get_model_run), stores the result on the instance, and both properties share that fetch. Resource not found and internal server errors are treated as “no Foundry job” and yield None; network/transient errors are not cached and still propagate.

New unit tests cover caching, refresh, missing payloads, non-Foundry runs, and transient error recovery.
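The error-handling rule in this overview (cache an empty result for "no Foundry job", never cache a transient failure) might look like the sketch below. The two exception classes are local stand-ins named after the ones mentioned in this PR, and `CostAndUsageCache` is a hypothetical helper used only for illustration.

```python
from typing import Any, Callable, Dict, Optional


class ResourceNotFoundError(Exception):
    """Local stand-in for the SDK exception of the same name."""


class InternalServerError(Exception):
    """Local stand-in for the SDK exception of the same name."""


_UNSET = object()


class CostAndUsageCache:
    """Hypothetical helper: caches 'no job' results but not transient errors."""

    def __init__(self, fetch: Callable[[], Optional[Dict[str, Any]]]):
        self._fetch = fetch
        self._value: Any = _UNSET

    def get(self) -> Dict[str, Any]:
        if self._value is _UNSET:
            try:
                payload = self._fetch()
            except (ResourceNotFoundError, InternalServerError):
                # Treated as "run has no Foundry model job": safe to cache.
                payload = None
            # Any other exception propagates before this assignment,
            # so a timeout or network error leaves the cache unset.
            self._value = payload or {}
        return self._value
```

Because the assignment sits after the `try` block, a failed call leaves the sentinel in place and the next read retries, while a run without a Foundry job is resolved once and then answered from the cache.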

Reviewed by Cursor Bugbot for commit d0023b4.

Expose a model run's total cost and processed data-row count, fetched
in real time from Model Foundry (modelFoundryModelRunInfo) on property
access and cached on the instance -- nothing is persisted on the run.
Returns None for runs not backed by a Foundry model job.

Requires the matching backend changes that surface totalDataRows on
the modelFoundryModelRunInfo GraphQL field.
Comment thread libs/labelbox/src/labelbox/schema/model_run.py Outdated
Catch ResourceNotFoundError/InternalServerError (run has no Foundry
model job) and cache the empty result, but let transient errors
(network, timeout, rate limit) propagate so they are not permanently
cached as None. Addresses Bugbot review feedback.

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes using default effort and found 1 potential issue.

Reviewed by Cursor Bugbot for commit ff2aadd.

Comment thread libs/labelbox/src/labelbox/schema/model_run.py Outdated
execute() returns None (not raises) for a RESOURCE_NOT_FOUND response
since raise_return_resource_not_found defaults to False, so index the
payload defensively to avoid a TypeError. Addresses Bugbot feedback.
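A minimal sketch of the defensive indexing this commit describes, assuming the response is a dict keyed by the query field name and that the nested keys are `totalCost` and `totalDataRows`. The function name `extract_cost_and_usage` is hypothetical.

```python
from typing import Any, Dict, Optional


def extract_cost_and_usage(
    response: Optional[Dict[str, Any]],
) -> Dict[str, Any]:
    """Read cost and row count without assuming the payload exists."""
    # `response` may be None (e.g. a not-found result that did not raise),
    # and the nested field may be missing or null, so fall back to {} twice.
    info = (response or {}).get("modelFoundryModelRunInfo") or {}
    return {
        "total_cost": info.get("totalCost"),
        "total_data_rows": info.get("totalDataRows"),
    }
```

Indexing `response["modelFoundryModelRunInfo"]` directly would raise a TypeError when `response` is None; chaining `or {}` keeps every missing level equivalent to "no data".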

@ChuckTerry ChuckTerry left a comment

Contributor

Code LGTM! Also just a super useful addition to the SDK in general :)

@apollonin apollonin merged commit 68a77fe into develop Jun 10, 2026
15 of 25 checks passed
@apollonin apollonin deleted the dapollonin/model-run-cost-and-data-rows branch June 10, 2026 16:12

3 participants