Skip to content

fix(sdk): default initial protocol version to 10 when unpinned (upgrade-safe ratchet floor)#3809

Merged
lklimek merged 11 commits into
v3.1-devfrom
fix/sdk-default-initial-protocol-version
Jun 12, 2026
Merged

fix(sdk): default initial protocol version to 10 when unpinned (upgrade-safe ratchet floor)#3809
lklimek merged 11 commits into
v3.1-devfrom
fix/sdk-default-initial-protocol-version

Conversation

@Claudius-Maginificent

@Claudius-Maginificent Claudius-Maginificent commented Jun 8, 2026

Copy link
Copy Markdown
Collaborator

Issue being fixed or feature implemented

Problem: The SDK can already auto-detect the network's protocol version and follow it — but that behaviour was opt-in. You had to know to call with_initial_version() to turn it on, and almost nobody did. So an unpinned SDK fell back to PlatformVersion::latest(). During an upgrade window that is higher than the network, and because version auto-update only ratchets upward, a too-high SDK can never come back down — it stays stuck shipping a wire format the network rejects.

Fix: Make auto-detect the default. An unpinned SDK now seeds from a safe floor (DEFAULT_INITIAL_PROTOCOL_VERSION, currently 10) and ratchets up to whatever the network reports. No special call required.

Opt out: Callers who want a fixed version call with_version(...), which pins the version and disables auto-detect. Explicit configuration behaves exactly as before.

What was done?

  • Added DEFAULT_INITIAL_PROTOCOL_VERSION (currently 10) — a bumpable, upgrade-safe floor.
  • SdkBuilder::default() now seeds the protocol version from this floor instead of latest(), so auto-detect is on by default.
  • Removed with_initial_version() from the public API (now a #[cfg(test)] test-only seed). Auto-detect being the default makes the old opt-in redundant; with_version() is the sole public way to pin + opt out.
  • Verified the auto-detected protocol_version is cryptographically trustworthy: it is bound to the quorum BLS threshold signature via the Tenderdash StateId.app_version, and the version ratchet runs only after proof + signature verification succeeds. Added a guard comment at the verify-before-ratchet call site plus a regression test asserting the ratchet rejects unknown / non-upward inputs.
  • Count/group_by mock tests (which need a fixed V1-documents wire version) now pin via with_version(PV12) instead of the now-hidden override.
  • Docs aligned with the new contract: auto-detect is the default; with_version() = pin + opt out.

Scope: packages/rs-sdk (the change + its tests).

How Has This Been Tested?

  • cargo test -p dash-sdk --lib — passing, including new tests for the floor default, upward-only ratchet, and explicit-pin override.
  • cargo clippy -p dash-sdk -p rs-sdk-ffi --all-targets --locked -- -D warnings — clean.
  • cargo test -p dash-sdk --doc and cargo fmt — clean.

Breaking Changes

For end users: none — only the unpinned default changes (boots at the floor and ratchets up instead of starting at latest); pinning via with_version() is unchanged. For SDK source callers: with_initial_version() is no longer public (it was the rarely-used opt-in auto-detect seed, now the default). This API-surface reduction lands within the v3.1-dev pre-release cycle; migrate any such caller to the default (auto-detect) or with_version() to pin.

Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have added or updated relevant unit/integration/functional/e2e tests
  • I have added "!" to the title and described breaking changes in the corresponding section if my code contains any
  • I have made corresponding changes to the documentation if needed

For repository code-owners and collaborators only

  • I have assigned this pull request to a milestone

🤖 Co-authored by Claudius the Magnificent AI Agent

…de-safe ratchet floor)

When the caller pins neither an explicit PlatformVersion (with_version) nor an
explicit initial version (with_initial_version), SdkBuilder previously seeded the
protocol version to PlatformVersion::latest(). Combined with the upward-only
ratchet in maybe_update_protocol_version (early-return when received <= current,
then fetch_max), an unpinned SDK booted ABOVE a network still running an older
protocol version (e.g. 10/11 during an upgrade window) and could never come back
down — leaving it permanently stuck too high and shipping a too-new wire format.
Autodetect could never fix it because it only ratchets up.

Seed the default initial protocol version from a new bumpable constant
DEFAULT_INITIAL_PROTOCOL_VERSION = 10 instead of latest(). The atomic
protocol_version (the source read by version(), query_settings(), and proof
parsing) is seeded from the builder's version field, so changing the default
field is the single effective change point. Ratchet-up autodetection then
converges upward (10 -> whatever the network reports), and starting low keeps
requests compatible with not-yet-upgraded nodes during an upgrade window.
Explicit with_version / with_initial_version overwrite the field, so pinned
config is unaffected. Bump the constant when the network's supported floor moves.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds exported DEFAULT_INITIAL_PROTOCOL_VERSION and changes SDK bootstrapping so unpinned instances seed per-SDK protocol version from that floor (not PlatformVersion::latest()), with updated docs, FFI comment, tests for upward-only auto-detect ratcheting, and test mock adjustments for pinned PlatformVersions.

Changes

Protocol Version Initialization Floor

Layer / File(s) Summary
Constant and documentation updates
packages/rs-sdk/src/sdk.rs, packages/rs-sdk-ffi/src/types.rs
Adds DEFAULT_INITIAL_PROTOCOL_VERSION and updates proof-parsing, Sdk::version(), SdkBuilder::with_version, and DashSDKConfig.platform_version docs to describe the new fallback floor and auto-detect ratcheting semantics.
SdkBuilder default seeding, tests, and mock updates
packages/rs-sdk/src/sdk.rs, packages/rs-sdk/tests/fetch/document_count.rs, packages/rs-sdk/tests/fetch/document_query_v0_v1.rs, packages/rs-sdk/tests/fetch/mock_fetch.rs
SdkBuilder::default() now seeds the per-instance protocol-version atomic from DEFAULT_INITIAL_PROTOCOL_VERSION (with runtime expect). Adds unit tests for default seeding, upward-only ratcheting, and pinning behavior; updates mock test helpers and test initializations to pin compatible PlatformVersions where needed.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • dashpay/platform#3751: Adds FFI platform_version forwarded into SdkBuilder::with_version, aligning with this PR's protocol-version bootstrapping and pinning semantics.

Suggested reviewers

  • shumkov
  • QuantumExplorer

Poem

🐰 I planted a floor where versions may start,
A tiny hop to keep each instance smart.
It listens, learns, then climbs up high,
Never falling back — just aiming for sky.
Tests cheer softly as the numbers try.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title directly and specifically describes the main change: setting the default initial protocol version to 10 when unpinned, with mention of upgrade-safe ratcheting behavior.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/sdk-default-initial-protocol-version

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

…rm-version V10 constant

Replace the magic literal `10` in DEFAULT_INITIAL_PROTOCOL_VERSION with the
predefined `dpp::version::v10::PROTOCOL_VERSION_10` constant
(`pub const PROTOCOL_VERSION_10: ProtocolVersion = 10`, where
`ProtocolVersion = u32`). It is a true compile-time const, so it remains valid in
our `const u32` initializer. The local DEFAULT_INITIAL_PROTOCOL_VERSION name is
kept as the bumpable knob; only the source of the value changes. No behavioral
change. Edits only packages/rs-sdk.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@lklimek lklimek marked this pull request as ready for review June 8, 2026 10:31
@thepastaclaw

thepastaclaw commented Jun 8, 2026

Copy link
Copy Markdown
Collaborator

✅ Review complete (commit 892365d)

@thepastaclaw thepastaclaw left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

The PR seeds unpinned SDKs at PV10 instead of latest() so they can ratchet upward across an upgrade window. The change itself is small, well-tested, and explicit pins are unaffected. However, the new default introduces a real first-request regression for SDK surfaces that require PV12-only document_query dispatch (e.g. DocumentCount), because the wire is encoded synchronously from sdk.query_settings() before any metadata is observed, and encode_v0 rejects non-Documents selects locally. There is also a test that couples floor + 2 to PlatformVersion::get returning Ok, which silently regresses the moment the floor constant is bumped.

🔴 1 blocking | 🟡 1 suggestion(s)

🤖 Prompt for all review comments with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.

In `packages/rs-sdk/src/sdk.rs`:
- [BLOCKING] packages/rs-sdk/src/sdk.rs:771-772: Default PV10 seed breaks first PV12-only queries before auto-detect can ratchet
  Seeding every unpinned SDK at PV10 means the first request is encoded with PV10 because `Fetch::fetch_with_metadata_and_proof` (packages/rs-sdk/src/platform/fetch.rs:177) reads `sdk.query_settings()` (sdk.rs:520, returning `self.version()`) and runs the `DocumentQuery -> GetDocumentsRequest` encoder synchronously, while `maybe_update_protocol_version` only fires after response metadata is verified. With PV10, `drive_abci.query.document_query.default_current_version = 0` (drive_abci_query_versions/v0.rs:23), so `encode_v0` rejects any `select` other than `Documents` with a local `Error::Config` at document_query.rs:508. Concretely, `DocumentCount::fetch` (which requires `SelectProjection::count_star()` via `assert_select_is_count`) cannot make a first call against the current PV12 network on an unpinned SDK: it fails before reaching the node, so auto-detect never sees a metadata response to ratchet on, and the caller must either pin via `with_version`/`with_initial_version` or first issue some unrelated ratcheting query. `group_by` / `having` paths fail the same way. The fix needs to either (a) pre-ratchet via a cheap version probe before encoding, (b) gate the V0-only encode on a confirmed-old network rather than the seed alone, or (c) document the requirement to pin when using V1-only document_query surfaces.
- [SUGGESTION] packages/rs-sdk/src/sdk.rs:1709-1715: Ratchet test uses `floor + 2` and will silently break when the floor is bumped
  `test_default_floor_ratchets_up_but_never_down` exercises the ratchet by calling `sdk.maybe_update_protocol_version(floor + 2)` and asserting the atomic moved to `floor + 2`. Today this works because `floor=10` and `PlatformVersion::latest()=PV12`, so `floor + 2 = 12` is a known version. The doc comment on `DEFAULT_INITIAL_PROTOCOL_VERSION` (sdk.rs:53-58) explicitly invites bumping the floor as the network's supported floor advances. The moment the floor is raised to 11, `floor + 2 = 13` is unknown; `maybe_update_protocol_version` hits the `PlatformVersion::get(received_version).is_err()` guard at sdk.rs:328, silently keeps the seed, and the assertion fails with a message that points at the ratchet rather than the actual cause. Anchor the ratchet target to a known version greater than the floor (e.g. `PlatformVersion::latest().protocol_version` with a precondition `assert!(latest > floor)`) instead of arithmetic on the floor.

Comment thread packages/rs-sdk/src/sdk.rs
Comment thread packages/rs-sdk/src/sdk.rs Outdated
@codecov

codecov Bot commented Jun 8, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 70.82%. Comparing base (f4ed60f) to head (892365d).
⚠️ Report is 18 commits behind head on v3.1-dev.

Additional details and impacted files
@@             Coverage Diff              @@
##           v3.1-dev    #3809      +/-   ##
============================================
+ Coverage     70.73%   70.82%   +0.09%     
============================================
  Files            20       20              
  Lines          2788     2797       +9     
============================================
+ Hits           1972     1981       +9     
  Misses          816      816              
Components Coverage Δ
dpp ∅ <ø> (∅)
drive ∅ <ø> (∅)
drive-abci ∅ <ø> (∅)
sdk ∅ <ø> (∅)
dapi-client ∅ <ø> (∅)
platform-version ∅ <ø> (∅)
platform-value ∅ <ø> (∅)
platform-wallet ∅ <ø> (∅)
drive-proof-verifier ∅ <ø> (∅)
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

…t test; doc count workaround

The PV10 default floor (V0 documents wire) made several mock tests panic in the
local query encoder before any network call, because they exercise v3.1+-only
semantics:

- fetch::document_count::* (7 tests): Count / group_by require the V1 wire.
  Each now builds its mock SDK via count_capable_mock_sdk(), which pins the
  initial version to the hardcoded PV12 constant
  (dpp::version::v12::PROTOCOL_VERSION_12 — the first release wiring
  DRIVE_ABCI_QUERY_VERSIONS_V1) using with_initial_version (keeps auto-detect).
- fetch::mock_fetch::test_mock_fetch_data_contract: mock_data_contract builds V2
  document types; an unpinned SDK decodes the round-trip at the V0 floor and
  downgrades them to V1. Pinned to PV12 the same way.
- fetch::document_query_v0_v1::sdk_builder_with_initial_version_seeds_atomic_without_pinning:
  its default-seed assertion expected latest(); updated to assert the new floor
  DEFAULT_INITIAL_PROTOCOL_VERSION.

Hardened test_default_floor_ratchets_up_but_never_down: the ratchet target is
now the hardcoded PV12 constant with a precondition asserting it exceeds the
floor, so the test stays correct (and only accepts known versions) when the
floor advances — replacing the brittle floor + 2 arithmetic.

Documented on DEFAULT_INITIAL_PROTOCOL_VERSION the two ways to use the
v3.1+-only query surfaces (Count / group_by / having) on an unpinned SDK:
(a) with_initial_version at build time, or (b) issue one ratcheting query right
after build() (e.g. ExtendedEpochInfo::fetch_current) to lift the protocol
version to the network's actual version. Includes a compiling no_run example.

No default-seed change, no proactive probe; scope is packages/rs-sdk only.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@thepastaclaw thepastaclaw left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Both prior nitpicks remain STILL VALID at head 2f45522 — the latest delta did not touch the docstring framing in either location. The delta also introduces a new minor docs issue: the warm-up doctest comment shows a stale SdkBuilder::new_mainnet(addrs) signature even though new_mainnet is now parameterless. No blocking, security, or consensus issues introduced by this PR; the PV10 floor + upward ratchet logic is sound and the test pinning is appropriate.

💬 4 nitpick(s)

2 additional finding(s) omitted (not in diff).

Comment thread packages/rs-sdk/src/sdk.rs Outdated
Comment thread packages/rs-sdk/src/sdk.rs Outdated

@thepastaclaw thepastaclaw left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Latest push (8cea833) is a merge of v3.1-dev carrying the unrelated shielded-fees PR #3793; the in-scope SDK delta (DEFAULT_INITIAL_PROTOCOL_VERSION=PV10 + test pinning) is unchanged since 2f45522. All four prior documentation nitpicks on packages/rs-sdk/src/sdk.rs are STILL VALID at HEAD and carried forward. One new convergent suggestion: state-transition broadcast/wait does not ratchet from response metadata before proof verification, so a fresh unpinned SDK now starts at the PV10 floor when verifying the first transition proof against a potentially-newer network.

🟡 1 suggestion(s) | 💬 2 nitpick(s)

3 additional finding(s) omitted (not in diff).

2 carried-forward finding(s) already raised on this PR; not re-posting as new inline comments.

🤖 Prompt for all review comments with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.

In `packages/rs-sdk/src/platform/transition/broadcast.rs`:
- [SUGGESTION] packages/rs-sdk/src/platform/transition/broadcast.rs:182-188: broadcast_and_wait verifies first-call proof at the PV10 floor without ratcheting from response metadata
  Confirmed at broadcast.rs:182-188: `Drive::verify_state_transition_was_executed_with_proof(..., sdk.version())` is called directly with `sdk.version()` and there is no `sdk.verify_response_metadata(metadata)` call earlier in this wait path to ratchet from `metadata.protocol_version` before proof verification. After this PR, an unpinned SDK seeds at `DEFAULT_INITIAL_PROTOCOL_VERSION` (PV10) instead of `latest()`. If a freshly built, unpinned SDK's very first operation is a `broadcast_and_wait` on a v3.1+ network, the transition-result proof is verified against PV10 while the network proof shape may correspond to a newer PV — this is the same first-call bootstrap window that sdk.rs:394-408 documents for query proof parsing, but the broadcast/wait path has no equivalent ratchet hook before `verify_state_transition_was_executed_with_proof`. Either ratchet from `metadata.protocol_version` before proof verification (mirroring `verify_response_metadata`), or extend the bootstrap-limitation doc to explicitly cover broadcast_and_wait and recommend `with_initial_version` for callers whose first op is a transition.

@github-actions

github-actions Bot commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

✅ DashSDKFFI.xcframework built for this PR.

SwiftPM (host the zip at a stable URL, then use):

.binaryTarget(
  name: "DashSDKFFI",
  url: "https://your.cdn.example/DashSDKFFI.xcframework.zip",
  checksum: "5cc2d670d5ef08f4edfc64917e24a06f2a7b412fe52e31134a272e07785a410b"
)

Xcode manual integration:

  • Download 'DashSDKFFI.xcframework' artifact from the run link above.
  • Drag it into your app target (Frameworks, Libraries & Embedded Content) and set Embed & Sign.
  • If using the Swift wrapper package, point its binaryTarget to the xcframework location or add the package and place the xcframework at the expected path.

Comments/docs only — zero behavior change. Tightens the verbose inline
and doc comments PR #3809 introduced, per the "fewer tokens, same signal"
convention.

- DEFAULT_INITIAL_PROTOCOL_VERSION rustdoc: trim the no_run doctest to a
  minimal compiling snippet and keep the load-bearing facts (floor < latest
  + bump note; the V0-floor Error::Config rejection of Count/group_by/having
  before any network call; the two ratchet options).
- State the v3.1+ query-surface caveat ONCE in the constant; reduce the
  document_count module header and count_capable_mock_sdk helper doc to
  one-line pointers.
- Collapse the new sdk.rs unit-test comments and the three test-file inline
  comments to single tight sentences, preserving the PV12-as-fixed-known-
  target and V2-type-decode rationale.

Resolves two thepastaclaw review nitpicks in the same pass:
- Delete the stale `new_mainnet(addrs)` commented alternative in the doctest
  (new_mainnet/new_testnet are now parameterless).
- Replace the overstated "any normal query works at the floor" with the
  precise "one floor-compatible ratcheting query (no v3.1+ surfaces)".

cargo fmt, cargo test --doc, and cargo build --tests for dash-sdk all pass.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@lklimek lklimek added the ready for final review Ready for the final review. If AI was involved in producing this PR, it has already had a reviewer. label Jun 9, 2026

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/rs-sdk/src/sdk.rs`:
- Around line 64-71: The comment hard-codes "PV10, V0 documents wire" in the SDK
docs; update the paragraph to refer to the default floor constant instead of the
literal PV10 so the text stays correct when the floor is bumped. Find the doc
block around the local encoder note (mentions Error::Config, Count /
SelectProjection::count_star, group_by, having) and replace the literal version
text with a reference to the SDK's default floor identifier (or a descriptive
phrase like "the default floor version") and keep the examples and references to
SdkBuilder::with_initial_version and ExtendedEpochInfo::fetch_current unchanged.
- Around line 57-69: Update the public docs to match the new bootstrap contract:
change the docstring for SdkBuilder::with_initial_version /
with_initial_version() (currently claiming fallback stays at
PlatformVersion::latest() until first response) to describe the new behavior
where the SDK starts below the network version and ratchets up via
maybe_update_protocol_version (i.e., an explicit initial floor may be set and
the SDK will converge to the network version), and also update the FFI docs in
packages/rs-sdk-ffi/src/types.rs (where platform_version = 0 is documented as
“auto-detect / latest”) to reflect the new semantics of platform_version /
initial_version for auto-detection and the bootstrap floor; ensure wording
mentions that v3.1+ surfaces are rejected locally until the ratchet advances and
that users can pin a higher initial version with with_initial_version().
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b6b40b1d-c587-4279-bf67-35d053dd9912

📥 Commits

Reviewing files that changed from the base of the PR and between 2f45522 and 5b28a78.

📒 Files selected for processing (4)
  • packages/rs-sdk/src/sdk.rs
  • packages/rs-sdk/tests/fetch/document_count.rs
  • packages/rs-sdk/tests/fetch/document_query_v0_v1.rs
  • packages/rs-sdk/tests/fetch/mock_fetch.rs
🚧 Files skipped from review as they are similar to previous changes (3)
  • packages/rs-sdk/tests/fetch/document_query_v0_v1.rs
  • packages/rs-sdk/tests/fetch/mock_fetch.rs
  • packages/rs-sdk/tests/fetch/document_count.rs

Comment thread packages/rs-sdk/src/sdk.rs Outdated
Comment thread packages/rs-sdk/src/sdk.rs Outdated
lklimek and others added 2 commits June 10, 2026 10:07
Reword DEFAULT_INITIAL_PROTOCOL_VERSION, with_initial_version, and the
rs-sdk-ffi platform_version field docs to describe the floor relative to
the default initial protocol version rather than baking in a literal PV
number that goes stale on the next floor bump. No behavior change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Comment thread packages/rs-sdk/tests/fetch/document_count.rs Outdated
Comment thread packages/rs-sdk/src/sdk.rs
lklimek and others added 2 commits June 10, 2026 16:42
…a with_version(); add ratchet-ordering guard

Auto-detect is the default now, so with_initial_version() is no longer
the public way to enable it. Demote it to a test-only seed
(#[cfg(test)] pub(crate)) and repoint user-facing docs at with_version()
(the opt-out that pins + disables auto-detect).

The count mock tests short-circuit the wire verifier, so auto-detect
never ratchets there — switch document_count.rs and mock_fetch.rs from
with_initial_version() to with_version() to pin the fixed PV12 wire
shape, and update the docs to say "pins" not "seeds".

Add a guard comment at the verify_response_metadata/ratchet call site
documenting the verify-before-ratchet security invariant, plus a unit
test asserting the ratchet rejects unknown/zero/non-upward versions
(the full tampered-signed-proof path is documented as out of unit-test
scope).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@thepastaclaw thepastaclaw left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Three prior findings (#1 suggestion, #2 and #3 nitpicks) are STILL VALID at head e9f9323 and carried forward unchanged. Prior nitpicks #4 (stale warm-up doctest) and #5 (overbroad floor-compatibility claim) are FIXED in the latest delta. The agents converged on all three carried-forward items; no new defects were introduced by the most recent push.

🟡 1 suggestion(s) | 💬 2 nitpick(s)

3 additional finding(s) omitted (not in diff).

🤖 Prompt for all review comments with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.

In `packages/rs-sdk/src/platform/transition/broadcast.rs`:
- [SUGGESTION] packages/rs-sdk/src/platform/transition/broadcast.rs:182-188: broadcast_and_wait verifies the first execution proof at the PV10 floor without ratcheting from response metadata
  At broadcast.rs:153-188, `metadata` is extracted (line 154) and used only for `block_info`. The proof is then passed to `Drive::verify_state_transition_was_executed_with_proof(..., sdk.version())` with no prior `sdk.verify_response_metadata(metadata)` call (or equivalent) to feed `metadata.protocol_version` into `maybe_update_protocol_version`. Because this PR makes the default unpinned seed `DEFAULT_INITIAL_PROTOCOL_VERSION` (PV10), a fresh auto-detect SDK whose first network interaction is a state-transition broadcast will verify the execution proof at the PV10 floor even when the response advertises a newer network protocol. If proof-verification dispatch differs between PV10 and the network's actual PV, the first broadcast-and-wait may spuriously fail before the SDK ever ratchets — the same bootstrap risk this PR's `parse_proof_with_metadata_and_proof` doc calls out, but applied to a path that has no ratchet step. Ratcheting from `metadata.protocol_version` before computing the verifier's version would close the gap and align this path with the documented bootstrap contract.

@lklimek

lklimek commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

@shumkov — re: your question on whether the auto-detected protocol version is signed and correctly verified ("I think it should be part of the StateID"): investigated it, and the answer is yes — it already is. Verdict: VERIFIED.

ResponseMetadata.protocol_version is cryptographically bound to the quorum's BLS threshold signature via the Tenderdash StateId.app_version field. A malicious or MITM DAPI node cannot forge it — a tampered value produces an invalid signature, which fails verification and aborts the response before the version ratchet ever runs.

Trust-model trace (file:line)

  • rs-drive-proof-verifier/src/verify.rs:39,56metadata.protocol_version is read into StateId.app_version; that StateId is hashed (:63-67), embedded in the CanonicalVote (:75), and the BLS sign-digest is verified against the quorum public key (:114, returns Err(InvalidSignature) on mismatch).
  • tenderdash-abci v1.5.1 signatures.rs:271-279StateId::sign_bytes prost-encodes the entire StateId (including app_version), then SHA-256s it. Primitive-level proof the field is genuinely inside the signed preimage. Version pinned by git tag + commit in Cargo.lock:7305-7307.
  • rs-drive-proof-verifier/src/proof.rs:316 → :318 — every FromProof impl calls verify_tenderdash_proof(proof, mtd, &root_hash, …)? and then returns mtd.clone(). The metadata that gets verified is the metadata that gets returned and ratcheted — no swap window.
  • rs-sdk/src/sdk.rs:421 before :423 — the ratchet (maybe_update_protocol_version) runs only after proof + signature verification ?-succeeds. The only production callers are fetch.rs:204 / fetch_many.rs:248; there is no unverified path that writes the version.

Threat model

Forging protocol_version changes the StateID hash → BLS verification fails → the response aborts before the ratchet. To succeed, an attacker needs the quorum threshold private key (i.e. consensus compromise), which is outside the DAPI-node boundary. Even hypothetically, the known-version + upward-only fetch_max guards cap the blast radius at DoS — never acceptance of forged data. Severity: INFO — a positive finding; the mechanism is sound by construction.

Conclusion

The gate is satisfied ("if verification will succeed, make with_initial_version() non-pub"), so it's done — pushed in bcf3b14:

  • with_initial_version() is no longer public (now a #[cfg(test)] test-only seed); auto-detect is the default and with_version() is the documented opt-out.
  • The count mock tests that needed a fixed wire version now pin via with_version(PV12) instead of the (now-hidden) override.
  • Added a guard comment at the verify-before-ratchet call site plus a regression test asserting the version ratchet rejects unknown / non-upward inputs, locking the "verify before ratchet" ordering into an enforced invariant rather than a convention.

🤖 Co-authored by Claudius the Magnificent AI Agent

@thepastaclaw thepastaclaw left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Incremental verification against bcf3b14. Prior finding #2 (with_initial_version doc) is FIXED — with_initial_version is now #[cfg(test)] pub(crate) with rewritten docs. Prior findings #1 (broadcast.rs verify-before-ratchet missing) and #3 (parse_proof failure-mode paragraph still references latest()) are STILL VALID at current head and are carried forward. No genuinely new in-scope defects found in the e9f9323..bcf3b14 delta: the shielded operations.rs doc nit belongs to merged PR #3843, and the Swift/memo FFI findings are pre-existing behavior unrelated to this PR's auto-detect floor change.

Carried-forward prior findings: #1 remains valid in packages/rs-sdk/src/platform/transition/broadcast.rs:153-188: the broadcast wait path verifies the first execution proof using sdk.version() but never consumes response metadata to ratchet the SDK afterward. #3 remains valid in packages/rs-sdk/src/sdk.rs:385-393: the parse_proof_with_metadata_and_proof bootstrap note still describes the old latest()-vs-older-network failure direction instead of the new PV10-floor-vs-newer-network direction. Prior finding #2 is fixed.

🟡 1 suggestion(s) | 💬 1 nitpick(s)

2 additional finding(s) omitted (not in diff).

🤖 Prompt for all review comments with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.

In `packages/rs-sdk/src/platform/transition/broadcast.rs`:
- [SUGGESTION] packages/rs-sdk/src/platform/transition/broadcast.rs:153-188: broadcast_and_wait never ratchets protocol_version from response metadata
  This is the same shape of bug this PR fixes for query proofs, but the broadcast wait path was missed. At lines 153-160 the response `metadata` is extracted and consumed only to build `block_info`. The execution proof is then verified at lines 182-188 with `sdk.version()` and the function returns — nowhere on this path does `sdk.verify_response_metadata(method_name, &metadata)` (or any other entry into `maybe_update_protocol_version`) get called. A grep of broadcast.rs confirms neither symbol appears.

  For an unpinned SDK whose first traffic is a broadcast — exactly the workflow the new auto-detect default is supposed to make upgrade-safe — `sdk.version()` is permanently pinned at `DEFAULT_INITIAL_PROTOCOL_VERSION` (PV10). On a PV12 network the very first execution proof is verified against the wrong wire format and there is no ratchet path to ever self-correct; mixed workflows hide this only because an earlier fetch ratcheted the SDK first.

  Mirror the sequencing now codified at sdk.rs:421-425: call `sdk.verify_response_metadata(method_name, &metadata)` *after* `Drive::verify_state_transition_was_executed_with_proof` returns `Ok`, so the trusted metadata is consumed only once the proof+signature gate has passed. This preserves the verify-before-ratchet security invariant while extending the PR's auto-detect contract to broadcast-only callers.

…instead of with_version pin

Pure-mock tests that need the v3.1+/PV12 wire (Count / group_by in
document_count.rs, V2 document types in mock_fetch.rs) used to hard-pin
PV12 with `SdkBuilder::with_version(pv)`. Since the rs-sdk default boots
at the PV10 floor and ratchets up only on proven response metadata, these
tests now converge the same way production does: a cheap proven
`ExtendedEpochInfo::fetch_current` ratchets the auto-detect SDK to latest
before the real request encodes.

To make the ratchet actually fire in mock mode, the mock's
`parse_proof_with_metadata` now reports `LATEST_VERSION` for expectation
hits instead of `ResponseMetadata::default()` (protocol_version = 0, which
the ratchet ignores) — honestly simulating a latest-version network.

A shared `bootstrap_mock_sdk_to_latest` helper in tests/fetch/common.rs
registers and consumes the proven epoch expectation, asserting the SDK
ends up at LATEST_VERSION. Both call sites reuse it.

No production change: it already ratchets on real proven metadata. Full
rs-sdk mock suite (main: 131 passed/4 ignored, lib: 139 passed) stays
green with no other test needing adjustment.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@lklimek lklimek requested a review from thepastaclaw June 12, 2026 13:59

@thepastaclaw thepastaclaw left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Incremental delta (bcf3b14..438ce90) is test-infrastructure only (mock metadata reports LATEST_VERSION + bootstrap helper replacing with_version(PV12) pins via proven fetch_current) and introduces no new defects. Both prior findings reconcile as STILL VALID at current head and are carried forward: (A1) broadcast_and_wait still verifies the execution proof with the cached sdk.version() and never invokes verify_response_metadata, so the auto-detect ratchet this PR establishes for the query path is absent on the write path; (A2) the parse_proof_with_metadata_and_proof failure-mode paragraph still describes the pre-PR latest()-based direction, inconsistent with the rewritten lead-in that names DEFAULT_INITIAL_PROTOCOL_VERSION as the floor.

🟡 1 suggestion(s) | 💬 1 nitpick(s)

2 additional finding(s) omitted (not in diff).

🤖 Prompt for all review comments with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.

In `packages/rs-sdk/src/platform/transition/broadcast.rs`:
- [SUGGESTION] packages/rs-sdk/src/platform/transition/broadcast.rs:153-209: broadcast_and_wait never ratchets protocol_version from response metadata (carried forward, STILL VALID)
  Verified at HEAD 438ce907 — unchanged. The response `metadata` is extracted at lines 154-157 and consumed solely to build `block_info` (lines 158-160). The execution proof is then verified at lines 182-188 with `sdk.version()` and `context_provider.as_contract_lookup_fn(sdk.version())`, with no call to `Sdk::verify_response_metadata` or `maybe_update_protocol_version` anywhere on this path (confirmed by reading lines 140-220 and absence of the symbol in this file).

  This is the same shape of bug the PR fixes for the query path in `Sdk::parse_proof_with_metadata_and_proof` (sdk.rs:422-429), but the broadcast wait path was not updated. With this PR's new default — unpinned SDKs seed at `DEFAULT_INITIAL_PROTOCOL_VERSION` (currently 10) and ratchet upward only via proven query metadata — a caller whose first network interaction is `broadcast_and_wait` against a newer-version network will (a) verify the execution proof with the floor version (risking proof misinterpretation when wire formats diverge across versions), and (b) never learn the network version from this response, so the next call still starts from the floor.

  Fix must respect the PR's documented security invariant (sdk.rs:422-425): proof+signature verification (`Drive::verify_state_transition_was_executed_with_proof`'s `Ok(_)` arm) must precede the ratchet so the metadata being consumed is already trusted. Concretely: after the existing match at 182-208 yields `Ok`, call `sdk.verify_response_metadata(request.method_name(), &metadata)` and propagate through `ExecutionResult`, mirroring the query path. Reordering the ratchet before the proof check would break the invariant.

@thepastaclaw thepastaclaw left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

PR #3809 makes auto-detect the SDK default by seeding unpinned SDKs from DEFAULT_INITIAL_PROTOCOL_VERSION and removing with_initial_version() from the public API. The latest delta 438ce90..892365d is overwhelmingly an upstream merge of v3.1-dev bringing in unrelated already-merged PRs (#3811, #3863, #3865, #3862, etc.); no new in-scope defects were introduced. Both prior findings remain STILL VALID at HEAD: broadcast_and_wait still does not ratchet protocol_version from verified response metadata, and the sdk.rs bootstrap-limitation paragraph still describes the pre-PR latest()-based failure direction.

🟡 1 suggestion(s) | 💬 1 nitpick(s)

2 additional finding(s) omitted (not in diff).

🤖 Prompt for all review comments with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.

In `packages/rs-sdk/src/platform/transition/broadcast.rs`:
- [SUGGESTION] packages/rs-sdk/src/platform/transition/broadcast.rs:153-209: broadcast_and_wait never ratchets protocol_version from verified response metadata
  Carried forward — STILL VALID at HEAD 892365d9. The wait path extracts `metadata` at lines 154-157 and consumes it solely to build `block_info` (158-160). The execution proof is then verified at lines 182-188 using `sdk.version()` and `context_provider.as_contract_lookup_fn(sdk.version())`, with no call to `Sdk::verify_response_metadata` or `maybe_update_protocol_version` anywhere on this path. The only callers of `verify_response_metadata` outside tests are on the query path (`parse_proof_with_metadata_and_proof` at sdk.rs:426).

  With this PR seeding unpinned SDKs at `DEFAULT_INITIAL_PROTOCOL_VERSION` (PV10) instead of `latest()`, the broadcast path becomes a real bootstrap hazard: a first `broadcast_and_wait` against a newer-version network will verify/interpret the execution proof with the floor version and never learn the network's actual version from response metadata. A broadcast-only client (e.g., a wallet whose first call is a submit) never converges, while query-side calls correctly ratchet. The right fix is symmetrical to the query path: after the proof verifies successfully (around line 211), call `sdk.verify_response_metadata(method_name, &metadata)` to perform the same upward-only `maybe_update_protocol_version` against the trusted, signature-bound metadata. The verify-before-ratchet ordering the PR's regression test enforces must be preserved.

@lklimek lklimek merged commit 5504415 into v3.1-dev Jun 12, 2026
3 checks passed
@lklimek lklimek deleted the fix/sdk-default-initial-protocol-version branch June 12, 2026 15:12
@lklimek

lklimek commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

Write path will be addresses in a separate follow-up PR.

shumkov added a commit that referenced this pull request Jun 16, 2026
…ectation

#3886 (2026-06-14) raised the per-network protocol floors
(min_protocol_version: Mainnet=11, others=12). The build-time clamp applies
them as max(DEFAULT_INITIAL_PROTOCOL_VERSION, min_protocol_version(network)),
which silently invalidated this test (added earlier by #3809): it still
asserted the unpinned mock SDK seeds to DEFAULT_INITIAL_PROTOCOL_VERSION (10),
but new_mock() defaults to Mainnet, so the seed is now 11. v3.1-dev has been
red on this test since #3886; the DashPay merge pulled the breakage in.

Assert the SDK's documented post-clamp contract instead
(max(DEFAULT_INITIAL_PROTOCOL_VERSION, PROTOCOL_VERSION_11) == 11). Verified
locally: left:11 right:10 ✖ before, 1 passed ✔ after.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
shumkov added a commit that referenced this pull request Jun 18, 2026
…ectation

#3886 (2026-06-14) raised the per-network protocol floors
(min_protocol_version: Mainnet=11, others=12). The build-time clamp applies
them as max(DEFAULT_INITIAL_PROTOCOL_VERSION, min_protocol_version(network)),
which silently invalidated this test (added earlier by #3809): it still
asserted the unpinned mock SDK seeds to DEFAULT_INITIAL_PROTOCOL_VERSION (10),
but new_mock() defaults to Mainnet, so the seed is now 11. v3.1-dev has been
red on this test since #3886; the DashPay merge pulled the breakage in.

Assert the SDK's documented post-clamp contract instead
(max(DEFAULT_INITIAL_PROTOCOL_VERSION, PROTOCOL_VERSION_11) == 11). Verified
locally: left:11 right:10 ✖ before, 1 passed ✔ after.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ready for final review Ready for the final review. If AI was involved in producing this PR, it has already had a reviewer.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants