fix(safe): retry Safe API transport errors instead of crashing#275

Merged
spalen0 merged 1 commit into main from fix/safe-api-connection-retry on Jun 16, 2026

@spalen0 spalen0 commented Jun 13, 2026

Collaborator

What

get_safe_transactions() in protocols/safe/main.py retried on HTTP 429/5xx status codes but let transport-level failures (ConnectionResetError, read timeouts, DNS failures) propagate uncaught. The requests.get call also had no timeout.

A reset on the Safe Transaction Service (api.safe.global) therefore crashed the whole safe monitor with a bare requests.exceptions.ConnectionError, surfacing as:

[yearn] 🚨 main crashed: ConnectionError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))

The sibling get_safe_current_nonce() already wraps its call in try/except + timeout=10 and degrades gracefully; this brings get_safe_transactions() in line.

How

  • Wrap the requests.get in the existing max_retries backoff loop, catching requests.exceptions.RequestException (covers connection reset / timeout / DNS) and retrying with the same exponential backoff used for 429/5xx.
  • Add timeout=10, matching get_safe_current_nonce().
  • Once retries are exhausted, the function falls through to the existing return [] instead of aborting the run.
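
For reviewers who want the shape of the change without opening the diff, here is a minimal sketch of the pattern described above. It is not the code in protocols/safe/main.py: the function name fetch_with_retry, the base_delay parameter, the injectable get/sleep arguments, and the assumption that the response JSON carries a "results" list are all hypothetical and chosen for illustration only.

```python
import time
from typing import Any, Callable

import requests


def fetch_with_retry(
    url: str,
    max_retries: int = 3,
    base_delay: float = 1.0,
    get: Callable[..., Any] = requests.get,      # injectable for tests (assumption)
    sleep: Callable[[float], None] = time.sleep,  # injectable for tests (assumption)
) -> list:
    """Return the JSON "results" list, or [] once retries are exhausted."""
    for attempt in range(max_retries):
        is_last_attempt = attempt == max_retries - 1
        try:
            response = get(url, timeout=10)
        except requests.exceptions.RequestException:
            # Transport failure (connection reset, timeout, DNS):
            # back off exponentially, then try again.
            if not is_last_attempt:
                sleep(base_delay * 2**attempt)
            continue

        if response.status_code == 429 or response.status_code >= 500:
            # Same backoff schedule as for transport failures.
            if not is_last_attempt:
                sleep(base_delay * 2**attempt)
            continue

        if response.status_code != 200:
            return []  # other client errors are not retried in this sketch
        return response.json().get("results", [])

    # Retries exhausted: degrade gracefully instead of raising.
    return []
```

Passing get and sleep as arguments is only a convenience for exercising the loop without network access or real delays; the actual change may well patch requests.get and time.sleep in its tests instead.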

Why it matters

The safe task is long-running (often 150–230s, paginated across 6 networks), so a mid-run socket reset against api.safe.global is an expected transient failure. It should trigger a retry, not page the alerts channel and skip the rest of the multisig sweep.

Testing

  • uv run pytest tests/ → 476 passed, 4 skipped
  • uv run ruff check . / ruff format --check → clean
  • New tests in tests/test_safe_main.py:
    • retries on ConnectionError then succeeds (2 calls, 1 backoff sleep)
    • returns [] after exhausting max_retries

🤖 Generated with Claude Code

get_safe_transactions() retried on HTTP 429/5xx but let transport-level
failures (connection reset, read timeout, DNS) propagate uncaught,
crashing the safe monitor with a bare requests.ConnectionError that
surfaces as a "[yearn] main crashed" Telegram alert.

Wrap the request in the existing backoff/retry loop and add a 10s
timeout, mirroring get_safe_current_nonce()'s graceful handling. After
exhausting retries it returns [] (existing fall-through) instead of
aborting the whole run.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@spalen0 spalen0 force-pushed the fix/safe-api-connection-retry branch from 73fd434 to 5fbd73a June 16, 2026 19:30
@spalen0 spalen0 marked this pull request as ready for review June 16, 2026 19:31
@spalen0 spalen0 merged commit 96d4863 into main Jun 16, 2026
2 checks passed
@spalen0 spalen0 deleted the fix/safe-api-connection-retry branch June 16, 2026 19:31