S3 connection hangs when S3 Put request times out server side. #291

Description

@nesh170

Hi google friends! Apologies for the trouble!! We found this interesting issue when using tensorstore!! Thanks for taking a look

Summary

When an S3 PUT request times out server-side, the TensorStore client can hang for ~30 minutes before retrying. During this window, S3 access logs show zero retries: the request appears to be "in flight" on the client but dead on the server. The S3 access logs confirm that the server received the original request.

The ~30-minute gap matches the default OS-level TCP keepalive + retransmission timeout, which is the only thing that eventually forces the stalled socket closed.

Root Cause

Two layers of missing timeout configuration compound each other:

1. S3 driver issues requests with no timeouts (s3_key_value_store.cc:764)

auto future = owner->transport_->IssueRequest(
    request, internal_http::IssueRequestOptions(value_));
// IssueRequestOptions only sets payload; request_timeout and connect_timeout
// remain at their defaults of absl::ZeroDuration().
2. Zero duration → curl sets nothing (curl_transport.cc:198-205)

if (options.request_timeout > absl::ZeroDuration()) { // never true
    handle_.SetOption(CURLOPT_TIMEOUT_MS, ...);
}

CURLOPT_TIMEOUT_MS is never set, so libcurl has no deadline on the transfer.
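
To make the interaction of the two layers concrete, here is a minimal standalone sketch of the gating described above. It does not use TensorStore or libcurl: `SketchRequestOptions` and `PlannedCurlTimeouts` are hypothetical names invented for illustration, and `std::chrono` stands in for `absl::Duration`. It only models which libcurl timeout options would end up being set.

```cpp
#include <cassert>
#include <chrono>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the transport's per-request options. The field
// names mirror the issue text, but this is not TensorStore's real type.
struct SketchRequestOptions {
  std::chrono::milliseconds request_timeout{0};  // zero means "unset"
  std::chrono::milliseconds connect_timeout{0};  // zero means "unset"
};

// Returns the (option name, value) pairs that a transport with this gating
// would pass to libcurl. A zero duration produces no entry, so libcurl is
// left without a deadline for that phase.
std::vector<std::pair<std::string, long>> PlannedCurlTimeouts(
    const SketchRequestOptions& options) {
  std::vector<std::pair<std::string, long>> planned;
  if (options.request_timeout > std::chrono::milliseconds::zero()) {
    planned.emplace_back("CURLOPT_TIMEOUT_MS",
                         static_cast<long>(options.request_timeout.count()));
  }
  if (options.connect_timeout > std::chrono::milliseconds::zero()) {
    planned.emplace_back("CURLOPT_CONNECTTIMEOUT_MS",
                         static_cast<long>(options.connect_timeout.count()));
  }
  return planned;
}
```

With default-constructed options, `PlannedCurlTimeouts` returns an empty list, which corresponds to the situation in this issue. Setting `request_timeout` to 30 seconds would yield a single `CURLOPT_TIMEOUT_MS` entry of 30000.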

Workaround

Set environment variables before launching your process:

export TENSORSTORE_CURL_LOW_SPEED_TIME_SECONDS=30
export TENSORSTORE_CURL_LOW_SPEED_LIMIT_BYTES=1

This enables the low-speed watchdog: if fewer than 1 byte/sec is transferred for 30 seconds, libcurl aborts the request and the retry logic takes over.
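
As a rough illustration of why this helps, the sketch below models a low-speed watchdog over per-second byte counts. It is a simplified model, not libcurl's actual implementation (libcurl works from a measured transfer speed rather than discrete one-second buckets), and `SecondsUntilLowSpeedAbort` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified model of a low-speed watchdog (not libcurl's implementation):
// given the bytes transferred in each one-second interval, return the number
// of seconds elapsed when the transfer would be aborted, or -1 if it never is.
int SecondsUntilLowSpeedAbort(const std::vector<long>& bytes_per_second,
                              long limit_bytes_per_second,
                              int low_speed_time_seconds) {
  int slow_streak = 0;  // consecutive seconds below the limit
  for (std::size_t i = 0; i < bytes_per_second.size(); ++i) {
    if (bytes_per_second[i] < limit_bytes_per_second) {
      ++slow_streak;
      if (slow_streak >= low_speed_time_seconds) {
        return static_cast<int>(i) + 1;
      }
    } else {
      slow_streak = 0;  // any sufficiently fast second resets the window
    }
  }
  return -1;
}
```

Under this model, a transfer that moves data for 5 seconds and then stalls completely is aborted after 35 seconds with the settings above (limit 1, time 30), instead of waiting for the OS to close the socket. A transfer that keeps moving at least 1 byte every second is never aborted.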

Suggested fix

  • Expose the request timeout and connect timeout as configurable options on the S3 driver, so that requests no longer run with no deadline by default
