Impact analysis: URL normalization long-term fix
Context
A temporary dependency constraint is being tracked separately to keep requests below 2.34.0 while preserving the security update from 2.33.0.
Related issue:
The reason for this constraint is that requests 2.34.0 changed URL path handling and no longer strips duplicate leading slashes in URI paths.
Official Requests release notes:
This can expose unsafe URL construction patterns where a configured base URL ending with / is combined with a path starting with /.
Example:
base_url = "https://example.com/"
endpoint = base_url + "/graphql"
Result:
https://example.com//graphql
With requests >=2.34.0, this duplicated path may be preserved when the request is sent, which can break servers or routers that treat //graphql differently from /graphql.
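To make the problem visible during the review, the path component of a built URL can be inspected directly. The following is a minimal sketch; the helper name `has_duplicate_slash` is hypothetical and not part of the codebase or of requests.

```python
from urllib.parse import urlsplit


def has_duplicate_slash(url: str) -> bool:
    """Return True if the path component of the URL contains '//'."""
    # urlsplit separates scheme and host, so the '//' after 'https:' is not counted.
    return "//" in urlsplit(url).path


base_url = "https://example.com/"
endpoint = base_url + "/graphql"

print(urlsplit(endpoint).path)        # //graphql
print(has_duplicate_slash(endpoint))  # True
```

A check like this could be used in tests or temporary logging to confirm which call sites actually produce duplicated slashes.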
Goal
Evaluate the impact of applying the long-term fix: normalize URL construction across the codebase so we no longer rely on requests to clean malformed paths.
This issue is only for analysis and estimation. The implementation can be handled in a follow-up issue if needed.
Scope
Review places where URLs are built manually, especially when a configurable base URL is combined with an application path.
Patterns to check include:
url = base_url + "/path"
url = f"{base_url}/path"
url = "{}/path".format(base_url)
The review should focus on API clients, connector clients, SDK helpers, authentication endpoints, GraphQL endpoints, REST endpoints, health-check endpoints, upload/download URLs, webhook URLs, and callback URLs.
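As a starting point for the review, the three patterns listed above could be located with a rough regex scan. This is only a sketch under assumptions: it presumes the base URL variable has "url" in its name, the names `RISKY_PATTERNS` and `find_risky_lines` are hypothetical, and the regexes will miss multi-line expressions and produce some false positives, so hits still need manual review.

```python
import re

# Heuristic patterns for the three constructions listed in the scope.
RISKY_PATTERNS = {
    # base_url + "/path"
    "concatenation": re.compile(r"\w*url\w*\s*\+\s*[fF]?[\"']/", re.IGNORECASE),
    # f"{base_url}/path"
    "f-string": re.compile(r"[fF][\"'][^\"']*\{\w*url\w*\}/", re.IGNORECASE),
    # "{}/path".format(base_url)
    "str.format": re.compile(r"[\"']\{\w*\}/[^\"']*[\"']\.format\("),
}


def find_risky_lines(source: str) -> list[tuple[int, str]]:
    """Return (line number, pattern name) pairs for lines matching a risky pattern."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for name, pattern in RISKY_PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, name))
    return hits


sample = (
    'url = base_url + "/path"\n'
    'url = f"{base_url}/path"\n'
    'url = "{}/path".format(base_url)\n'
    'url = build_url(base_url, "path")\n'
)
print(find_risky_lines(sample))
```

The output of such a scan can feed the "affected files or modules" and "examples of risky URL construction" items of the analysis summary.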
Expected analysis output
The analysis should produce a short summary with:
- affected files or modules
- examples of risky URL construction
- risk level for each area: low / medium / high
- recommended fix strategy
- estimated implementation effort
- tests that should be added or updated
Long-term fix direction
The preferred direction is to centralize URL construction instead of repeating raw string concatenation throughout the codebase.
Python standard URL utilities from urllib.parse should be evaluated for this, especially urljoin or a small wrapper around it.
References:
Example direction:
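from urllib.parse import urljoin

def build_url(base_url: str, path: str) -> str:
    # Ensure exactly one trailing slash so urljoin keeps the full base path.
    normalized_base = base_url.rstrip("/") + "/"
    # Drop leading slashes so the path is resolved relative to the base.
    normalized_path = path.lstrip("/")
    return urljoin(normalized_base, normalized_path)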
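The reason the example normalizes both the base and the path is that urljoin follows standard relative-reference resolution: a base without a trailing slash loses its last segment, and a path with a leading slash replaces the whole base path. The snippet below illustrates this with a hypothetical base URL that has a path prefix (`/api/v1`).

```python
from urllib.parse import urljoin

# Base without a trailing slash: the last segment ("v1") is dropped.
no_trailing_slash = urljoin("https://example.com/api/v1", "graphql")

# Path with a leading slash: the base path ("/api/v1/") is replaced entirely.
leading_slash = urljoin("https://example.com/api/v1/", "/graphql")

# Trailing slash on the base and no leading slash on the path: segments are appended.
normalized = urljoin("https://example.com/api/v1/", "graphql")

print(no_trailing_slash)  # https://example.com/api/graphql
print(leading_slash)      # https://example.com/graphql
print(normalized)         # https://example.com/api/v1/graphql
```

Base URLs with a path prefix are therefore worth calling out explicitly in the analysis, since a bare urljoin without normalization could silently change the target endpoint.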
The analysis should also check whether any path values can be user-controlled, because urljoin must be used carefully with untrusted paths.
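To illustrate the concern: if the path is itself an absolute URL, urljoin returns it unchanged, and a path containing `..` can climb out of the base prefix. One possible guard, sketched below, is to compare the joined result against the base after joining. The function name `build_url_checked` and the hosts used are hypothetical, and this is not a complete defense against every malformed input.

```python
from urllib.parse import urljoin, urlsplit


def build_url_checked(base_url: str, path: str) -> str:
    """Join base and path, rejecting results outside the base origin or prefix."""
    normalized_base = base_url.rstrip("/") + "/"
    joined = urljoin(normalized_base, path.lstrip("/"))

    base_parts = urlsplit(normalized_base)
    joined_parts = urlsplit(joined)

    # An absolute URL passed as "path" would replace the scheme and host.
    same_origin = (base_parts.scheme, base_parts.netloc) == (
        joined_parts.scheme,
        joined_parts.netloc,
    )
    # A path such as "../admin" would leave the configured base prefix.
    under_prefix = joined_parts.path.startswith(base_parts.path)

    if not (same_origin and under_prefix):
        raise ValueError(f"path escapes base URL: {path!r}")
    return joined


print(build_url_checked("https://example.com/api/", "/graphql"))
# https://example.com/api/graphql

for untrusted in ("https://evil.example/x", "../admin"):
    try:
        build_url_checked("https://example.com/api/", untrusted)
    except ValueError as exc:
        print("rejected:", exc)
```

Whether such a check is needed depends on the outcome of the analysis: call sites that only ever pass hard-coded paths may not require it.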
Acceptance criteria