fix: support Fixed/UTC±HH:MM:SS timezone format in DateTime and DateTime64#709

Open
shubham1166 wants to merge 3 commits into ClickHouse:main from shubham1166:fix/fixed-utc-timezone-deserialization

Conversation

@shubham1166

Summary

Closes #702

ClickHouse uses a Fixed offset timezone format — e.g. DateTime('Fixed/UTC+05:30:00') — in environments where the IANA timezone database is unavailable on the server. The Python pytz library does not recognise this prefix, causing a hard crash when the client tries to deserialise any DateTime or DateTime64 column that carries a Fixed timezone:

clickhouse_connect.driver.exceptions.InternalError:
  Unrecognized ClickHouse type base: DateTime name: DateTime('Fixed/UTC+05:30:00')

Users cannot query those columns at all — it is a complete failure, not a degraded result.


Root Cause

In temporal.py, both DateTime.__init__ and DateTime64.__init__ pass the raw timezone string directly to pytz.timezone():

# Before
self.tzinfo = pytz.timezone(type_def.values[0][1:-1])

pytz does not know about Fixed/UTC+05:30:00, raises UnknownTimeZoneError, and the registry then re-raises it as an InternalError.


Fix

Added a parse_timezone(tz_str) helper to clickhouse_connect/driver/tzutil.py:

  • Detects the Fixed/UTC±HH:MM:SS pattern with a compiled regex
  • Parses the sign, hours, minutes, and seconds components
  • Returns a stdlib datetime.timezone built from a timedelta
  • All other timezone strings fall through to pytz.timezone() unchanged — zero behaviour change for existing code

Replaced the two direct pytz.timezone() calls in DateTime.__init__ and DateTime64.__init__ with tzutil.parse_timezone().
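
As a rough illustration of the helper described above, here is a minimal sketch. It is not the code from this PR: the name `parse_timezone_sketch` and the `fallback` parameter are hypothetical and exist only so the example runs without pytz installed. The regex is the tightened pattern proposed in the review discussion.

```python
import re
from datetime import timedelta, timezone

# Tightened pattern from the review discussion: hours 0-23, minutes/seconds 00-59.
_FIXED_TZ_RE = re.compile(r"^Fixed/UTC([+-])([01]?\d|2[0-3]):([0-5]\d):([0-5]\d)$")


def parse_timezone_sketch(tz_str, fallback=None):
    """Return a tzinfo for a ClickHouse timezone string (illustrative only)."""
    match = _FIXED_TZ_RE.match(tz_str)
    if match:
        sign, hours, minutes, seconds = match.groups()
        offset = timedelta(hours=int(hours), minutes=int(minutes), seconds=int(seconds))
        if sign == "-":
            offset = -offset
        return timezone(offset)
    if fallback is None:
        # Assumption: pytz is available, as it is wherever clickhouse-connect runs.
        import pytz

        fallback = pytz.timezone
    # Every non-Fixed string is handed to the fallback unchanged.
    return fallback(tz_str)


print(parse_timezone_sketch("Fixed/UTC+05:30:00").utcoffset(None))
print(parse_timezone_sketch("Fixed/UTC-03:30:00").utcoffset(None))
```

Importing pytz lazily keeps the Fixed path free of third-party dependencies in this sketch; the actual helper may well import it at module level.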

Total change: 15 lines across 2 source files.


Testing

Unit tests — all passing (tests/unit_tests/test_chtypes.py)

8 new tests were added covering the full surface of parse_timezone() and the two affected type constructors:

| Test | Verifies |
| --- | --- |
| test_parse_timezone_fixed_positive | Fixed/UTC+05:30:00 → correct timedelta |
| test_parse_timezone_fixed_negative | Fixed/UTC-03:30:00 → correct negative timedelta |
| test_parse_timezone_fixed_zero | Fixed/UTC+00:00:00 → zero offset |
| test_parse_timezone_fixed_with_seconds | Fixed/UTC+05:30:30 → seconds component parsed |
| test_parse_timezone_standard_pytz | America/Chicago still resolves via pytz |
| test_datetime_fixed_timezone | DateTime('Fixed/UTC+05:30:00') instantiates correctly |
| test_datetime_fixed_timezone_negative | DateTime('Fixed/UTC-03:00:00') instantiates correctly |
| test_datetime64_fixed_timezone | DateTime64(3, 'Fixed/UTC+05:30:00') instantiates correctly |

Full unit suite result (excluding pre-existing errors unrelated to this change):

263 passed, 5 warnings  ✅  (+8 new tests vs 255 on main)

Integration tests — all passing against ClickHouse 26.3.9

All 20 existing timezone integration tests pass with no changes:

tests/integration_tests/test_timezones.py  20 passed  ✅

End-to-end proof against a live ClickHouse 26.3.9 instance

The exact query from the issue now works correctly:

# Before fix → InternalError: Unrecognized ClickHouse type base: DateTime
# After fix  → works correctly

client.query("SELECT toDateTime('2024-01-15 10:30:00', 'Fixed/UTC+05:30:00') AS dt")
# Result: 2024-01-15 10:30:00+05:30  ✅

client.query("SELECT toDateTime('2024-01-15 10:30:00', 'Fixed/UTC-03:00:00') AS dt")
# Result: 2024-01-15 10:30:00-03:00  ✅

client.query("SELECT toDateTime64('2024-01-15 10:30:00.123', 3, 'Fixed/UTC+05:30:00') AS dt64")
# Result: 2024-01-15 10:30:00.123000+05:30  ✅

Files Changed

| File | Change |
| --- | --- |
| clickhouse_connect/driver/tzutil.py | Added parse_timezone() helper + _FIXED_TZ_RE regex |
| clickhouse_connect/datatypes/temporal.py | Use tzutil.parse_timezone() in DateTime and DateTime64 |
| tests/unit_tests/test_chtypes.py | 8 new unit tests |

…ime64

ClickHouse uses a Fixed offset timezone format (e.g. Fixed/UTC+05:30:00)
in environments where the IANA timezone database is unavailable. pytz does
not recognise this prefix, causing an InternalError when deserializing any
DateTime or DateTime64 column carrying a Fixed timezone.

Add parse_timezone() to tzutil.py that detects the Fixed/UTC±HH:MM:SS
pattern, parses the sign/hours/minutes/seconds components, and returns a
stdlib datetime.timezone built from a timedelta. All other timezone strings
fall through unchanged to pytz.timezone() so existing behaviour is fully
preserved.

Use parse_timezone() in DateTime.__init__ and DateTime64.__init__ in place
of the direct pytz.timezone() calls.

Fixes ClickHouse#702
@shubham1166 force-pushed the fix/fixed-utc-timezone-deserialization branch from cbc3ee3 to fea647b on April 15, 2026 20:48
@mshustov requested a review from Copilot on April 16, 2026 15:43
Contributor

Copilot AI left a comment

Pull request overview

This PR adds client-side support for ClickHouse’s Fixed/UTC±HH:MM:SS timezone strings (used when the IANA tzdb isn’t available on the server), so DateTime(...) and DateTime64(...) columns with these schema timezones can be instantiated/deserialized instead of crashing.

Changes:

  • Add parse_timezone() to clickhouse_connect/driver/tzutil.py to detect and parse Fixed/UTC... offsets, falling back to pytz.timezone() otherwise.
  • Update DateTime and DateTime64 type constructors to use tzutil.parse_timezone() instead of calling pytz.timezone() directly.
  • Add unit tests for parse_timezone() and for DateTime/DateTime64 construction with Fixed/UTC... timezones, and document the fix in CHANGELOG.md.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| clickhouse_connect/driver/tzutil.py | Introduces parse_timezone() and a regex to recognize ClickHouse Fixed/UTC... timezones. |
| clickhouse_connect/datatypes/temporal.py | Switches DateTime and DateTime64 timezone parsing to the new helper. |
| tests/unit_tests/test_chtypes.py | Adds new unit tests covering fixed-offset parsing and type construction. |
| CHANGELOG.md | Adds an unreleased bug-fix entry describing the new behavior. |

Comment thread tests/unit_tests/test_chtypes.py Outdated
Comment on lines +64 to +65
    tz = parse_timezone('Fixed/UTC+05:30:30')
    assert tz == timezone(timedelta(hours=5, minutes=30, seconds=30))

Copilot AI Apr 16, 2026

This test constructs datetime.timezone(timedelta(..., seconds=30)), but datetime.timezone only supports offsets that are whole minutes (it raises ValueError for non-zero seconds). If ClickHouse always uses ...:SS with SS=00, change the test to reflect that; otherwise the implementation needs a tzinfo that supports second-level offsets.

Suggested change
-    tz = parse_timezone('Fixed/UTC+05:30:30')
-    assert tz == timezone(timedelta(hours=5, minutes=30, seconds=30))
+    tz = parse_timezone('Fixed/UTC+05:45:00')
+    assert tz == timezone(timedelta(hours=5, minutes=45))
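
Whether `datetime.timezone` accepts a non-zero seconds component depends on the Python version, so it may be worth checking on the interpreters the project supports. According to the standard library documentation, since Python 3.7 the offset no longer has to be a whole number of minutes; it only has to lie strictly between -24 and +24 hours. The sketch below assumes Python 3.7 or later.

```python
from datetime import timedelta, timezone

# On Python 3.7+, a seconds component in the offset is accepted.
offset = timedelta(hours=5, minutes=30, seconds=30)
tz = timezone(offset)
assert tz.utcoffset(None) == offset

# Offsets of 24 hours or more (in either direction) are still rejected.
try:
    timezone(timedelta(hours=24))
    out_of_range_rejected = False
except ValueError:
    out_of_range_rejected = True
assert out_of_range_rejected
```

If every supported interpreter behaves this way, the seconds test can stay as written; restricting the accepted inputs to `SS=00` remains an option either way.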

Comment thread clickhouse_connect/driver/tzutil.py Outdated
import pytz

# Matches ClickHouse Fixed timezone strings like Fixed/UTC+05:30:00 or Fixed/UTC-03:00:00
_FIXED_TZ_RE = re.compile(r'^Fixed/UTC([+-])(\d{1,2}):(\d{2}):(\d{2})$')

Copilot AI Apr 16, 2026

_FIXED_TZ_RE is too permissive: it allows invalid minutes/seconds (e.g. :99:) and hours beyond the valid UTC offset range, which would either silently normalize to the wrong offset (via timedelta) or raise ValueError from datetime.timezone. Tighten the regex (and/or add explicit range checks) so only valid HH:MM:SS offsets are accepted (minutes/seconds 00-59, hours within the supported range).

Suggested change
-_FIXED_TZ_RE = re.compile(r'^Fixed/UTC([+-])(\d{1,2}):(\d{2}):(\d{2})$')
+# Hours must be in the supported UTC offset range (0-23), and minutes/seconds must be 00-59.
+_FIXED_TZ_RE = re.compile(r'^Fixed/UTC([+-])([01]?\d|2[0-3]):([0-5]\d):([0-5]\d)$')
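
To make the difference between the two patterns concrete, the following sketch runs both against a few sample strings. The names `OLD_RE`, `NEW_RE`, `samples`, and `results` are illustrative only; the patterns themselves are copied from the suggestion above.

```python
import re
from datetime import timedelta

OLD_RE = re.compile(r"^Fixed/UTC([+-])(\d{1,2}):(\d{2}):(\d{2})$")
NEW_RE = re.compile(r"^Fixed/UTC([+-])([01]?\d|2[0-3]):([0-5]\d):([0-5]\d)$")

samples = [
    "Fixed/UTC+05:30:00",  # valid
    "Fixed/UTC+5:30:00",   # valid, single-digit hour
    "Fixed/UTC+05:99:00",  # invalid minutes
    "Fixed/UTC+25:00:00",  # hours out of range
]

# Map each sample to (accepted by old pattern, accepted by new pattern).
results = {s: (bool(OLD_RE.match(s)), bool(NEW_RE.match(s))) for s in samples}
for s, (old_ok, new_ok) in results.items():
    print(f"{s}: old={old_ok} new={new_ok}")

# Why the permissive match matters: timedelta silently normalizes 99 minutes.
assert timedelta(hours=5, minutes=99) == timedelta(hours=6, minutes=39)
```

The old pattern accepts all four strings, while the tightened one rejects the last two before they can reach `timedelta` or `datetime.timezone`.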

Comment thread tests/unit_tests/test_chtypes.py Outdated
Comment on lines +49 to +71
    tz = parse_timezone('Fixed/UTC+05:30:00')
    assert tz == timezone(timedelta(hours=5, minutes=30, seconds=0))


def test_parse_timezone_fixed_negative():
    tz = parse_timezone('Fixed/UTC-03:30:00')
    assert tz == timezone(timedelta(hours=-3, minutes=-30, seconds=0))


def test_parse_timezone_fixed_zero():
    tz = parse_timezone('Fixed/UTC+00:00:00')
    assert tz == timezone(timedelta(0))


def test_parse_timezone_fixed_with_seconds():
    tz = parse_timezone('Fixed/UTC+05:30:30')
    assert tz == timezone(timedelta(hours=5, minutes=30, seconds=30))


def test_parse_timezone_standard_pytz():
    import pytz
    tz = parse_timezone('America/Chicago')
    assert tz == pytz.timezone('America/Chicago')

Copilot AI Apr 16, 2026

Repo formatting enforces double quotes (see pyproject.toml [tool.ruff.format] quote-style = "double"). These new tests use single-quoted strings; please switch to double quotes (and apply consistently across the added tests) to avoid Ruff format/lint churn.

Suggested change
-    tz = parse_timezone('Fixed/UTC+05:30:00')
-    assert tz == timezone(timedelta(hours=5, minutes=30, seconds=0))
-def test_parse_timezone_fixed_negative():
-    tz = parse_timezone('Fixed/UTC-03:30:00')
-    assert tz == timezone(timedelta(hours=-3, minutes=-30, seconds=0))
-def test_parse_timezone_fixed_zero():
-    tz = parse_timezone('Fixed/UTC+00:00:00')
-    assert tz == timezone(timedelta(0))
-def test_parse_timezone_fixed_with_seconds():
-    tz = parse_timezone('Fixed/UTC+05:30:30')
-    assert tz == timezone(timedelta(hours=5, minutes=30, seconds=30))
-def test_parse_timezone_standard_pytz():
-    import pytz
-    tz = parse_timezone('America/Chicago')
-    assert tz == pytz.timezone('America/Chicago')
+    tz = parse_timezone("Fixed/UTC+05:30:00")
+    assert tz == timezone(timedelta(hours=5, minutes=30, seconds=0))
+def test_parse_timezone_fixed_negative():
+    tz = parse_timezone("Fixed/UTC-03:30:00")
+    assert tz == timezone(timedelta(hours=-3, minutes=-30, seconds=0))
+def test_parse_timezone_fixed_zero():
+    tz = parse_timezone("Fixed/UTC+00:00:00")
+    assert tz == timezone(timedelta(0))
+def test_parse_timezone_fixed_with_seconds():
+    tz = parse_timezone("Fixed/UTC+05:30:30")
+    assert tz == timezone(timedelta(hours=5, minutes=30, seconds=30))
+def test_parse_timezone_standard_pytz():
+    import pytz
+    tz = parse_timezone("America/Chicago")
+    assert tz == pytz.timezone("America/Chicago")

Comment thread CHANGELOG.md Outdated
Comment on lines +69 to +75
    if match:
        sign, hours, minutes, seconds = match.groups()
        offset = timedelta(hours=int(hours), minutes=int(minutes), seconds=int(seconds))
        if sign == '-':
            offset = -offset
        return timezone(offset)
    return pytz.timezone(tz_str)

Copilot AI Apr 16, 2026

datetime.timezone() requires the offset to be a whole number of minutes. As written, parsing a Fixed/UTC±HH:MM:SS value with non-zero seconds will raise ValueError (and the corresponding unit test expects this to work). Either restrict accepted inputs to ...:SS where SS is 00, or use an alternative tzinfo implementation that can represent second-level offsets (and adjust tests accordingly).

…iption

- Tighten _FIXED_TZ_RE to validate hours (0-23) and minutes/seconds (00-59)
  instead of accepting any digit sequence
- Switch string literals in tests to double quotes to match repo's Ruff
  quote-style setting
- Correct the CHANGELOG entry — the actual error is pytz.UnknownTimeZoneError
  raised during type construction, not ValueError
@joe-clickhouse
Contributor

Hi @shubham1166 thanks for the work so far! There are a few things we still want to tighten up for correctness and completeness here.

Other call sites

There are four other call sites that also parse a ClickHouse-sourced timezone string. Your PR description calls out "environments where the IANA timezone database is unavailable on the server." On those servers, SELECT timezone() will return Fixed/UTC+HH:MM:SS, and the X-ClickHouse-Timezone header carries the same format on every response. Consequently, the following sites hit the same problem:

  1. clickhouse_connect/driver/client.py:203 in _init_common_settings. This is what parses the SELECT timezone() during client init. You'll want to replace pytz.timezone(server_tz) with tzutil.parse_timezone(server_tz).
  2. clickhouse_connect/driver/asyncclient.py:314 you'll need to do the same thing here for the async client.
  3. clickhouse_connect/driver/client.py:299 in _check_tz_change you'll also want to replace pytz.timezone(new_tz) with tzutil.parse_timezone(new_tz).
  4. clickhouse_connect/driver/binding.py:72 in _extract_tz_from_type you'll also need to replace pytz.timezone(v[1:-1]) with tzutil.parse_timezone(v[1:-1])

Unit tests

Then, if you can, please add at least 2 unit tests exercising the fixed tz path: test _extract_tz_from_type with a string like "DateTime('Fixed/UTC+05:30:00')", and test _check_tz_change with a string like "Fixed/UTC+05:30:00", asserting that we end up with a non-UTC fixed-offset tzinfo.

Short circuit Fixed offset tzs in normalize_timezone

Please add an isinstance(tz, datetime.timezone) short-circuit at the top of tzutil.normalize_timezone() returning (tz, True).

Long explanation: Once call site 1 (clickhouse_connect/driver/client.py:203) is handled and _init_common_settings correctly parses the Fixed offset into a stdlib datetime.timezone, it then flows into tzutil.normalize_timezone(). That function checks tz.tzname(None) in pytz.common_timezones to decide _dst_safe, and a stdlib datetime.timezone reports something like "UTC+05:30", which isn't in that set. Consequently, _dst_safe gets set to False, and under the default tz_source="auto" this flips _apply_server_tz=False. So the server's Fixed tz gets correctly parsed but then silently not applied. However, Fixed offsets are by definition DST-safe and self-describing, so we need to tell normalize_timezone() to trust Fixed offsets.
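
The short-circuit could look roughly like the sketch below. This is an assumption-laden illustration, not the library's function: `normalize_timezone_sketch` is a hypothetical name, and `KNOWN_ZONE_NAMES` is a small stand-in for the `pytz.common_timezones` membership check so that the example runs with the standard library alone.

```python
from datetime import timedelta, timezone

# Hypothetical stand-in for the pytz.common_timezones membership check.
KNOWN_ZONE_NAMES = {"UTC", "Europe/London", "America/Chicago"}


def normalize_timezone_sketch(tz):
    """Return (tz, dst_safe); illustrative only, not the library's function."""
    if isinstance(tz, timezone):
        # A fixed offset never changes, so it can be treated as DST-safe.
        return tz, True
    return tz, tz.tzname(None) in KNOWN_ZONE_NAMES


fixed = timezone(timedelta(hours=5, minutes=30))
print(fixed.tzname(None))  # an offset-style name, not an IANA zone name
print(normalize_timezone_sketch(fixed))
```

Without the `isinstance` branch, the name lookup would fail for `fixed` and the function would report it as not DST-safe, which is the silent fallback described above.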

Don't modify

Leave the remaining pytz.timezone() calls in query.py:124,132 and cc_sqlalchemy/datatypes/sqltypes.py:268,286 alone. Those take user-supplied strings where raising on unknown is the correct behavior.

@shubham1166
Author

Thank you for having a look. I will go through it by tomorrow evening and update the PR; I hope that is fine on your end. Thanks again.

@joe-clickhouse
Contributor

Thanks @shubham1166! No problem. Take your time and let me know if you need any clarification or have questions or need help with anything.

Development

Successfully merging this pull request may close these issues.

Fails to deserialize DateTime('Fixed/UTC±HH:MM:SS') results

3 participants