fix: support Fixed/UTC±HH:MM:SS timezone format in DateTime and DateTime64#709
fix: support Fixed/UTC±HH:MM:SS timezone format in DateTime and DateTime64#709shubham1166 wants to merge 3 commits intoClickHouse:mainfrom
Conversation
…ime64 ClickHouse uses a Fixed offset timezone format (e.g. Fixed/UTC+05:30:00) in environments where the IANA timezone database is unavailable. pytz does not recognise this prefix, causing an InternalError when deserializing any DateTime or DateTime64 column carrying a Fixed timezone. Add parse_timezone() to tzutil.py that detects the Fixed/UTC±HH:MM:SS pattern, parses the sign/hours/minutes/seconds components, and returns a stdlib datetime.timezone built from a timedelta. All other timezone strings fall through unchanged to pytz.timezone() so existing behaviour is fully preserved. Use parse_timezone() in DateTime.__init__ and DateTime64.__init__ in place of the direct pytz.timezone() calls. Fixes ClickHouse#702
cbc3ee3 to
fea647b
Compare
There was a problem hiding this comment.
Pull request overview
This PR adds client-side support for ClickHouse’s Fixed/UTC±HH:MM:SS timezone strings (used when the IANA tzdb isn’t available on the server), so DateTime(...) and DateTime64(...) columns with these schema timezones can be instantiated/deserialized instead of crashing.
Changes:
- Add
parse_timezone()toclickhouse_connect/driver/tzutil.pyto detect and parseFixed/UTC...offsets, falling back topytz.timezone()otherwise. - Update
DateTimeandDateTime64type constructors to usetzutil.parse_timezone()instead of callingpytz.timezone()directly. - Add unit tests for
parse_timezone()and forDateTime/DateTime64construction withFixed/UTC...timezones, and document the fix inCHANGELOG.md.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
clickhouse_connect/driver/tzutil.py |
Introduces parse_timezone() and a regex to recognize ClickHouse Fixed/UTC... timezones. |
clickhouse_connect/datatypes/temporal.py |
Switches DateTime and DateTime64 timezone parsing to the new helper. |
tests/unit_tests/test_chtypes.py |
Adds new unit tests covering fixed-offset parsing and type construction. |
CHANGELOG.md |
Adds an unreleased bug-fix entry describing the new behavior. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| tz = parse_timezone('Fixed/UTC+05:30:30') | ||
| assert tz == timezone(timedelta(hours=5, minutes=30, seconds=30)) |
There was a problem hiding this comment.
This test constructs datetime.timezone(timedelta(..., seconds=30)), but datetime.timezone only supports offsets that are whole minutes (it raises ValueError for non-zero seconds). If ClickHouse always uses ...:SS with SS=00, change the test to reflect that; otherwise the implementation needs a tzinfo that supports second-level offsets.
| tz = parse_timezone('Fixed/UTC+05:30:30') | |
| assert tz == timezone(timedelta(hours=5, minutes=30, seconds=30)) | |
| tz = parse_timezone('Fixed/UTC+05:45:00') | |
| assert tz == timezone(timedelta(hours=5, minutes=45)) |
| import pytz | ||
|
|
||
| # Matches ClickHouse Fixed timezone strings like Fixed/UTC+05:30:00 or Fixed/UTC-03:00:00 | ||
| _FIXED_TZ_RE = re.compile(r'^Fixed/UTC([+-])(\d{1,2}):(\d{2}):(\d{2})$') |
There was a problem hiding this comment.
_FIXED_TZ_RE is too permissive: it allows invalid minutes/seconds (e.g. :99:) and hours beyond the valid UTC offset range, which would either silently normalize to the wrong offset (via timedelta) or raise ValueError from datetime.timezone. Tighten the regex (and/or add explicit range checks) so only valid HH:MM:SS offsets are accepted (minutes/seconds 00-59, hours within the supported range).
| _FIXED_TZ_RE = re.compile(r'^Fixed/UTC([+-])(\d{1,2}):(\d{2}):(\d{2})$') | |
| # Hours must be in the supported UTC offset range (0-23), and minutes/seconds must be 00-59. | |
| _FIXED_TZ_RE = re.compile(r'^Fixed/UTC([+-])([01]?\d|2[0-3]):([0-5]\d):([0-5]\d)$') |
| tz = parse_timezone('Fixed/UTC+05:30:00') | ||
| assert tz == timezone(timedelta(hours=5, minutes=30, seconds=0)) | ||
|
|
||
|
|
||
| def test_parse_timezone_fixed_negative(): | ||
| tz = parse_timezone('Fixed/UTC-03:30:00') | ||
| assert tz == timezone(timedelta(hours=-3, minutes=-30, seconds=0)) | ||
|
|
||
|
|
||
| def test_parse_timezone_fixed_zero(): | ||
| tz = parse_timezone('Fixed/UTC+00:00:00') | ||
| assert tz == timezone(timedelta(0)) | ||
|
|
||
|
|
||
| def test_parse_timezone_fixed_with_seconds(): | ||
| tz = parse_timezone('Fixed/UTC+05:30:30') | ||
| assert tz == timezone(timedelta(hours=5, minutes=30, seconds=30)) | ||
|
|
||
|
|
||
| def test_parse_timezone_standard_pytz(): | ||
| import pytz | ||
| tz = parse_timezone('America/Chicago') | ||
| assert tz == pytz.timezone('America/Chicago') |
There was a problem hiding this comment.
Repo formatting enforces double quotes (see pyproject.toml [tool.ruff.format] quote-style = "double"). These new tests use single-quoted strings; please switch to double quotes (and apply consistently across the added tests) to avoid Ruff format/lint churn.
| tz = parse_timezone('Fixed/UTC+05:30:00') | |
| assert tz == timezone(timedelta(hours=5, minutes=30, seconds=0)) | |
| def test_parse_timezone_fixed_negative(): | |
| tz = parse_timezone('Fixed/UTC-03:30:00') | |
| assert tz == timezone(timedelta(hours=-3, minutes=-30, seconds=0)) | |
| def test_parse_timezone_fixed_zero(): | |
| tz = parse_timezone('Fixed/UTC+00:00:00') | |
| assert tz == timezone(timedelta(0)) | |
| def test_parse_timezone_fixed_with_seconds(): | |
| tz = parse_timezone('Fixed/UTC+05:30:30') | |
| assert tz == timezone(timedelta(hours=5, minutes=30, seconds=30)) | |
| def test_parse_timezone_standard_pytz(): | |
| import pytz | |
| tz = parse_timezone('America/Chicago') | |
| assert tz == pytz.timezone('America/Chicago') | |
| tz = parse_timezone("Fixed/UTC+05:30:00") | |
| assert tz == timezone(timedelta(hours=5, minutes=30, seconds=0)) | |
| def test_parse_timezone_fixed_negative(): | |
| tz = parse_timezone("Fixed/UTC-03:30:00") | |
| assert tz == timezone(timedelta(hours=-3, minutes=-30, seconds=0)) | |
| def test_parse_timezone_fixed_zero(): | |
| tz = parse_timezone("Fixed/UTC+00:00:00") | |
| assert tz == timezone(timedelta(0)) | |
| def test_parse_timezone_fixed_with_seconds(): | |
| tz = parse_timezone("Fixed/UTC+05:30:30") | |
| assert tz == timezone(timedelta(hours=5, minutes=30, seconds=30)) | |
| def test_parse_timezone_standard_pytz(): | |
| import pytz | |
| tz = parse_timezone("America/Chicago") | |
| assert tz == pytz.timezone("America/Chicago") |
| if match: | ||
| sign, hours, minutes, seconds = match.groups() | ||
| offset = timedelta(hours=int(hours), minutes=int(minutes), seconds=int(seconds)) | ||
| if sign == '-': | ||
| offset = -offset | ||
| return timezone(offset) | ||
| return pytz.timezone(tz_str) |
There was a problem hiding this comment.
datetime.timezone() requires the offset to be a whole number of minutes. As written, parsing a Fixed/UTC±HH:MM:SS value with non-zero seconds will raise ValueError (and the corresponding unit test expects this to work). Either restrict accepted inputs to ...:SS where SS is 00, or use an alternative tzinfo implementation that can represent second-level offsets (and adjust tests accordingly).
…iption - Tighten _FIXED_TZ_RE to validate hours (0-23) and minutes/seconds (00-59) instead of accepting any digit sequence - Switch string literals in tests to double quotes to match repo's Ruff quote-style setting - Correct the CHANGELOG entry — the actual error is pytz.UnknownTimeZoneError raised during type construction, not ValueError
|
Hi @shubham1166 thanks for the work so far! There are a few things we still want to tighten up for correctness and completeness here. Other call sitesThere are four other call sites that also parse a clickhouse-sourced timezone string. Your PR description calls out "environments where the IANA timezone database is unavailable on the server." On those servers,
Unit testsThen, if you can please add at least 2 unit tests exercising the fixed tz path. i.e. test Short circuit Fixed offset tzs in
|
|
Thank you for having a look, i will have a look by tomorrow evening and update it, i hope that is fine on you end. Thanks again. |
|
Thanks @shubham1166! No problem. Take your time and let me know if you need any clarification or have questions or need help with anything. |
Summary
Closes #702
ClickHouse uses a
Fixedoffset timezone format — e.g.DateTime('Fixed/UTC+05:30:00')— in environments where the IANA timezone database is unavailable on the server. The Pythonpytzlibrary does not recognise this prefix, causing a hard crash when the client tries to deserialise anyDateTimeorDateTime64column that carries aFixedtimezone:Users cannot query those columns at all — it is a complete failure, not a degraded result.
Root Cause
In
temporal.py, bothDateTime.__init__andDateTime64.__init__pass the raw timezone string directly topytz.timezone():pytzdoes not know aboutFixed/UTC+05:30:00, raisesUnknownTimeZoneError, and the registry then re-raises it as anInternalError.Fix
Added a
parse_timezone(tz_str)helper toclickhouse_connect/driver/tzutil.py:Fixed/UTC±HH:MM:SSpattern with a compiled regexdatetime.timezonebuilt from atimedeltapytz.timezone()unchanged — zero behaviour change for existing codeReplaced the two direct
pytz.timezone()calls inDateTime.__init__andDateTime64.__init__withtzutil.parse_timezone().Total change: 15 lines across 2 source files.
Testing
Unit tests — all passing (
tests/unit_tests/test_chtypes.py)8 new tests were added covering the full surface of
parse_timezone()and the two affected type constructors:test_parse_timezone_fixed_positiveFixed/UTC+05:30:00→ correcttimedeltatest_parse_timezone_fixed_negativeFixed/UTC-03:30:00→ correct negativetimedeltatest_parse_timezone_fixed_zeroFixed/UTC+00:00:00→ zero offsettest_parse_timezone_fixed_with_secondsFixed/UTC+05:30:30→ seconds component parsedtest_parse_timezone_standard_pytzAmerica/Chicagostill resolves via pytztest_datetime_fixed_timezoneDateTime('Fixed/UTC+05:30:00')instantiates correctlytest_datetime_fixed_timezone_negativeDateTime('Fixed/UTC-03:00:00')instantiates correctlytest_datetime64_fixed_timezoneDateTime64(3, 'Fixed/UTC+05:30:00')instantiates correctlyFull unit suite result (excluding pre-existing errors unrelated to this change):
Integration tests — all passing against ClickHouse 26.3.9
All 20 existing timezone integration tests pass with no changes:
End-to-end proof against a live ClickHouse 26.3.9 instance
The exact query from the issue now works correctly:
Files Changed
clickhouse_connect/driver/tzutil.pyparse_timezone()helper +_FIXED_TZ_REregexclickhouse_connect/datatypes/temporal.pytzutil.parse_timezone()inDateTimeandDateTime64tests/unit_tests/test_chtypes.py