
fix:test_sql_length_boundary#35144

Merged
guanshengliang merged 1 commit into 3.3.6 from fix/3.3.6/6856960267
Apr 16, 2026

Conversation

@Pengrongkun
Contributor

Description

Issue(s)

  • Close/close/Fix/fix/Resolve/resolve: Issue Link

Checklist

Please check the items in the checklist if applicable.

  • Is the user manual updated?
  • Are the test cases passed and automated?
  • Is there no significant decrease in test coverage?

Copilot AI review requested due to automatic review settings April 15, 2026 03:33
Contributor

Copilot AI left a comment


Pull request overview

Updates the SQL-length boundary system test to be more deterministic by avoiding now+offset timestamp expressions and by setting a known maxSQLLength baseline before running boundary checks.

Changes:

  • Set maxSQLLength to 1MB during test case initialization.
  • Use a fixed base timestamp (ms) instead of now+{i}s when generating large INSERT statements.
  • Update expected row counts after the 10MB and 64MB INSERTs.


Comment on lines 311 to +319
tdLog.info("=============== Test with large maxSQLLength scenario")
tdSql.execute("alter local 'maxSQLLength' '1048576'")

tdSql.execute("drop database if exists db_maxsql_large")
tdSql.execute("create database db_maxsql_large")
tdSql.execute("use db_maxsql_large")

# Fixed base timestamp (ms) to avoid duplicate ts from now+offset
base_ts_ms = 1704067200000 # 2024-01-01 00:00:00.000

Copilot AI Apr 15, 2026


test_sql_length_boundary() is not invoked from run(), and the system-test runner executes only case.run() (it doesn't auto-discover test_* methods). Unless another harness calls this explicitly, these assertions will never run; consider calling self.test_sql_length_boundary() from run() (or moving this test to its own case that is actually scheduled) so the fix takes effect in CI.
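The fix the reviewer suggests can be sketched as follows. This is a minimal stand-in for the system-test case class, assuming the runner-invokes-only-`run()` convention the comment describes; the class shape and the `boundary_ran` flag are illustrative, not taken from the actual test file.

```python
class TDTestCase:
    """Minimal stand-in for a system-test case class (assumed shape)."""

    def __init__(self):
        self.boundary_ran = False

    def run(self):
        # The runner executes only run() and does not auto-discover
        # test_* methods, so the boundary test must be called explicitly.
        self.test_sql_length_boundary()

    def test_sql_length_boundary(self):
        # The real method issues the maxSQLLength boundary checks.
        self.boundary_ran = True
```

With this wiring, the runner's single `case.run()` call reaches the boundary assertions.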

Comment on lines 375 to 380
base_len_64mb = len(base_sql_64mb)
tdLog.info(f"64MB sql length: {base_len_64mb}")
num_values_64mb = (target_length_64mb - base_len_64mb) // 16
-values_64mb = [f"(now+{i}s,{i})" for i in range(num_values_64mb)]
+base_ts_64mb = base_ts_ms + num_values
+values_64mb = [f"({base_ts_64mb + i},{i})" for i in range(num_values_64mb)]
insert_sql_64mb = base_sql_64mb + ",".join(values_64mb)

Copilot AI Apr 15, 2026


Generating values_64mb builds a Python list with ~4.2M tuple strings before truncation, which can consume hundreds of MB of memory and significantly slow down or OOM CI. Consider constructing the SQL incrementally (e.g., via io.StringIO / chunked appends) and stop once the target length is reached, instead of materializing all values first.
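The incremental construction the reviewer proposes could look like this. A sketch only: the helper name and signature are illustrative, and it assumes the test inserts simple `(ts,int)` tuples. Memory stays proportional to the final SQL string rather than to an overestimated tuple count.

```python
import io

def build_insert_sql(base_sql: str, base_ts_ms: int, max_length: int) -> tuple[str, int]:
    """Append value tuples until the next one would exceed max_length.

    Avoids materializing millions of tuple strings before truncation by
    growing the statement in an io.StringIO buffer and stopping at the
    target length. Returns the SQL and the number of rows it contains.
    """
    buf = io.StringIO()
    buf.write(base_sql)
    length = len(base_sql)
    rows = 0
    i = 0
    while True:
        # Leading comma separates tuples after the first one.
        tuple_str = ("," if rows else "") + f"({base_ts_ms + i},{i})"
        if length + len(tuple_str) > max_length:
            break
        buf.write(tuple_str)
        length += len(tuple_str)
        rows += 1
        i += 1
    return buf.getvalue(), rows
```

A side benefit: the returned row count can feed the assertion directly, removing the need for hardcoded expected counts.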

Comment on lines 366 to +370
# Test with 10MB limit (should succeed)
tdSql.execute("alter local 'maxSQLLength' '10485760'")
tdSql.execute(insert_sql)
tdSql.query("select * from t2")
-tdSql.checkRows(509902)
+tdSql.checkRows(460732)

Copilot AI Apr 15, 2026


The expected row count is hardcoded, but it’s derived from the generated SQL string length and can easily change if the tuple formatting/target length changes. To make this boundary test less brittle, compute the expected inserted row count from the generated SQL (e.g., count value tuples) or query count(*) before/after the insert and assert on the delta.
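One way to derive the expected count from the generated SQL, as the reviewer suggests, is to count value tuples in the statement itself. This is a hypothetical helper, valid only because the test inserts plain `(ts,int)` tuples with no parentheses inside the data:

```python
def expected_total_rows(insert_sql: str, rows_before: int) -> int:
    """Expected row count after the INSERT: prior rows plus one per value
    tuple actually present in the statement. Counting opening parentheses
    in the VALUES clause survives formatting or target-length changes that
    would break a hardcoded constant like 460732."""
    values_part = insert_sql.split("values", 1)[1]
    return rows_before + values_part.count("(")
```

The assertion then becomes e.g. `tdSql.checkRows(expected_total_rows(insert_sql, prev_rows))` instead of a magic number.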

Comment on lines 400 to +403
tdSql.execute("alter local 'maxSQLLength' '67108864'")
tdSql.execute(insert_sql_64mb)
tdSql.query("select * from t2")
-tdSql.checkRows(3524291)
+tdSql.checkRows(3303225)

Copilot AI Apr 15, 2026


Same as above: this hardcoded total row count depends on the exact SQL string construction and previous insert size, making the test fragile across small formatting changes. Prefer asserting on computed expected counts (tuple count / count(*) delta) rather than fixed constants.

Contributor

@gemini-code-assist bot left a comment


Code Review

This pull request replaces relative timestamps with fixed millisecond timestamps in boundary tests to avoid duplicates and updates the expected row counts for large SQL inserts. Review feedback points out that the hardcoded divisor used to estimate the number of values is now inaccurate due to the longer timestamp strings. Additionally, the reviewer recommends calculating expected row counts dynamically rather than using hardcoded magic numbers to make the tests more robust.

tdLog.info(f"10MB sql length: {base_len}")
num_values = (target_length - base_len) // 16
-values = [f"(now+{i}s,{i})" for i in range(num_values)]
+values = [f"({base_ts_ms + i},{i})" for i in range(num_values)]
Contributor


medium

The num_values used here is calculated (on line 348) using a hardcoded divisor of 16. With the switch to a fixed 13-digit timestamp, the average length of each value tuple (timestamp, value) is now significantly higher (approx 20-23 bytes). This makes the initial estimate for num_values inaccurate, leading to the generation of a much larger SQL string than intended before any trimming logic is applied. Consider updating the divisor to a more realistic value or calculating it dynamically based on the sample row length.
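Calculating the divisor from a sample row, as suggested, might look like this. The helper name is illustrative; the idea is to measure a representative tuple (including its trailing comma) rather than keep the stale constant 16 from the `now+{i}s` era:

```python
def estimate_num_values(target_length: int, base_len: int, base_ts_ms: int) -> int:
    """Estimate how many value tuples fit in target_length bytes.

    With a 13-digit millisecond timestamp, a tuple like
    "(1704067200000,0)," is roughly 18-24 bytes, so measuring an actual
    sample keeps the pre-trim SQL close to the intended size. A
    worst-case index width is used so the estimate errs slightly low.
    """
    sample = f"({base_ts_ms + 9_999_999},{9_999_999}),"
    return (target_length - base_len) // len(sample)
```

Since the estimate rounds down, the generated statement stays at or under the target length without a large over-allocation.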

tdSql.execute(insert_sql)
tdSql.query("select * from t2")
-tdSql.checkRows(509902)
+tdSql.checkRows(460732)
Contributor


medium

The expected row count 460732 is hardcoded and highly sensitive to the specific string length of the timestamps and values. If the base_ts_ms or the value generation logic changes, this test will break again. It would be more robust to verify the row count dynamically (e.g., by checking against the actual number of elements in the values list after it has been adjusted to fit the target_length) rather than using a magic number.

tdSql.execute(insert_sql_64mb)
tdSql.query("select * from t2")
-tdSql.checkRows(3524291)
+tdSql.checkRows(3303225)
Contributor


medium

Similar to the 10MB case, the expected row count 3303225 is a hardcoded magic number that makes the test fragile. Consider calculating the expected count dynamically based on the number of rows actually included in the insert_sql_64mb.

@guanshengliang guanshengliang merged commit 8ddcf07 into 3.3.6 Apr 16, 2026
10 of 12 checks passed
@tomchon tomchon deleted the fix/3.3.6/6856960267 branch April 17, 2026 05:52
