Skip to content

fix(security): 2 improvements across 2 files#711

Closed
tomaioo wants to merge 4 commits intoClickHouse:mainfrom
tomaioo:fix/security/sql-injection-risk-in-sqlalchemy-inspect
Closed

fix(security): 2 improvements across 2 files#711
tomaioo wants to merge 4 commits intoClickHouse:mainfrom
tomaioo:fix/security/sql-injection-risk-in-sqlalchemy-inspect

Conversation

@tomaioo
Copy link
Copy Markdown

@tomaioo tomaioo commented Apr 17, 2026

Summary

fix(security): 2 improvements across 2 files

Problem

Severity: High | File: clickhouse_connect/cc_sqlalchemy/inspector.py:L15

The get_engine function builds SQL using f-string interpolation with schema and table_name directly inside quoted literals. If these values are attacker-controlled (or come from untrusted input through reflection APIs), a crafted value containing quotes can alter the query.

Solution

Use parameterized SQL instead of string interpolation, e.g. text("SELECT engine_full FROM system.tables WHERE database = :schema AND name = :name") with bound params. Also validate schema/table identifiers against a strict allowlist pattern.

Changes

  • clickhouse_connect/cc_sqlalchemy/inspector.py (modified)
  • clickhouse_connect/cc_sqlalchemy/ddl/custom.py (modified)

tomaioo added 2 commits April 16, 2026 17:07
- Security: SQL injection risk in SQLAlchemy inspector metadata query
- Security: DDL injection risk in CreateDatabase replicated engine arguments

Signed-off-by: tomaioo <203048277+tomaioo@users.noreply.github.com>
- Security: SQL injection risk in SQLAlchemy inspector metadata query
- Security: DDL injection risk in CreateDatabase replicated engine arguments

Signed-off-by: tomaioo <203048277+tomaioo@users.noreply.github.com>
@CLAassistant
Copy link
Copy Markdown

CLAassistant commented Apr 17, 2026

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ joe-clickhouse
❌ tomaioo
You have signed the CLA already but the status is still pending? Let us recheck it.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR aims to harden SQL construction in the SQLAlchemy integration by reducing SQL-injection risk when querying ClickHouse system metadata and by escaping user-provided values embedded into DDL strings.

Changes:

  • Replace f-string SQL interpolation with bound parameters in get_engine when querying system.tables.
  • Add string escaping for Replicated database engine arguments in CreateDatabase.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File Description
clickhouse_connect/cc_sqlalchemy/inspector.py Uses bind parameters for the system.tables lookup in get_engine.
clickhouse_connect/cc_sqlalchemy/ddl/custom.py Introduces a helper to escape string arguments embedded in Replicated database DDL construction.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +4 to +8
from clickhouse_connect.driver.binding import quote_identifier


def _escape_sql_string(value):
return str(value).replace("\\", "\\\\").replace("'", "\\'")
Copy link

Copilot AI Apr 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_escape_sql_string duplicates escaping logic that already exists in clickhouse_connect.driver.binding.escape_str/format_str and currently escapes fewer characters (e.g., it doesn't handle tabs/newlines/backticks that the driver escape logic treats as special). To keep escaping consistent across the codebase and avoid edge-case parsing issues, consider reusing the driver-provided escape/format helpers instead of maintaining a separate implementation here.

Suggested change
from clickhouse_connect.driver.binding import quote_identifier
def _escape_sql_string(value):
return str(value).replace("\\", "\\\\").replace("'", "\\'")
from clickhouse_connect.driver.binding import escape_str, quote_identifier
def _escape_sql_string(value):
return escape_str(str(value))

Copilot uses AI. Check for mistakes.
Comment on lines +15 to +16
result_set = connection.execute(text("SELECT engine_full FROM system.tables WHERE database = :schema and name = :table_name"),
{'schema': schema, 'table_name': table_name})
Copy link

Copilot AI Apr 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR description mentions adding strict allowlist validation for schema/table identifiers in addition to parameterizing the query, but this change only adds bind parameters. Either implement the identifier validation (if still required) or update the PR description to match the actual mitigation being shipped.

Copilot uses AI. Check for mistakes.
Comment on lines +40 to +44
stmt += (
f" ('{_escape_sql_string(zoo_path)}', "
f"'{_escape_sql_string(shard_name)}', "
f"'{_escape_sql_string(replica_name)}'"
)
Copy link

Copilot AI Apr 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change alters how Replicated database engine arguments are rendered/escaped, but current test coverage only exercises CreateDatabase for non-Replicated engines. Add a focused test that validates the generated SQL for engine='Replicated' (including escaping of quotes/backslashes in the args) to prevent regressions and ensure the security fix stays intact.

Copilot uses AI. Check for mistakes.
stmt += (
f" ('{_escape_sql_string(zoo_path)}', "
f"'{_escape_sql_string(shard_name)}', "
f"'{_escape_sql_string(replica_name)}'"
Copy link

Copilot AI Apr 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Replicated database DDL string being built here still appears to be missing the closing ")" after the replica_name argument. As written, the final statement would end with ..., '{replica}' and leave the opening parenthesis from Replicated ( unbalanced, which should produce invalid SQL. Add the closing parenthesis (and any expected spacing) when appending these arguments.

Suggested change
f"'{_escape_sql_string(replica_name)}'"
f"'{_escape_sql_string(replica_name)}')"

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The missing ) here exists on main, and is indeed a needed fix that we should do here since we're touching this line anyway. It also means no test exercises this path so we should get a test around this as well. Again, using the existing format_str

stmt += (
    f" ({format_str(zoo_path)}, "
    f"{format_str(shard_name)}, "
    f"{format_str(replica_name)})"
)

Copy link
Copy Markdown
Contributor

@joe-clickhouse joe-clickhouse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @tomaioo thanks for the fixes. Left a few comments in the code for your review.

stmt += (
f" ('{_escape_sql_string(zoo_path)}', "
f"'{_escape_sql_string(shard_name)}', "
f"'{_escape_sql_string(replica_name)}'"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The missing ) here exists on main, and is indeed a needed fix that we should do here since we're touching this line anyway. It also means no test exercises this path so we should get a test around this as well. Again, using the existing format_str

stmt += (
    f" ({format_str(zoo_path)}, "
    f"{format_str(shard_name)}, "
    f"{format_str(replica_name)})"
)

Comment on lines +7 to +8
def _escape_sql_string(value):
return str(value).replace("\\", "\\\\").replace("'", "\\'")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The codebase already has format_str in binding.py. So let's delete _escape_sql_string entirely and just reuse those to match existing convention.

…nd is indeed a needed

Signed-off-by: tomaioo <203048277+tomaioo@users.noreply.github.com>
@joe-clickhouse
Copy link
Copy Markdown
Contributor

To get this wrapped up, I'm closing this PR but the work is reimplemented in #728 thanks @tomaioo!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants