Fix: Normalize when_matched and merge_filter expressions to the source dialect#4847
Conversation
```diff
- WHEN MATCHED AND __MERGE_SOURCE__._operation = 1 THEN DELETE
- WHEN MATCHED AND __MERGE_SOURCE__._operation <> 1 THEN UPDATE SET
-   __MERGE_TARGET__.purchase_order_id = 1
+ WHEN MATCHED AND __merge_source__._operation = 1 THEN DELETE
```
Our tests tend to use dialects where a normalized identifier is lowercased, so fixing this issue required a bunch of edits to the test SQL.
```diff
- return t.cast(exp.Whens, v.transform(d.replace_merge_table_aliases))
+ v = t.cast(exp.Whens, v.transform(d.replace_merge_table_aliases, dialect=dialect))
+ return normalize_identifiers(v, dialect=dialect)
```
If we do it here, then wouldn't this require a migration, since it changes how the expressions are stored in state? Don't we need a migration script, and to make this a breaking change?
Also note that in other places (e.g. `partitioned_by`, the time column, etc.) we quote identifiers in addition to normalizing them.
I've updated this to normalize, quote, and attach `.meta["dialect"]` to the returned expressions, so that when they get serialized to JSON for state they are serialized according to the correct dialect.
I re-used the `_get_field` private function from `sqlmesh.utils.pydantic`, which already did all of this, and added a public function to expose it.
I also added an empty migration.
```diff
  model = SqlModel.parse_raw(model.json())
- assert model.kind.when_matched.sql() == expected_when_matched
+ assert model.kind.when_matched.sql(dialect="hive") == expected_when_matched
```
Dialects started becoming very relevant for `.sql()` calls in many tests now that we quote identifiers.
```diff
  source.ds > @start_ds AND
  source._operation <> 1 AND
- target.start_date > dateadd(day, -7, current_date)
+ target.start_date > date_add(current_date, interval 7 day)
```
`dateadd(day, -7, current_date)` isn't actually a DuckDB function, so `day` was getting treated as a column named `"day"`, which is incorrect. I changed this to the correct syntax.
Fixes #4843
Currently, neither the `when_matched` nor `merge_filter` user-supplied expressions are normalized. This can lead to issues when queries are executed, because we quote by default. So an expression in a Model definition like:
Would get executed as:
which is incorrect for databases like Snowflake, since unquoted identifiers should be normalized to uppercase, like so:
This PR normalizes the user input for the `when_matched` and `merge_filter` properties of `IncrementalByUniqueKeyKind`, as well as normalizing the `MERGE_SOURCE_ALIAS` and `MERGE_TARGET_ALIAS` constants so they are consistent with the user query snippet.