Commit ff00e49

refactors; improve error handling and messages; update docs
1 parent 87923db commit ff00e49

9 files changed

Lines changed: 149 additions & 35 deletions

docs/guides/model_selection.md

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
 
 This guide describes how to select specific models to include in a SQLMesh plan, which can be useful when modifying a subset of the models in a SQLMesh project.
 
-Note: the selector syntax described below is also used for the SQLMesh `plan` [`--allow-destructive-model` selector](../concepts/plans.md#destructive-changes).
+Note: the selector syntax described below is also used for the SQLMesh `plan` [`--allow-destructive-model` selector](../concepts/plans.md#destructive-changes) and for the `table_diff` command to [diff a selection of models](./tablediff.md#diffing-multiple-models-across-environments).
 
 ## Background

docs/guides/tablediff.md

Lines changed: 71 additions & 0 deletions
@@ -122,6 +122,77 @@ Under the hood, SQLMesh stores temporary data in the database to perform the com
 The default schema for these temporary tables is `sqlmesh_temp` but can be changed with the `--temp-schema` option.
 The schema can be specified as a `CATALOG.SCHEMA` or `SCHEMA`.
 
+
+## Diffing multiple models across environments
+
+SQLMesh allows you to compare multiple models across environments at once using model selection expressions. This is useful when you want to validate changes across a set of related models or the entire project.
+
+To diff multiple models, use the `--select-model` (or `-m` for short) option with the table diff command:
+
+```bash
+sqlmesh table_diff prod:dev --select-model "sqlmesh_example.*"
+```
+
+When diffing multiple models, SQLMesh will:
+
+1. Show which models exist only in the source environment
+2. Show which models exist only in the target environment
+3. Show which models exist in both environments but have no differences
+4. Compare the models that have differences and display them
+
+The `--select-model` option supports a powerful selection syntax that lets you choose models using patterns, tags, dependencies and git status. For complete details, see the [model selection guide](./model_selection.md).
+
+Here are some common patterns you can use:
+
+### Wildcard Patterns
+- `*` - All models in the project
+- `sqlmesh_example.*` - All models in the sqlmesh_example schema
+
+### Upstream/Downstream Selection
+- `+model_name` - Select the model and its upstream dependencies
+- `model_name+` - Select the model and its downstream dependencies
+- `+model_name+` - Select the model and both its upstream and downstream dependencies
+
+### Tag-based Selection
+- `tag:finance` - All models with the "finance" tag
+- `tag:reporting*` - All models with tags starting with "reporting"
+
+### Git-based Selection
+- `git:feature` - Select models whose files have changed compared to main branch, including:
+    - Untracked files (new files not in git)
+    - Uncommitted changes in working directory
+    - Committed changes different from main branch
+- `+git:feature` - Select changed models and their upstream dependencies
+- `git:feature+` - Select changed models and their downstream dependencies
+
+> Note: Git-based selection excludes deleted files and respects your `.gitignore` settings.
+
+### Logical Operators
+You can combine multiple selectors using logical operators:
+- `&` (AND): Both conditions must be true
+- `|` (OR): Either condition must be true
+- `^` (NOT): Negates a condition
+
+#### Complex Selection Examples
+- `(tag:finance & ^tag:deprecated)` - Models with finance tag that don't have deprecated tag
+- `(+model_a | model_b+)` - Model A and its upstream deps OR model B and its downstream deps
+- `(tag:finance & git:main)` - Changed models that also have the finance tag
+- `^(tag:test) & metrics.*` - Models in metrics schema that don't have the test tag
+
+### Multiple selectors
+
+You can also combine multiple selectors in a single command:
+
+```bash
+sqlmesh table_diff prod:dev -m "tag:finance" -m "metrics.*_daily"
+```
+
+When multiple selectors are provided, they are combined with OR logic, meaning a model matching any of the selectors will be included.
+
+All the standard table diff options like `--show-sample` work with multiple model diffing. The comparisons are executed concurrently for better performance when dealing with a large project or many models.
+
+> Note: All models being compared must have their `grain` defined, as this is used to perform the join between the tables in the two environments.
+
 ## Diffing tables or views
 
 Compare specific tables or views with the SQLMesh CLI interface by using the command `sqlmesh table_diff [source table]:[target table]`.
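
The "Multiple selectors" text added above says that several `-m` values are combined with OR logic. A minimal sketch of that behavior for plain wildcard patterns follows. It is not SQLMesh's selector: `select_models` and the model names are hypothetical, it uses the standard library's `fnmatchcase` (which also gives `?` and `[...]` special meaning), and it ignores tags, `+` dependency markers, `git:` selectors and the `&`, `|`, `^` operators.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def select_models(model_names: Iterable[str], patterns: Iterable[str]) -> List[str]:
    """Return the names that match ANY pattern (OR semantics), keeping input order."""
    patterns = list(patterns)
    return [
        name
        for name in model_names
        if any(fnmatchcase(name, pattern) for pattern in patterns)
    ]


# Hypothetical model names, used only for illustration.
models = [
    "metrics.orders_daily",
    "metrics.orders_weekly",
    "finance.ledger",
    "sqlmesh_example.full_model",
]

# Roughly mirrors: -m "finance.*" -m "metrics.*_daily"
print(select_models(models, ["finance.*", "metrics.*_daily"]))
```

A model is kept as soon as one pattern matches, so adding more selectors can only widen the selection, never narrow it.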

docs/reference/cli.md

Lines changed: 2 additions & 1 deletion
@@ -529,7 +529,7 @@ Options:
 ```
 Usage: sqlmesh table_diff [OPTIONS] SOURCE:TARGET [MODEL]
 
-  Show the diff between two tables.
+  Show the diff between two tables or multiple models across two environments.
 
 Options:
   -o, --on TEXT            The column to join on. Can be specified multiple
@@ -548,6 +548,7 @@ Options:
   --temp-schema TEXT       Schema used for temporary tables. It can be
                            `CATALOG.SCHEMA` or `SCHEMA`. Default:
                            `sqlmesh_temp`
+  -m, --select-model TEXT  Select specific models to table diff.
   --help                   Show this message and exit.
 ```

docs/reference/notebook.md

Lines changed: 3 additions & 1 deletion
@@ -293,7 +293,7 @@ Create a schema file containing external model schemas.
 %table_diff [--on [ON ...]] [--skip-columns [SKIP_COLUMNS ...]]
                   [--model MODEL] [--where WHERE] [--limit LIMIT]
                   [--show-sample] [--decimals DECIMALS] [--skip-grain-check]
-                  [--temp-schema SCHEMA]
+                  [--temp-schema SCHEMA] [--select-model [SELECT_MODEL ...]]
                   SOURCE:TARGET
 
 Show the diff between two tables.
@@ -320,6 +320,8 @@ options:
   --skip-grain-check    Disable the check for a primary key (grain) that is
                         missing or is not unique.
   --temp-schema SCHEMA  The schema to use for temporary tables.
+  --select-model <[SELECT_MODEL ...]>
+                        Select specific models to diff using a pattern.
 ```
 
 #### model

sqlmesh/cli/main.py

Lines changed: 1 addition & 1 deletion
@@ -897,7 +897,7 @@ def create_external_models(obj: Context, **kwargs: t.Any) -> None:
     "-m",
     type=str,
     multiple=True,
-    help="Select specific model that should be diffed.",
+    help="Specify one or more models to data diff. Use wildcards to diff multiple models. Ex: '*' (all models with applied plan diffs), 'demo.model+' (this and downstream models), 'git:feature_branch' (models with direct modifications in this branch only)",
 )
 @click.pass_obj
 @error_handler

sqlmesh/core/console.py

Lines changed: 11 additions & 4 deletions
@@ -244,7 +244,7 @@ def start_table_diff_model_progress(self, model: str) -> None:
         """Start table diff model progress"""
 
     @abc.abstractmethod
-    def stop_table_diff_progress(self) -> None:
+    def stop_table_diff_progress(self, success: bool) -> None:
         """Stop table diff progress bar"""
 
     @abc.abstractmethod
@@ -714,7 +714,7 @@ def start_table_diff_progress(self, models_to_diff: int) -> None:
     def start_table_diff_model_progress(self, model: str) -> None:
         pass
 
-    def stop_table_diff_progress(self) -> None:
+    def stop_table_diff_progress(self, success: bool) -> None:
         pass
 
     def show_table_diff_details(
@@ -2056,12 +2056,17 @@ def update_table_diff_progress(self, model: str) -> None:
             model_task_id = self.table_diff_model_tasks[model]
             self.table_diff_model_progress.remove_task(model_task_id)
 
-    def stop_table_diff_progress(self) -> None:
+    def stop_table_diff_progress(self, success: bool) -> None:
         if self.table_diff_progress_live:
             self.table_diff_progress_live.stop()
             self.table_diff_progress_live = None
             self.log_status_update("")
-            self.log_success(f"{GREEN_CHECK_MARK} Table diff completed")
+
+            if success:
+                self.log_success(f"Table diff completed successfully!")
+            else:
+                self.log_error("Table diff failed!")
+
         self.table_diff_progress = None
         self.table_diff_model_progress = None
         self.table_diff_model_tasks = {}
@@ -2292,6 +2297,8 @@ def show_table_diff(
         fully_matched = []
         for table_diff in table_diffs:
             if (
+                table_diff.schema_diff().source_schema == table_diff.schema_diff().target_schema
+            ) and (
                 table_diff.row_diff(
                     temp_schema=temp_schema, skip_grain_check=skip_grain_check
                 ).full_match_pct

sqlmesh/core/context.py

Lines changed: 42 additions & 25 deletions
@@ -1612,7 +1612,16 @@ def table_diff(
             if not target_env:
                 raise SQLMeshError(f"Could not find environment '{target}'")
 
-            selected_models = self._new_selector().expand_model_selections(select_models)
+            try:
+                selected_models = self._new_selector().expand_model_selections(select_models)
+                if not selected_models:
+                    criteria = ", ".join(f"'{c}'" for c in select_models)
+                    self.console.log_status_update(
+                        f"No models matched the selection criteria: {criteria}"
+                    )
+            except Exception as e:
+                raise SQLMeshError(e)
+
             models_to_diff: t.List[
                 t.Tuple[Model, EngineAdapter, str, str, t.Optional[t.List[str] | exp.Condition]]
             ] = []
@@ -1638,7 +1647,10 @@ def table_diff(
                 elif target_snapshot is None and source_snapshot:
                     models_in_target.append(model_fqn)
                 elif target_snapshot and source_snapshot:
-                    if source_snapshot.fingerprint != target_snapshot.fingerprint:
+                    if (
+                        source_snapshot.fingerprint.data_hash
+                        != target_snapshot.fingerprint.data_hash
+                    ):
                         # Compare the virtual layer instead of the physical layer because the virtual layer is guaranteed to point
                         # to the correct/active snapshot for the model in the specified environment, taking into account things like dev previews
                         source = source_snapshot.qualified_view_name.for_environment(
@@ -1668,32 +1680,37 @@ def table_diff(
                     )
                     raise SQLMeshError(
                         f"SQLMesh doesn't know how to join the tables for the following models:\n{model_names}\n"
-                        "\nPlease specify the `grains` in each model definition."
+                        "\nPlease specify the `grain` in each model definition. Must be unique and not null."
                     )
 
                 self.console.start_table_diff_progress(len(models_to_diff))
-                tasks_num = min(len(models_to_diff), self.concurrent_tasks)
-                table_diffs = concurrent_apply_to_values(
-                    list(models_to_diff),
-                    lambda model_info: self._model_diff(
-                        model=model_info[0],
-                        adapter=model_info[1],
-                        source=model_info[2],
-                        target=model_info[3],
-                        on=model_info[4],
-                        source_alias=source_env.name,
-                        target_alias=target_env.name,
-                        limit=limit,
-                        decimals=decimals,
-                        skip_columns=skip_columns,
-                        where=where,
-                        show=show,
-                        temp_schema=temp_schema,
-                        skip_grain_check=skip_grain_check,
-                    ),
-                    tasks_num=tasks_num,
-                )
-                self.console.stop_table_diff_progress()
+                try:
+                    tasks_num = min(len(models_to_diff), self.concurrent_tasks)
+                    table_diffs = concurrent_apply_to_values(
+                        list(models_to_diff),
+                        lambda model_info: self._model_diff(
+                            model=model_info[0],
+                            adapter=model_info[1],
+                            source=model_info[2],
+                            target=model_info[3],
+                            on=model_info[4],
+                            source_alias=source_env.name,
+                            target_alias=target_env.name,
+                            limit=limit,
+                            decimals=decimals,
+                            skip_columns=skip_columns,
+                            where=where,
+                            show=show,
+                            temp_schema=temp_schema,
+                            skip_grain_check=skip_grain_check,
+                        ),
+                        tasks_num=tasks_num,
+                    )
+                    self.console.stop_table_diff_progress(success=True)
+                except:
+                    self.console.stop_table_diff_progress(success=False)
+                    raise
+
         else:
             table_diffs = [
                 self._table_diff(
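
The second hunk above switches the comparison from the whole fingerprint to `fingerprint.data_hash`. One plausible reading is that models differing only in metadata are no longer queued for a data diff. The sketch below illustrates that bucketing under this assumption. `SnapshotStub`, `classify_models` and the bucket names are hypothetical and are not taken from SQLMesh.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Mapping


@dataclass(frozen=True)
class SnapshotStub:
    """Hypothetical stand-in for a snapshot; only the two hashes matter here."""

    data_hash: str
    metadata_hash: str


def classify_models(
    selected: Iterable[str],
    source_env: Mapping[str, SnapshotStub],
    target_env: Mapping[str, SnapshotStub],
) -> Dict[str, List[str]]:
    """Bucket selected models by where they exist and whether their data differs."""
    result: Dict[str, List[str]] = {
        "source_only": [],
        "target_only": [],
        "unchanged": [],
        "to_diff": [],
    }
    for name in selected:
        src = source_env.get(name)
        tgt = target_env.get(name)
        if src is not None and tgt is None:
            result["source_only"].append(name)
        elif tgt is not None and src is None:
            result["target_only"].append(name)
        elif src is not None and tgt is not None:
            # Only a data-level change triggers a diff; metadata-only changes do not.
            key = "to_diff" if src.data_hash != tgt.data_hash else "unchanged"
            result[key].append(name)
        # Models present in neither environment are skipped.
    return result


prod = {"a": SnapshotStub("d1", "m1"), "b": SnapshotStub("d2", "m1"), "c": SnapshotStub("d3", "m1")}
dev = {"a": SnapshotStub("d1", "m2"), "b": SnapshotStub("d9", "m1"), "d": SnapshotStub("d4", "m1")}
print(classify_models(["a", "b", "c", "d"], prod, dev))
```

In this example model `a` differs only in `metadata_hash`, so it lands in `unchanged`, while `b` has a different `data_hash` and is the only model queued for diffing.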

sqlmesh/magics.py

Lines changed: 1 addition & 1 deletion
@@ -690,7 +690,7 @@ def create_external_models(self, context: Context, line: str) -> None:
         "--select-model",
         type=str,
         nargs="*",
-        help="Select specific model changes that should be included in the plan.",
+        help="Specify one or more models to data diff. Use wildcards to diff multiple models. Ex: '*' (all models with applied plan diffs), 'demo.model+' (this and downstream models), 'git:feature_branch' (models with direct modifications in this branch only)",
     )
     @argument(
         "--skip-grain-check",
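
The notebook magic declares `--select-model` with `nargs="*"`, whereas the CLI option in `sqlmesh/cli/main.py` uses `multiple=True`. The two styles accept selectors differently: one flag followed by several values versus a repeated flag. The sketch below contrasts them using plain `argparse`. It is an approximation only: it assumes the magic's `@argument` decorator forwards to `argparse`, and it uses `action="append"` as a stand-in for click's `multiple=True`.

```python
import argparse

# Notebook style: one flag followed by zero or more values.
notebook_parser = argparse.ArgumentParser(prog="table_diff_notebook_demo")
notebook_parser.add_argument("--select-model", type=str, nargs="*")

# CLI style (approximated): the flag is repeated once per selector.
cli_parser = argparse.ArgumentParser(prog="table_diff_cli_demo")
cli_parser.add_argument("-m", "--select-model", type=str, action="append")

notebook_args = notebook_parser.parse_args(
    ["--select-model", "tag:finance", "metrics.*_daily"]
)
cli_args = cli_parser.parse_args(["-m", "tag:finance", "-m", "metrics.*_daily"])

# Both styles end up with the same list of selectors.
print(notebook_args.select_model)
print(cli_args.select_model)
```

Either way the downstream code receives a list of selector strings, so the selection logic can be shared between the two entry points.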

sqlmesh/utils/git.py

Lines changed: 17 additions & 1 deletion
@@ -27,7 +27,23 @@ def _execute_list_output(self, commands: t.List[str], base_path: Path) -> t.List
         return [(base_path / o).absolute() for o in self._execute(commands).split("\n") if o]
 
     def _execute(self, commands: t.List[str]) -> str:
-        result = subprocess.run(["git"] + commands, cwd=self._work_dir, stdout=subprocess.PIPE)
+        result = subprocess.run(
+            ["git"] + commands,
+            cwd=self._work_dir,
+            stdout=subprocess.PIPE,
+            stderr=subprocess.PIPE,
+            check=False,
+        )
+
+        # If the Git command failed, extract and raise the error message in the console
+        if result.returncode != 0:
+            stderr_output = result.stderr.decode("utf-8").strip()
+            error_message = next(
+                (line for line in stderr_output.splitlines() if line.lower().startswith("fatal:")),
+                stderr_output,
+            )
+            raise RuntimeError(f"Git error: {error_message}")
+
         return result.stdout.decode("utf-8").strip()
 
     @cached_property
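
The error handling added to `_execute` picks the first stderr line starting with `fatal:` and falls back to the full stderr text when there is none. The same selection logic can be isolated as a pure function and exercised without running Git. `extract_git_error` is a hypothetical helper name used only for this sketch, and the sample messages are made up.

```python
def extract_git_error(stderr_output: str) -> str:
    """Return the first 'fatal:' line of the stderr text, or the whole text if none."""
    stderr_output = stderr_output.strip()
    return next(
        (line for line in stderr_output.splitlines() if line.lower().startswith("fatal:")),
        stderr_output,
    )


# Made-up stderr samples, for illustration only.
with_fatal = "warning: something minor\nfatal: bad revision 'feature'\nhint: check the branch name\n"
without_fatal = "error: something else went wrong\n"

print(extract_git_error(with_fatal))
print(extract_git_error(without_fatal))
```

Using `next` with a default keeps the fallback in a single expression and avoids a `StopIteration` when no line matches.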
