
Commit 76b08ca

committed
Feat: state dump/load
1 parent a151747 commit 76b08ca

16 files changed

Lines changed: 1670 additions & 1 deletion


docs/concepts/state.md

Lines changed: 235 additions & 0 deletions
@@ -0,0 +1,235 @@
1+
# State
2+
3+
SQLMesh stores information about your project in a state database separate from your main warehouse.
4+
5+
The SQLMesh state database contains:
6+
7+
- Information about every [Model Version](./models/overview.md) in your project (query, loaded intervals, dependencies)
8+
- A list of every [Virtual Data Environment](./environments.md) in the project
9+
- Which model versions are [promoted](./plans.md#plan-application) into each [Virtual Data Environment](./environments.md)
10+
- Information about any [auto restatements](./models/overview.md#auto_restatement_cron) present in your project
11+
- Other metadata about your project such as current SQLMesh / SQLGlot version
12+
13+
The state database is how SQLMesh "remembers" what it has done before, so it can compute the minimum set of operations needed to apply changes instead of rebuilding everything every time. It is also how SQLMesh tracks which historical data has already been backfilled for [incremental models](./models/model_kinds.md#incremental_by_time_range), so you don't need to add branching logic to the model query to handle this.
14+
15+
!!! info "State database performance"
16+
17+
The workload against the state database is an OLTP workload that requires transaction support in order to work correctly.
18+
19+
For the best experience, we recommend [Tobiko Cloud](../cloud/cloud_index.md) or databases designed for OLTP workloads such as [PostgreSQL](../integrations/engines/postgres.md).
20+
21+
Using your warehouse OLAP database to store state is supported for proof-of-concept projects but is not suitable for production and **will** lead to poor performance and consistency problems.
22+
23+
For more information on supported state databases, see the [configuration guide](../guides/configuration.md#state-connection).
24+
25+
## Dumping / Loading State
26+
27+
SQLMesh supports dumping the state database to a `.json` file. From there, you can inspect the file with any tool that can read text files. You can also pass the file around and load it back into a SQLMesh project running elsewhere.
28+
29+
### Dumping state
30+
31+
SQLMesh can dump the state database to a file like so:
32+
33+
```bash
34+
$ sqlmesh state dump -o state.json
35+
Dumping state to 'state.json' from the following connection:
36+
37+
Gateway: dev
38+
State Connection:
39+
├── Type: duckdb
40+
├── Catalog: state
41+
└── Dialect: duckdb
42+
43+
Continue? [y/n]: y
44+
45+
Dumping versions ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100.0% • 4/4 • 0:00:00
46+
Dumping snapshots ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100.0% • 3/3 • 0:00:00
47+
Dumping environments ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100.0% • 1/1 • 0:00:00
48+
Dumping auto restatements ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0.0% • 0/0 • 0:00:00
49+
50+
State dumped successfully to 'state.json'
51+
```
52+
53+
This will produce a file `state.json` in the current directory containing the SQLMesh state.
54+
55+
The state file is a simple `json` file that looks like:
56+
57+
```json
58+
{
59+
/* UTC timestamp of when the file was produced */
60+
"timestamp": "2025-03-16 23:09:00+00:00",
61+
/* Library versions used to produce this state dump file */
62+
"versions": {
63+
"schema_version": 76 /* sqlmesh state database schema version */,
64+
"sqlglot_version": "26.10.1" /* version of SQLGlot used to produce the state file */,
65+
"sqlmesh_version": "0.165.1" /* version of SQLMesh used to produce the state file */,
66+
"state_dump_version": 1 /* state dump file format version */
67+
},
68+
/* array of objects containing every Snapshot (physical table) tracked by the SQLMesh project */
69+
"snapshots": [
70+
{ "name": "..." }
71+
],
72+
/* object for every Virtual Data Environment in the project. key = environment name, value = environment details */
73+
"environments": {
74+
"prod": {
75+
"..."
76+
}
77+
},
78+
/* array of objects describing any Auto Restatements present in the project */
79+
"auto_restatements": [
80+
{ "name": "..." }
81+
]
82+
}
83+
```
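
As a quick way to inspect a dump, the sketch below reads the file with Python's standard `json` module and summarizes it. It assumes the file on disk is plain JSON with the top-level keys shown above (the `/* ... */` annotations are taken to be illustrative only); `summarize_state_file` is a hypothetical helper, not part of SQLMesh.

```python
import json
from pathlib import Path


def summarize_state_file(path: Path) -> dict:
    """Return version metadata and record counts from a state dump.

    Hypothetical helper: assumes plain JSON with the top-level keys
    "timestamp", "versions", "snapshots", "environments" and
    "auto_restatements" as described in the example layout.
    """
    with path.open(encoding="utf-8") as f:
        state = json.load(f)  # loads the whole file into memory

    return {
        "timestamp": state["timestamp"],
        "versions": state["versions"],
        "snapshot_count": len(state["snapshots"]),
        # "environments" is an object keyed by environment name
        "environments": sorted(state["environments"]),
        "auto_restatement_count": len(state["auto_restatements"]),
    }
```

Because `json.load` reads the entire file at once, a streaming parser may be a better fit for very large dumps.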
84+
85+
### Loading state
86+
87+
!!! warning "Back up your state database first!"
88+
89+
Please ensure you have created an independent backup of your state database in case something goes wrong during the state load.
90+
91+
SQLMesh tries to wrap the state load in a transaction, but some database engines do not support transactional DDL, which means
92+
a load error has the potential to leave the state database in an inconsistent state.
93+
94+
SQLMesh can load a state file into the state database like so:
95+
96+
```bash
97+
$ sqlmesh state load -i state.json
98+
Loading state from 'state.json' into the following connection:
99+
100+
Gateway: migration
101+
State Connection:
102+
├── Type: duckdb
103+
├── Catalog: state-migration
104+
└── Dialect: duckdb
105+
106+
[WARNING] This destructive operation will delete all existing state against the 'migration' gateway
107+
and replace it with what's in the 'state.json' file.
108+
Are you sure? [y/n]: y
109+
110+
State File Information:
111+
├── Creation Timestamp: 2025-03-26 03:26:00+00:00
112+
├── File Version: 1
113+
├── SQLMesh version: 0.165.1
114+
├── SQLMesh migration version: 76
115+
└── SQLGlot version: 26.11.1
116+
117+
Loading versions ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100.0% • 3/3 • 0:00:00
118+
Loading snapshots ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100.0% • 17/17 • 0:00:00
119+
Loading environments ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100.0% • 1/1 • 0:00:00
120+
Loading auto restatements ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100.0% • 1/1 • 0:00:00
121+
122+
State loaded successfully from 'state.json'
123+
```
124+
125+
Note that the state database structure needs to be present and up to date, so run `sqlmesh migrate` before running `sqlmesh state load` if you get a version mismatch error.
126+
127+
### Specific gateways
128+
129+
If your project has [multiple gateways](../guides/configuration.md#gateways) with different state connections per gateway, you can target the [state_connection](../guides/configuration.md#state-connection) of a specific gateway like so:
130+
131+
```bash
132+
# state dump
133+
$ sqlmesh --gateway <gateway> state dump -o state.json
134+
135+
# state load
136+
$ sqlmesh --gateway <gateway> state load -i state.json
137+
```
138+
139+
## Version Compatibility
140+
141+
When loading state, the state file must have been produced with the same major and minor version of SQLMesh that is being used to load it.
142+
143+
If you attempt to load state with an incompatible version, you will get the following error:
144+
145+
```bash
146+
$ sqlmesh state load -i state.json
147+
...SNIP...
148+
149+
State load failed!
150+
Error: SQLMesh version mismatch. You are running '0.165.1' but the state file was created with '0.164.1'.
151+
Please upgrade/downgrade your SQLMesh version to match the state file before performing the import.
152+
```
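
The compatibility rule can be illustrated with a small sketch. `same_major_minor` is a hypothetical helper that compares only the first two dot-separated components of each version string; it mirrors the rule as stated above and is not SQLMesh's actual implementation.

```python
def same_major_minor(running: str, state_file: str) -> bool:
    """True when both version strings share major and minor components.

    Hypothetical illustration of the documented rule; the patch component
    (and anything after it) is ignored.
    """
    return running.split(".")[:2] == state_file.split(".")[:2]


# Versions taken from the error message above: the minor versions differ.
print(same_major_minor("0.165.1", "0.164.1"))  # False
# Only the patch version differs, so these count as compatible.
print(same_major_minor("0.165.1", "0.165.3"))  # True
```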
153+
154+
### Upgrading a state file
155+
156+
You can upgrade a state file produced by an old SQLMesh version to be compatible with a newer SQLMesh version by:
157+
158+
- Loading it into a local database using the older SQLMesh version
159+
- Installing the newer SQLMesh version
160+
- Running `sqlmesh migrate` to upgrade the state within the local database
161+
- Running `sqlmesh state dump` to dump it back out again. The new dump is now compatible with the newer version of SQLMesh.
162+
163+
Below is an example of how to upgrade a state file created with SQLMesh `0.164.1` to be compatible with SQLMesh `0.165.1`.
164+
165+
First, create and activate a virtual environment to isolate the SQLMesh versions from your main environment:
166+
167+
```bash
168+
$ python -m venv migration-env
169+
170+
$ . ./migration-env/bin/activate
171+
172+
(migration-env)$
173+
```
174+
175+
Install the SQLMesh version compatible with your state file. The correct version to use is printed in the error message, e.g. `the state file was created with '0.164.1'` means you need to install SQLMesh `0.164.1`:
176+
177+
```bash
178+
(migration-env)$ pip install "sqlmesh==0.164.1"
179+
```
180+
181+
Add a gateway to your `config.yaml` like so:
182+
183+
```yaml
184+
gateways:
185+
  migration:
186+
    connection:
187+
      type: duckdb
188+
      database: ./state-migration.duckdb
189+
```
190+
191+
The goal here is to define just enough config for SQLMesh to be able to use a local database to run the state dump/load commands. SQLMesh still needs to inherit things like the `model_defaults` from your project in order to migrate state correctly, which is why we have not used an isolated directory.
192+
193+
!!! warning
194+
195+
From here on, be sure to pass `--gateway migration` to all SQLMesh commands, or you run the risk of accidentally clobbering any state on your main gateway.
196+
197+
You can now load your state dump using the same version of SQLMesh it was created with:
198+
199+
```bash
200+
(migration-env)$ sqlmesh --gateway migration migrate
201+
202+
(migration-env)$ sqlmesh --gateway migration state load -i state.json
203+
...
204+
State loaded successfully from 'state.json'
205+
```
206+
207+
Now that the state is loaded, we can upgrade SQLMesh and dump the state from the new version.
208+
The new version was printed in the original error message, e.g. `You are running '0.165.1'`.
209+
210+
To upgrade SQLMesh, simply install the new version:
211+
212+
```bash
213+
(migration-env)$ pip install --upgrade "sqlmesh==0.165.1"
214+
```
215+
216+
Migrate the state to the new version:
217+
218+
```bash
219+
(migration-env)$ sqlmesh --gateway migration migrate
220+
```
221+
222+
And finally, create a new state file which is now compatible with the new SQLMesh version:
223+
224+
```bash
225+
(migration-env)$ sqlmesh --gateway migration state dump -o state-migrated.json
226+
```
227+
228+
The `state-migrated.json` file is now compatible with the newer version of SQLMesh.
229+
You can then transfer it to the place you originally needed it and load it in:
230+
231+
```bash
232+
$ sqlmesh state load -i state-migrated.json
233+
...
234+
State loaded successfully from 'state-migrated.json'
235+
```

docs/reference/cli.md

Lines changed: 44 additions & 0 deletions
@@ -40,6 +40,7 @@ Commands:
4040
rewrite Rewrite a SQL expression with semantic...
4141
rollback Rollback SQLMesh to the previous migration.
4242
run Evaluate missing intervals for the target...
43+
state Commands for interacting with state
4344
table_diff Show the diff between two tables.
4445
table_name Prints the name of the physical table for the...
4546
test Run model unit tests.
@@ -455,6 +456,49 @@ Options:
455456
--help Show this message and exit.
456457
```
457458

459+
## state
460+
461+
```
462+
Usage: sqlmesh state [OPTIONS] COMMAND [ARGS]...
463+
464+
Commands for interacting with state
465+
466+
Options:
467+
--help Show this message and exit.
468+
469+
Commands:
470+
dump Dump the state database to a file
471+
load Load a state dump back into a database
472+
```
473+
474+
### dump
475+
476+
```
477+
Usage: sqlmesh state dump [OPTIONS]
478+
479+
Dump the state database to a file
480+
481+
Options:
482+
-o, --output-file FILE Path to write the state dump to [required]
483+
--no-confirm Do not prompt for confirmation before dumping
484+
existing state
485+
--help Show this message and exit.
486+
```
487+
488+
### load
489+
490+
```
491+
Usage: sqlmesh state load [OPTIONS]
492+
493+
Load a state dump file back into the state database
494+
495+
Options:
496+
-i, --input-file FILE Path to the state dump file [required]
497+
--no-confirm Do not prompt for confirmation before overwriting
498+
existing state
499+
--help Show this message and exit.
500+
```
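
For unattended runs (for example in CI), the options above can be combined into a single command line. The sketch below only assembles the argument list from flags documented in this reference (`--gateway`, `-o`, `-i`, `--no-confirm`); `state_command` is a hypothetical helper, not part of SQLMesh.

```python
from pathlib import Path
from typing import List, Optional


def state_command(
    action: str,
    path: Path,
    gateway: Optional[str] = None,
    no_confirm: bool = False,
) -> List[str]:
    """Build the argv list for `sqlmesh state dump` or `sqlmesh state load`.

    Hypothetical helper: it does not run anything, it only assembles flags.
    """
    if action not in ("dump", "load"):
        raise ValueError(f"action must be 'dump' or 'load', got {action!r}")

    cmd = ["sqlmesh"]
    if gateway:
        # --gateway is a top-level option, so it goes before the subcommand
        cmd += ["--gateway", gateway]
    # dump writes to an output file (-o); load reads an input file (-i)
    cmd += ["state", action, "-o" if action == "dump" else "-i", str(path)]
    if no_confirm:
        cmd.append("--no-confirm")
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. Since `--no-confirm` skips the prompt that guards a destructive `load`, it is worth double-checking the gateway before using it.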
501+
458502
## table_diff
459503

460504
```

mkdocs.yml

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -41,6 +41,7 @@ nav:
4141
- concepts/environments.md
4242
- concepts/tests.md
4343
- concepts/audits.md
44+
- concepts/state.md
4445
- Models:
4546
- concepts/models/overview.md
4647
- concepts/models/model_kinds.md

pyproject.toml

Lines changed: 1 addition & 0 deletions
@@ -203,5 +203,6 @@ module = [
203203
"pydantic_core.*",
204204
"dlt.*",
205205
"bigframes.*",
206+
"json_stream.*"
206207
]
207208
ignore_missing_imports = true

sqlmesh/cli/main.py

Lines changed: 51 additions & 0 deletions
@@ -18,6 +18,7 @@
1818
from sqlmesh.core.context import Context
1919
from sqlmesh.utils.date import TimeLike
2020
from sqlmesh.utils.errors import MissingDependencyError
21+
from pathlib import Path
2122

2223
logger = logging.getLogger(__name__)
2324

@@ -1030,3 +1031,53 @@ def lint(
10301031
) -> None:
10311032
    """Run the linter for the target model(s)."""
10321033
    obj.lint_models(models)
1034+
1035+
1036+
@cli.group(no_args_is_help=True)
1037+
def state() -> None:
1038+
    """Commands for interacting with state"""
1039+
    pass
1040+
1041+
1042+
@state.command("dump")
1043+
@click.option(
1044+
    "-o",
1045+
    "--output-file",
1046+
    required=True,
1047+
    help="Path to write the state dump to",
1048+
    type=click.Path(dir_okay=False, writable=True, path_type=Path),
1049+
)
1050+
@click.option(
1051+
    "--no-confirm",
1052+
    is_flag=True,
1053+
    help="Do not prompt for confirmation before dumping existing state",
1054+
)
1055+
@click.pass_obj
1056+
@error_handler
1057+
@cli_analytics
1058+
def state_dump(obj: Context, output_file: Path, no_confirm: bool) -> None:
1059+
    """Dump the state database to a file"""
1060+
    confirm = not no_confirm
1061+
    obj.dump_state(output_file=output_file, confirm=confirm)
1062+
1063+
1064+
@state.command("load")
1065+
@click.option(
1066+
    "-i",
1067+
    "--input-file",
1068+
    help="Path to the state dump file",
1069+
    required=True,
1070+
    type=click.Path(exists=True, dir_okay=False, readable=True, path_type=Path),
1071+
)
1072+
@click.option(
1073+
    "--no-confirm",
1074+
    is_flag=True,
1075+
    help="Do not prompt for confirmation before overwriting existing state",
1076+
)
1077+
@click.pass_obj
1078+
@error_handler
1079+
@cli_analytics
1080+
def state_load(obj: Context, input_file: Path, no_confirm: bool) -> None:
1081+
    """Load a state dump file back into the state database"""
1082+
    confirm = not no_confirm
1083+
    obj.load_state(input_file=input_file, confirm=confirm)
