Commit 96ed132

Feat: state import/export (#4038)
1 parent 4bf7a05 commit 96ed132

15 files changed

Lines changed: 2242 additions & 2 deletions

docs/concepts/state.md

Lines changed: 279 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -0,0 +1,279 @@
1+
# State
2+
3+
SQLMesh stores information about your project in a state database that is usually separate from your main warehouse.
4+
5+
The SQLMesh state database contains:
6+
7+
- Information about every [Model Version](./models/overview.md) in your project (query, loaded intervals, dependencies)
8+
- A list of every [Virtual Data Environment](./environments.md) in the project
9+
- Which model versions are [promoted](./plans.md#plan-application) into each [Virtual Data Environment](./environments.md)
10+
- Information about any [auto restatements](./models/overview.md#auto_restatement_cron) present in your project
11+
- Other metadata about your project, such as the current SQLMesh and SQLGlot versions
12+
13+
The state database is how SQLMesh "remembers" what it has done before, so it can compute a minimum set of operations to apply changes instead of rebuilding everything every time. It's also how SQLMesh tracks what historical data has already been backfilled for [incremental models](./models/model_kinds.md#incremental_by_time_range), so you don't need to add branching logic to the model query to handle this.
14+
15+
!!! info "State database performance"
16+
17+
The workload against the state database is an OLTP workload that requires transaction support in order to work correctly.
18+
19+
For the best experience, we recommend [Tobiko Cloud](../cloud/cloud_index.md) or databases designed for OLTP workloads such as [PostgreSQL](../integrations/engines/postgres.md).
20+
21+
Using your warehouse OLAP database to store state is supported for proof-of-concept projects but is not suitable for production and **will** lead to poor performance and consistency issues.
22+
23+
For more information on engines suitable for the SQLMesh state database, see the [configuration guide](../guides/configuration.md#state-connection).
24+
25+
## Exporting / Importing State
26+
27+
SQLMesh supports exporting the state database to a `.json` file. From there, you can inspect the file with any tool that can read text files. You can also pass the file around and import it back into a SQLMesh project running elsewhere.
28+
29+
### Exporting state
30+
31+
SQLMesh can export the state database to a file like so:
32+
33+
```bash
34+
$ sqlmesh state export -o state.json
35+
Exporting state to 'state.json' from the following connection:
36+
37+
Gateway: dev
38+
State Connection:
39+
├── Type: postgres
40+
├── Catalog: sushi_dev
41+
└── Dialect: postgres
42+
43+
Continue? [y/n]: y
44+
45+
Exporting versions ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100.0% • 3/3 • 0:00:00
46+
Exporting snapshots ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100.0% • 17/17 • 0:00:00
47+
Exporting environments ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100.0% • 1/1 • 0:00:00
48+
49+
State exported successfully to 'state.json'
50+
```
51+
52+
This will produce a file `state.json` in the current directory containing the SQLMesh state.
53+
54+
The state file is a plain JSON file that looks like this (the `/* ... */` comments are annotations for this example):
55+
56+
```json
57+
{
58+
/* State export metadata */
59+
"metadata": {
60+
"timestamp": "2025-03-16 23:09:00+00:00", /* UTC timestamp of when the file was produced */
61+
"file_version": 1, /* state export file format version */
62+
"importable": true /* whether or not this file can be imported with `sqlmesh state import` */
63+
},
64+
/* Library versions used to produce this state export file */
65+
"versions": {
66+
"schema_version": 76 /* sqlmesh state database schema version */,
67+
"sqlglot_version": "26.10.1" /* version of SQLGlot used to produce the state file */,
68+
"sqlmesh_version": "0.165.1" /* version of SQLMesh used to produce the state file */,
69+
},
70+
/* array of objects containing every Snapshot (physical table) tracked by the SQLMesh project */
71+
"snapshots": [
72+
{ "name": "..." }
73+
],
74+
/* object for every Virtual Data Environment in the project. key = environment name, value = environment details */
75+
"environments": {
76+
"prod": {
77+
/* information about the environment itself */
78+
"environment": {
79+
"..."
80+
},
81+
/* information about any before_all / after_all statements for this environment */
82+
"statements": [
83+
"..."
84+
]
85+
}
86+
}
87+
}
88+
```
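
As a rough illustration of inspecting an export with ordinary tooling, the sketch below reads a state file and returns a short summary. It assumes the real export is plain JSON with the top-level keys shown above (`metadata`, `versions`, `snapshots`, `environments`); the function name `summarize_state_file` is made up for this example and is not part of SQLMesh.

```python
import json
from pathlib import Path


def summarize_state_file(path):
    """Return a small summary of a SQLMesh state export file.

    Hypothetical helper: assumes the file is plain JSON with the
    top-level layout shown in the example above.
    """
    state = json.loads(Path(path).read_text())
    metadata = state.get("metadata", {})
    versions = state.get("versions", {})
    return {
        "timestamp": metadata.get("timestamp"),
        "file_version": metadata.get("file_version"),
        "importable": metadata.get("importable"),
        "sqlmesh_version": versions.get("sqlmesh_version"),
        # One entry per snapshot tracked in the export
        "snapshot_count": len(state.get("snapshots", [])),
        # Environment names are the keys of the "environments" object
        "environments": sorted(state.get("environments", {})),
    }
```

Calling `summarize_state_file("state.json")` loads the whole file into memory, which is fine for small projects; a streaming parser may be a better fit for very large exports.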
89+
90+
#### Specific environments
91+
92+
You can export a specific environment like so:
93+
94+
```sh
95+
$ sqlmesh state export --environment my_dev -o my_dev_state.json
96+
```
97+
98+
Note that every snapshot that is part of the environment will be exported, not just the differences from `prod`. The reason for this is so that the environment can be fully imported elsewhere without any assumptions about which snapshots are already present in state.
99+
100+
#### Local state
101+
102+
You can export local state like so:
103+
104+
```bash
105+
$ sqlmesh state export --local -o local_state.json
106+
```
107+
108+
This exports the state of the local context, which includes local changes that have not been applied to any virtual data environments.
109+
110+
Therefore, a local state export will only have `snapshots` populated. `environments` will be empty because virtual data environments are only present in the warehouse / remote state. In addition, the file is marked as **not importable** so it cannot be used with a subsequent `sqlmesh state import` command.
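
If you script around state files, it can be useful to check the `importable` flag before attempting an import. The sketch below works on an already-parsed export and assumes the `metadata.importable` field shown in the file format above; `is_importable` is a hypothetical helper, and treating a missing flag as "not importable" is a choice made for this example.

```python
def is_importable(state):
    """Return True if a parsed state export is marked as importable.

    Hypothetical helper: a missing flag is treated as not importable.
    """
    return state.get("metadata", {}).get("importable", False) is True


# A local export: snapshots only, no environments, marked not importable
local_export = {
    "metadata": {"importable": False},
    "snapshots": [{"name": "..."}],
    "environments": {},
}

# A regular export from the state database
full_export = {
    "metadata": {"importable": True},
    "snapshots": [{"name": "..."}],
    "environments": {"prod": {}},
}

print(is_importable(local_export))
print(is_importable(full_export))
```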
111+
112+
### Importing state
113+
114+
!!! warning "Back up your state database first!"
115+
116+
Please ensure you have created an independent backup of your state database in case something goes wrong during the state import.
117+
118+
SQLMesh tries to wrap the state import in a transaction, but some database engines do not support transactions against DDL, which means
119+
an import error has the potential to leave the state database in an inconsistent state.
120+
121+
SQLMesh can import a state file into the state database like so:
122+
123+
```bash
124+
$ sqlmesh state import -i state.json --replace
125+
Loading state from 'state.json' into the following connection:
126+
127+
Gateway: dev
128+
State Connection:
129+
├── Type: postgres
130+
├── Catalog: sushi_dev
131+
└── Dialect: postgres
132+
133+
[WARNING] This destructive operation will delete all existing state against the 'dev' gateway
134+
and replace it with what\'s in the 'state.json' file.
135+
136+
Are you sure? [y/n]: y
137+
138+
State File Information:
139+
├── Creation Timestamp: 2025-03-31 02:15:00+00:00
140+
├── File Version: 1
141+
├── SQLMesh version: 0.170.1.dev0
142+
├── SQLMesh migration version: 76
143+
└── SQLGlot version: 26.12.0
144+
145+
Importing versions ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100.0% • 3/3 • 0:00:00
146+
Importing snapshots ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100.0% • 17/17 • 0:00:00
147+
Importing environments ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100.0% • 1/1 • 0:00:00
148+
149+
State imported successfully from 'state.json'
150+
```
151+
152+
Note that the state database structure needs to be present and up to date, so run `sqlmesh migrate` before running `sqlmesh state import` if you get a version mismatch error.
153+
154+
If you have a partial state export, for example one for a single environment, you can merge it in by omitting the `--replace` flag:
155+
156+
```bash
157+
$ sqlmesh state import -i state.json
158+
...
159+
160+
[WARNING] This operation will merge the contents of the state file to the state located at the 'dev' gateway.
161+
Matching snapshots or environments will be replaced.
162+
Non-matching snapshots or environments will be ignored.
163+
164+
Are you sure? [y/n]: y
165+
166+
...
167+
State imported successfully from 'state.json'
168+
```
169+
170+
171+
### Specific gateways
172+
173+
If your project has [multiple gateways](../guides/configuration.md#gateways) with different state connections per gateway, you can target the [state_connection](../guides/configuration.md#state-connection) of a specific gateway like so:
174+
175+
```bash
176+
# state export
177+
$ sqlmesh --gateway <gateway> state export -o state.json
178+
179+
# state import
180+
$ sqlmesh --gateway <gateway> state import -i state.json
181+
```
182+
183+
## Version Compatibility
184+
185+
When importing state, the state file must have been produced with the same major and minor version of SQLMesh that is being used to import it.
186+
187+
If you attempt to import state with an incompatible version, you will get the following error:
188+
189+
```bash
190+
$ sqlmesh state import -i state.json
191+
...SNIP...
192+
193+
State import failed!
194+
Error: SQLMesh version mismatch. You are running '0.165.1' but the state file was created with '0.164.1'.
195+
Please upgrade/downgrade your SQLMesh version to match the state file before performing the import.
196+
```
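
The compatibility rule above can be illustrated with a small check that compares only the major and minor components of two version strings. This is a sketch of the rule as stated in this document, not SQLMesh's actual implementation; the function names are hypothetical.

```python
def major_minor(version):
    """Return the (major, minor) pair of a version string such as '0.165.1'."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def same_major_minor(running_version, file_version):
    """Return True if both versions share the same major and minor components."""
    return major_minor(running_version) == major_minor(file_version)


# The versions from the error message above differ in their minor component
print(same_major_minor("0.165.1", "0.164.1"))
```

The file's version can be read from the `versions.sqlmesh_version` field of the export, so a check like this could run before an import is attempted.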
197+
198+
### Upgrading a state file
199+
200+
You can upgrade a state file produced by an old SQLMesh version to be compatible with a newer SQLMesh version by:
201+
202+
- Loading it into a local database using the older SQLMesh version
203+
- Installing the newer SQLMesh version
204+
- Running `sqlmesh migrate` to upgrade the state within the local database
205+
- Running `sqlmesh state export` to export it back out again. The new export is now compatible with the newer version of SQLMesh.
206+
207+
Below is an example of how to upgrade a state file created with SQLMesh `0.164.1` to be compatible with SQLMesh `0.165.1`.
208+
209+
First, create and activate a virtual environment to isolate the SQLMesh versions from your main environment:
210+
211+
```bash
212+
$ python -m venv migration-env
213+
214+
$ . ./migration-env/bin/activate
215+
216+
(migration-env)$
217+
```
218+
219+
Install the SQLMesh version compatible with your state file. The correct version to use is printed in the error message, e.g. `the state file was created with '0.164.1'` means you need to install SQLMesh `0.164.1`:
220+
221+
```bash
222+
(migration-env)$ pip install "sqlmesh==0.164.1"
223+
```
224+
225+
Add a gateway to your `config.yaml` like so:
226+
227+
```yaml
228+
gateways:
229+
migration:
230+
connection:
231+
type: duckdb
232+
database: ./state-migration.duckdb
233+
```
234+
235+
The goal here is to define just enough config for SQLMesh to use a local database to run the state export/import commands. SQLMesh still needs to inherit things like the `model_defaults` from your project in order to migrate state correctly, which is why we have not used an isolated directory.
236+
237+
!!! warning
238+
239+
From here on, be sure to pass `--gateway migration` to all SQLMesh commands, or you run the risk of accidentally clobbering any state on your main gateway.
240+
241+
You can now import your state export using the same version of SQLMesh it was created with:
242+
243+
```bash
244+
(migration-env)$ sqlmesh --gateway migration migrate
245+
246+
(migration-env)$ sqlmesh --gateway migration state import -i state.json
247+
...
248+
State imported successfully from 'state.json'
249+
```
250+
251+
Now that the state is imported, we can upgrade SQLMesh and export the state from the new version.
252+
The new version was printed in the original error message, e.g. `You are running '0.165.1'`.
253+
254+
To upgrade SQLMesh, simply install the new version:
255+
256+
```bash
257+
(migration-env)$ pip install --upgrade "sqlmesh==0.165.1"
258+
```
259+
260+
Migrate the state to the new version:
261+
262+
```bash
263+
(migration-env)$ sqlmesh --gateway migration migrate
264+
```
265+
266+
And finally, create a new state file which is now compatible with the new SQLMesh version:
267+
268+
```bash
269+
(migration-env)$ sqlmesh --gateway migration state export -o state-migrated.json
270+
```
271+
272+
The `state-migrated.json` file is now compatible with the newer version of SQLMesh.
273+
You can then transfer it to the place you originally needed it and import it in:
274+
275+
```bash
276+
$ sqlmesh state import -i state-migrated.json
277+
...
278+
State imported successfully from 'state-migrated.json'
279+
```
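
The upgrade walkthrough above can be summarised as an ordered list of commands. The sketch below only builds and prints that sequence, using the commands and flags shown in this document; it does not run anything. The function `upgrade_commands` and its parameter names are hypothetical, and the virtual environment and the `migration` gateway are assumed to be set up already.

```python
import shlex


def upgrade_commands(old_version, new_version, state_file="state.json",
                     migrated_file="state-migrated.json", gateway="migration"):
    """Return the upgrade steps as argument lists, in the order they should run."""
    sqlmesh = ["sqlmesh", "--gateway", gateway]
    return [
        # 1. Install the version that produced the state file
        ["pip", "install", f"sqlmesh=={old_version}"],
        # 2. Create the state tables in the local database, then load the file
        sqlmesh + ["migrate"],
        sqlmesh + ["state", "import", "-i", state_file],
        # 3. Upgrade SQLMesh and migrate the imported state
        ["pip", "install", "--upgrade", f"sqlmesh=={new_version}"],
        sqlmesh + ["migrate"],
        # 4. Export a file compatible with the new version
        sqlmesh + ["state", "export", "-o", migrated_file],
    ]


for command in upgrade_commands("0.164.1", "0.165.1"):
    print(shlex.join(command))
```

The import and export commands prompt for confirmation by default; the CLI reference lists a `--no-confirm` option for both if you need to run them unattended.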

docs/reference/cli.md

Lines changed: 51 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -40,6 +40,7 @@ Commands:
4040
rewrite Rewrite a SQL expression with semantic...
4141
rollback Rollback SQLMesh to the previous migration.
4242
run Evaluate missing intervals for the target...
43+
state Commands for interacting with state
4344
table_diff Show the diff between two tables.
4445
table_name Prints the name of the physical table for the...
4546
test Run model unit tests.
@@ -455,6 +456,56 @@ Options:
455456
--help Show this message and exit.
456457
```
457458

459+
## state
460+
461+
```
462+
Usage: sqlmesh state [OPTIONS] COMMAND [ARGS]...
463+
464+
Commands for interacting with state
465+
466+
Options:
467+
--help Show this message and exit.
468+
469+
Commands:
470+
export Export the state database to a file
471+
import Import a state export file back into the state database
472+
```
473+
474+
### export
475+
476+
```
477+
Usage: sqlmesh state export [OPTIONS]
478+
479+
Export the state database to a file
480+
481+
Options:
482+
-o, --output-file FILE Path to write the state export to [required]
483+
--environment TEXT Name of environment to export. Specify multiple
484+
--environment arguments to export multiple
485+
environments
486+
--local Export local state only. Note that the resulting
487+
file will not be importable
488+
--no-confirm Do not prompt for confirmation before exporting
489+
existing state
490+
--help Show this message and exit.
491+
```
492+
493+
### import
494+
495+
```
496+
Usage: sqlmesh state import [OPTIONS]
497+
498+
Import a state export file back into the state database
499+
500+
Options:
501+
-i, --input-file FILE Path to the state file [required]
502+
--replace Clear the remote state before loading the file. If
503+
omitted, a merge is performed instead
504+
--no-confirm Do not prompt for confirmation before updating
505+
existing state
506+
--help Show this message and exit.
507+
```
508+
458509
## table_diff
459510

460511
```

mkdocs.yml

Lines changed: 1 addition & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -41,6 +41,7 @@ nav:
4141
- concepts/environments.md
4242
- concepts/tests.md
4343
- concepts/audits.md
44+
- concepts/state.md
4445
- Models:
4546
- concepts/models/overview.md
4647
- concepts/models/model_kinds.md

pyproject.toml

Lines changed: 2 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -24,6 +24,7 @@ dependencies = [
2424
"sqlglot[rs]~=26.12.1",
2525
"tenacity",
2626
"time-machine",
27+
"json-stream"
2728
]
2829
classifiers = [
2930
"Intended Audience :: Developers",
@@ -203,5 +204,6 @@ module = [
203204
"pydantic_core.*",
204205
"dlt.*",
205206
"bigframes.*",
207+
"json_stream.*"
206208
]
207209
ignore_missing_imports = true
