Commit 27c6a44

Feat: state import/export
1 parent d015490 commit 27c6a44

15 files changed

Lines changed: 2171 additions & 1 deletion

docs/concepts/state.md

Lines changed: 272 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,272 @@
1+
# State
2+
3+
SQLMesh stores information about your project in a state database that is usually separate from your main warehouse.
4+
5+
The SQLMesh state database contains:
6+
7+
- Information about every [Model Version](./models/overview.md) in your project (query, loaded intervals, dependencies)
8+
- A list of every [Virtual Data Environment](./environments.md) in the project
9+
- Which model versions are [promoted](./plans.md#plan-application) into each [Virtual Data Environment](./environments.md)
10+
- Information about any [auto restatements](./models/overview.md#auto_restatement_cron) present in your project
11+
- Other metadata about your project such as current SQLMesh / SQLGlot version
12+
13+
The state database is how SQLMesh "remembers" what it has done before, so it can compute the minimum set of operations needed to apply changes instead of rebuilding everything every time. It is also how SQLMesh tracks which historical data has already been backfilled for [incremental models](./models/model_kinds.md#incremental_by_time_range), so you don't need to add branching logic to the model query to handle this.
14+
15+
!!! info "State database performance"
16+
17+
The workload against the state database is an OLTP workload that requires transaction support in order to work correctly.
18+
19+
For the best experience, we recommend [Tobiko Cloud](../cloud/cloud_index.md) or databases designed for OLTP workloads such as [PostgreSQL](../integrations/engines/postgres.md).
20+
21+
Using your warehouse OLAP database to store state is supported for proof-of-concept projects but is not suitable for production and **will** lead to poor performance and consistency.
22+
23+
For more information on engines suitable for the SQLMesh state database, see the [configuration guide](../guides/configuration.md#state-connection).
24+
25+
## Exporting / Importing State
26+
27+
SQLMesh supports exporting the state database to a `.json` file. You can then inspect the file with any tool that can read text files, pass it around, and import it into a SQLMesh project running elsewhere.
28+
29+
### Exporting state
30+
31+
SQLMesh can export the state database to a file like so:
32+
33+
```bash
34+
$ sqlmesh state export -o state.json
35+
Exporting state to 'state.json' from the following connection:
36+
37+
Gateway: dev
38+
State Connection:
39+
├── Type: postgres
40+
├── Catalog: sushi_dev
41+
└── Dialect: postgres
42+
43+
Continue? [y/n]: y
44+
45+
Exporting versions ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100.0% • 3/3 • 0:00:00
46+
Exporting snapshots ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100.0% • 17/17 • 0:00:00
47+
Exporting environments ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100.0% • 1/1 • 0:00:00
48+
49+
State exported successfully to 'state.json'
50+
```
51+
52+
This will produce a file `state.json` in the current directory containing the SQLMesh state.
53+
54+
The state file is a simple JSON file with the following structure (the comments are for illustration only):
55+
56+
```json
57+
{
58+
/* State export metadata */
59+
"metadata": {
60+
"timestamp": "2025-03-16 23:09:00+00:00", /* UTC timestamp of when the file was produced */
61+
"file_version": 1, /* state export file format version */
62+
"importable": true /* whether or not this file can be imported with `sqlmesh state import` */
63+
},
64+
/* Library versions used to produce this state export file */
65+
"versions": {
66+
"schema_version": 76 /* sqlmesh state database schema version */,
67+
"sqlglot_version": "26.10.1" /* version of SQLGlot used to produce the state file */,
68+
"sqlmesh_version": "0.165.1" /* version of SQLMesh used to produce the state file */
69+
},
70+
/* array of objects containing every Snapshot (physical table) tracked by the SQLMesh project */
71+
"snapshots": [
72+
{ "name": "..." }
73+
],
74+
/* object for every Virtual Data Environment in the project. key = environment name, value = environment details */
75+
"environments": {
76+
"prod": {
77+
"..."
78+
}
79+
}
80+
}
81+
```
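
Because the export is plain text, it can also be inspected programmatically. The following is a minimal sketch, not part of SQLMesh: `summarize_state_file` is a hypothetical helper, and it assumes the real export is plain JSON without the illustrative comments shown above and that the top-level keys match that example.

```python
import json
from pathlib import Path


def summarize_state_file(path):
    """Return a small summary of an exported state file (hypothetical helper)."""
    # Assumption: the export is plain JSON, so the standard library can parse it.
    with Path(path).open(encoding="utf-8") as f:
        state = json.load(f)

    metadata = state.get("metadata", {})
    versions = state.get("versions", {})
    return {
        "timestamp": metadata.get("timestamp"),
        "file_version": metadata.get("file_version"),
        "importable": metadata.get("importable"),
        "sqlmesh_version": versions.get("sqlmesh_version"),
        # "snapshots" is an array; "environments" is keyed by environment name.
        "snapshot_count": len(state.get("snapshots", [])),
        "environment_names": sorted(state.get("environments", {})),
    }
```

Calling `summarize_state_file("state.json")` returns a dictionary you can print or log. Note that `json.load` reads the whole file into memory, so a streaming parser may be a better fit for very large exports.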
82+
83+
#### Specific environments
84+
85+
You can export a specific environment like so:
86+
87+
```sh
88+
$ sqlmesh state export --environment my_dev -o my_dev_state.json
89+
```
90+
91+
Note that every snapshot that is part of the environment will be exported, not just the differences from `prod`. The reason for this is so that the environment can be fully imported elsewhere without any assumptions about which snapshots are already present in state.
92+
93+
#### Local state
94+
95+
You can export local state like so:
96+
97+
```bash
98+
$ sqlmesh state export --local -o local_state.json
99+
```
100+
101+
This exports the state of the local context, including local changes that have not been applied to any virtual data environment.
102+
103+
Therefore, a local state export will only have `snapshots` populated. `environments` will be empty because virtual data environments are only present in the warehouse / remote state. In addition, the file is marked as **not importable** so it cannot be used with a subsequent `sqlmesh state import` command.
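
If state files are passed between people or pipelines, it can be useful to check the `importable` flag before attempting an import. The sketch below is illustrative only: `ensure_importable` is a hypothetical helper, not a SQLMesh API, and it assumes the file has already been parsed into a dictionary with the `metadata` layout shown earlier.

```python
def ensure_importable(state: dict) -> None:
    """Raise ValueError if a parsed state export is marked as not importable."""
    metadata = state.get("metadata", {})
    # Assumption: a missing flag is treated as "not importable" to stay on the safe side.
    if not metadata.get("importable", False):
        raise ValueError(
            "State file is marked as not importable (for example, a --local export)"
        )
```

A local export would fail this check, while a regular export of the remote state would pass.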
104+
105+
### Importing state
106+
107+
!!! warning "Back up your state database first!"
108+
109+
Please ensure you have created an independent backup of your state database in case something goes wrong during the state import.
110+
111+
SQLMesh tries to wrap the state import in a transaction, but some database engines do not support transactional DDL, which means
112+
an import error could leave the state database in an inconsistent state.
113+
114+
SQLMesh can import a state file into the state database like so:
115+
116+
```bash
117+
$ sqlmesh state import -i state.json --replace
118+
Loading state from 'state.json' into the following connection:
119+
120+
Gateway: dev
121+
State Connection:
122+
├── Type: postgres
123+
├── Catalog: sushi_dev
124+
└── Dialect: postgres
125+
126+
[WARNING] This destructive operation will delete all existing state against the 'dev' gateway
127+
and replace it with what's in the 'state.json' file.
128+
129+
Are you sure? [y/n]: y
130+
131+
State File Information:
132+
├── Creation Timestamp: 2025-03-31 02:15:00+00:00
133+
├── File Version: 1
134+
├── SQLMesh version: 0.170.1.dev0
135+
├── SQLMesh migration version: 76
136+
└── SQLGlot version: 26.12.0
137+
138+
Importing versions ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100.0% • 3/3 • 0:00:00
139+
Importing snapshots ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100.0% • 17/17 • 0:00:00
140+
Importing environments ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100.0% • 1/1 • 0:00:00
141+
142+
State imported successfully from 'state.json'
143+
```
144+
145+
Note that the state database structure needs to be present and up to date, so run `sqlmesh migrate` before running `sqlmesh state import` if you get a version mismatch error.
146+
147+
If you have a partial state export, for example for a single environment, you can merge it in by omitting the `--replace` parameter:
148+
149+
```bash
150+
$ sqlmesh state import -i state.json
151+
...
152+
153+
[WARNING] This operation will merge the contents of the state file to the state located at the 'dev' gateway.
154+
Matching snapshots or environments will be replaced.
155+
Non-matching snapshots or environments will be ignored.
156+
157+
Are you sure? [y/n]: y
158+
159+
...
160+
State imported successfully from 'state.json'
161+
```
162+
163+
164+
### Specific gateways
165+
166+
If your project has [multiple gateways](../guides/configuration.md#gateways) with different state connections per gateway, you can target the [state_connection](../guides/configuration.md#state-connection) of a specific gateway like so:
167+
168+
```bash
169+
# state export
170+
$ sqlmesh --gateway <gateway> state export -o state.json
171+
172+
# state import
173+
$ sqlmesh --gateway <gateway> state import -i state.json
174+
```
175+
176+
## Version Compatibility
177+
178+
When importing state, the state file must have been produced with the same major and minor version of SQLMesh that is being used to import it.
179+
180+
If you attempt to import state with an incompatible version, you will get the following error:
181+
182+
```bash
183+
$ sqlmesh state import -i state.json
184+
...SNIP...
185+
186+
State import failed!
187+
Error: SQLMesh version mismatch. You are running '0.165.1' but the state file was created with '0.164.1'.
188+
Please upgrade/downgrade your SQLMesh version to match the state file before performing the import.
189+
```
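
The "same major and minor version" rule can be sketched as a simple comparison. This is an illustration of the rule as stated above, not SQLMesh's actual implementation; `major_minor` and `is_import_compatible` are hypothetical names, and the sketch assumes dotted version strings such as `0.165.1` or `0.170.1.dev0`.

```python
def major_minor(version: str) -> tuple[int, int]:
    """Extract (major, minor) from a dotted version string like '0.165.1'."""
    major, minor, *_ = version.split(".")
    return int(major), int(minor)


def is_import_compatible(running_version: str, file_version: str) -> bool:
    """True when both versions share the same major and minor components."""
    return major_minor(running_version) == major_minor(file_version)
```

Under this reading, a file created with `0.164.1` is not compatible with a running `0.165.1`, which matches the error shown above, while two versions that differ only in the patch component would be treated as compatible.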
190+
191+
### Upgrading a state file
192+
193+
You can upgrade a state file produced by an old SQLMesh version to be compatible with a newer SQLMesh version by:
194+
195+
- Loading it into a local database using the older SQLMesh version
196+
- Installing the newer SQLMesh version
197+
- Running `sqlmesh migrate` to upgrade the state within the local database
198+
- Running `sqlmesh state export` to export it back out again. The new export is now compatible with the newer version of SQLMesh.
199+
200+
Below is an example of how to upgrade a state file created with SQLMesh `0.164.1` to be compatible with SQLMesh `0.165.1`.
201+
202+
First, create and activate a virtual environment to isolate the SQLMesh versions from your main environment:
203+
204+
```bash
205+
$ python -m venv migration-env
206+
207+
$ . ./migration-env/bin/activate
208+
209+
(migration-env)$
210+
```
211+
212+
Install the SQLMesh version compatible with your state file. The correct version is printed in the error message; for example, `the state file was created with '0.164.1'` means you need to install SQLMesh `0.164.1`:
213+
214+
```bash
215+
(migration-env)$ pip install "sqlmesh==0.164.1"
216+
```
217+
218+
Add a gateway to your `config.yaml` like so:
219+
220+
```yaml
221+
gateways:
222+
migration:
223+
connection:
224+
type: duckdb
225+
database: ./state-migration.duckdb
226+
```
227+
228+
The goal here is to define just enough config for SQLMesh to use a local database to run the state export/import commands. SQLMesh still needs to inherit settings such as `model_defaults` from your project in order to migrate state correctly, which is why we have not used an isolated directory.
229+
230+
!!! warning
231+
232+
From here on, be sure to pass `--gateway migration` to all SQLMesh commands, or you risk accidentally clobbering state on your main gateway.
233+
234+
You can now import your state export using the same version of SQLMesh it was created with:
235+
236+
```bash
237+
(migration-env)$ sqlmesh --gateway migration migrate
238+
239+
(migration-env)$ sqlmesh --gateway migration state import -i state.json
240+
...
241+
State imported successfully from 'state.json'
242+
```
243+
244+
Now that the state is imported, we can upgrade SQLMesh and export the state from the new version.
245+
The new version was printed in the original error message, e.g. `You are running '0.165.1'`.
246+
247+
To upgrade SQLMesh, simply install the new version:
248+
249+
```bash
250+
(migration-env)$ pip install --upgrade "sqlmesh==0.165.1"
251+
```
252+
253+
Migrate the state to the new version:
254+
255+
```bash
256+
(migration-env)$ sqlmesh --gateway migration migrate
257+
```
258+
259+
And finally, create a new state file which is now compatible with the new SQLMesh version:
260+
261+
```bash
262+
(migration-env)$ sqlmesh --gateway migration state export -o state-migrated.json
263+
```
264+
265+
The `state-migrated.json` file is now compatible with the newer version of SQLMesh.
266+
You can then transfer it to the place you originally needed it and import it:
267+
268+
```bash
269+
$ sqlmesh state import -i state-migrated.json
270+
...
271+
State imported successfully from 'state-migrated.json'
272+
```

docs/reference/cli.md

Lines changed: 51 additions & 0 deletions
@@ -40,6 +40,7 @@ Commands:
4040
rewrite Rewrite a SQL expression with semantic...
4141
rollback Rollback SQLMesh to the previous migration.
4242
run Evaluate missing intervals for the target...
43+
state Commands for interacting with state
4344
table_diff Show the diff between two tables.
4445
table_name Prints the name of the physical table for the...
4546
test Run model unit tests.
@@ -455,6 +456,56 @@ Options:
455456
--help Show this message and exit.
456457
```
457458

459+
## state
460+
461+
```
462+
Usage: sqlmesh state [OPTIONS] COMMAND [ARGS]...
463+
464+
Commands for interacting with state
465+
466+
Options:
467+
--help Show this message and exit.
468+
469+
Commands:
470+
export Export the state database to a file
471+
import Import a state export file back into the state database
472+
```
473+
474+
### export
475+
476+
```
477+
Usage: sqlmesh state export [OPTIONS]
478+
479+
Export the state database to a file
480+
481+
Options:
482+
-o, --output-file FILE Path to write the state export to [required]
483+
--environment TEXT Name of environment to export. Specify multiple
484+
--environment arguments to export multiple
485+
environments
486+
--local Export local state only. Note that the resulting
487+
file will not be importable
488+
--no-confirm Do not prompt for confirmation before exporting
489+
existing state
490+
--help Show this message and exit.
491+
```
492+
493+
### import
494+
495+
```
496+
Usage: sqlmesh state import [OPTIONS]
497+
498+
Import a state export file back into the state database
499+
500+
Options:
501+
-i, --input-file FILE Path to the state file [required]
502+
--replace Clear the remote state before loading the file. If
503+
omitted, a merge is performed instead
504+
--no-confirm Do not prompt for confirmation before updating
505+
existing state
506+
--help Show this message and exit.
507+
```
508+
458509
## table_diff
459510

460511
```

mkdocs.yml

Lines changed: 1 addition & 0 deletions
@@ -41,6 +41,7 @@ nav:
4141
- concepts/environments.md
4242
- concepts/tests.md
4343
- concepts/audits.md
44+
- concepts/state.md
4445
- Models:
4546
- concepts/models/overview.md
4647
- concepts/models/model_kinds.md

pyproject.toml

Lines changed: 2 additions & 0 deletions
@@ -24,6 +24,7 @@ dependencies = [
2424
"sqlglot[rs]~=26.12.0",
2525
"tenacity",
2626
"time-machine",
27+
"json-stream"
2728
]
2829
classifiers = [
2930
"Intended Audience :: Developers",
@@ -203,5 +204,6 @@ module = [
203204
"pydantic_core.*",
204205
"dlt.*",
205206
"bigframes.*",
207+
"json_stream.*"
206208
]
207209
ignore_missing_imports = true
