| 1 | +# Customizing SQLMesh |
| 2 | + |
| 3 | +SQLMesh supports the workflows used by the vast majority of data engineering teams. However, your company may have bespoke processes or tools that require special integration with SQLMesh. |
| 4 | + |
| 5 | +Fortunately, SQLMesh is an open-source Python library, so you can view its underlying code and customize it for your needs. |
| 6 | + |
| 7 | +Customization generally involves subclassing SQLMesh classes to extend or modify their functionality. |
| 8 | + |
| 9 | +!!! danger "Caution" |
| 10 | + |
| 11 | +    Customize SQLMesh with extreme caution. Errors may cause SQLMesh to produce unexpected results. |
| 12 | + |
| 13 | +## Custom loader |
| 14 | + |
| 15 | +Loading is the process of reading project files and converting their contents into SQLMesh's internal Python objects. |
| 16 | + |
| 17 | +The loading stage is a convenient place to customize SQLMesh behavior because you can access a project's objects after they've been ingested from file but before SQLMesh uses them. |
| 18 | + |
| 19 | +SQLMesh's `SqlMeshLoader` class handles the loading process. Customize it by subclassing it and overriding its methods. |
| 20 | + |
| 21 | +!!! note "Python configuration only" |
| 22 | + |
| 23 | +    Custom loaders require using the [Python configuration format](./configuration.md#python) (YAML is not supported). |
| 24 | + |
| 25 | +### Modify every model |
| 26 | + |
| 27 | +One reason to customize the loading process is to modify or validate every model. For example, you might want to add a post-statement to every model or verify that every model's `owner` field is populated. |
| 28 | + |
| 29 | +The loading process parses all model SQL statements, so new or modified SQL must be parsed by SQLGlot before being passed to a model object. |
| 30 | + |
| 31 | +This custom loader example adds a post-statement to every model: |
| 32 | + |
| 33 | +``` python linenums="1" title="config.py" |
| 34 | +from sqlmesh.core.loader import SqlMeshLoader |
| 35 | +from sqlmesh.utils import UniqueKeyDict |
| 36 | +from sqlmesh.core.dialect import parse_one |
| 37 | +from sqlmesh.core.config import Config |
| 38 | + |
| 39 | +# New `CustomLoader` class subclasses `SqlMeshLoader` |
| 40 | +class CustomLoader(SqlMeshLoader): |
| 41 | +    # Override SqlMeshLoader's `_load_models` method to access every model |
| 42 | +    def _load_models( |
| 43 | +        self, |
| 44 | +        macros: "MacroRegistry", |
| 45 | +        jinja_macros: "JinjaMacroRegistry", |
| 46 | +        gateway: str | None, |
| 47 | +        audits: UniqueKeyDict[str, "ModelAudit"], |
| 48 | +        signals: UniqueKeyDict[str, "signal"], |
| 49 | +    ) -> UniqueKeyDict[str, "Model"]: |
| 50 | +        # Call SqlMeshLoader's normal `_load_models` method to ingest models from file and parse model SQL |
| 51 | +        models = super()._load_models(macros, jinja_macros, gateway, audits, signals) |
| 52 | + |
| 53 | +        new_models = {} |
| 54 | +        # Loop through the existing model names/objects |
| 55 | +        for model_name, model in models.items(): |
| 56 | +            # Create list of existing and new post-statements |
| 57 | +            new_post_statements = [ |
| 58 | +                # Existing post-statements from model object |
| 59 | +                *model.post_statements, |
| 60 | +                # New post-statement is raw SQL, so we parse it with SQLGlot's `parse_one` function. |
| 61 | +                # Make sure to specify the SQL dialect. |
| 62 | +                parse_one("UNLOAD ...", dialect="redshift"), |
| 63 | +            ] |
| 64 | +            # Create a copy of the model with the `post_statements_` field updated |
| 65 | +            new_models[model_name] = model.copy(update={"post_statements_": new_post_statements}) |
| 66 | + |
| 67 | +        return new_models |
| 68 | + |
| 69 | +# Pass the CustomLoader class to the SQLMesh configuration object |
| 70 | +config = Config( |
| 71 | +    # < your configuration parameters here >, |
| 72 | +    loader=CustomLoader, |
| 73 | +) |
| 73 | +) |
| 74 | +``` |
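
The other use case mentioned above, verifying that every model's `owner` field is populated, could follow the same pattern. The sketch below is a minimal illustration and not part of SQLMesh: `StandInModel`, `find_models_missing_owner`, and `check_owners` are hypothetical names, and the stand-in class only mimics the `owner` attribute so that the check can run without a SQLMesh project. Raising `ValueError` is also an assumption; a project may prefer a SQLMesh-specific exception.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class StandInModel:
    """Hypothetical stand-in for a SQLMesh model; only `owner` is used here."""

    owner: Optional[str] = None


def find_models_missing_owner(models: Mapping[str, StandInModel]) -> list[str]:
    """Return the sorted names of models whose `owner` is missing or blank."""
    return sorted(
        name
        for name, model in models.items()
        # Treat None, "" and whitespace-only strings as "no owner"
        if not (model.owner or "").strip()
    )


def check_owners(models: Mapping[str, StandInModel]) -> Mapping[str, StandInModel]:
    """Raise if any model lacks an owner; otherwise return the models unchanged."""
    missing = find_models_missing_owner(models)
    if missing:
        raise ValueError(f"Models missing an owner: {', '.join(missing)}")
    return models
```

In a custom loader, a check like this would presumably run inside the `_load_models` override, on the result of `super()._load_models(...)`, before the models are returned. Failing at load time keeps an unowned model from reaching later stages.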