At Tobiko, we treat security as a first-class citizen because we know how valuable your data assets are. Our team follows and executes security best practices across each layer of our product.
5
+
6
+
## Tobiko Cloud Standard Deployment
7
+
8
+
Our standard Tobiko Cloud deployment consists of several components that are each responsible for different parts of the product.
9
+
10
+
Below is a diagram of the components along with their descriptions.
- **Scheduler**: Orchestrates schedule cadence and hosts state metadata (code versions, logs, cost).
15
+
- **Executor**: Applies code changes and runs SQL queries (actual data processing in SQL Engine) and Python models in proper DAG order.
16
+
- **Gateway**: Stores authentication credentials for SQL Engine. Secured through encryption.
17
+
- **SQL Engine**: Processes and stores data based on the above instructions within the **customer’s** environment.
18
+
19
+
## Tobiko Cloud Hybrid Deployment
20
+
21
+
For some customers, our hybrid deployment option is a great fit. It provides a seamless experience with Tobiko Cloud but within your own VPC and infrastructure.
22
+
23
+
In a hybrid deployment, Tobiko Cloud does not execute tasks directly with the engine. Instead, it passes tasks to the executors hosted in your environment, which then execute the tasks with the engine.
24
+
25
+
Executors are Docker containers that connect to both Tobiko Cloud and your SQL engine. They pull work tasks from the Tobiko Cloud scheduler and execute them with your SQL engine. This is a pull-only mechanism authenticated through an OAuth Client ID/Secret. To let the executors reach Tobiko Cloud, allow outbound traffic from your network to these Tobiko Cloud IP addresses: 34.28.17.91, 34.136.27.153, 34.136.131.20.
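The pull-only flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Scheduler` and `Executor` classes, their method names, and the in-memory queue are hypothetical stand-ins, not Tobiko Cloud's API, and the credential check is a placeholder for the real OAuth exchange.

```python
from collections import deque
from typing import List, Optional


class Scheduler:
    """Hypothetical stand-in for the scheduler: it only answers requests."""

    def __init__(self, client_id: str, client_secret: str) -> None:
        self._credentials = (client_id, client_secret)
        self._queue = deque()

    def enqueue(self, task: str) -> None:
        self._queue.append(task)

    def next_task(self, client_id: str, client_secret: str) -> Optional[str]:
        # Placeholder for OAuth: reject callers with the wrong ID/secret.
        if (client_id, client_secret) != self._credentials:
            raise PermissionError("invalid client credentials")
        return self._queue.popleft() if self._queue else None


class Executor:
    """Hypothetical executor: it initiates every exchange with the scheduler."""

    def __init__(self, scheduler: Scheduler, client_id: str, client_secret: str) -> None:
        self._scheduler = scheduler
        self._client_id = client_id
        self._client_secret = client_secret
        self.completed: List[str] = []

    def poll_once(self) -> bool:
        """Ask for one task; return False when no work is waiting."""
        task = self._scheduler.next_task(self._client_id, self._client_secret)
        if task is None:
            return False
        # A real executor would run the task against the SQL engine here.
        self.completed.append(task)
        return True


def drain(executor: Executor) -> List[str]:
    """Poll until the scheduler has no more work, then return what was run."""
    while executor.poll_once():
        pass
    return executor.completed
```

Note that `Scheduler` holds no reference to any executor, so it has no way to push work. In this sketch, that is what makes the mechanism pull-only.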
26
+
27
+
Below is a diagram of the components along with their descriptions.
- **Scheduler**: Orchestrates schedule cadence and hosts state metadata (code versions, logs, cost). **Never pushes** instructions to the executor.
32
+
- **Executor**: Applies code changes and runs SQL queries and Python models in proper DAG order (actual data processing in SQL Engine).
33
+
- **Gateway**: Stores authentication credentials for SQL Engine. Secured through your secrets manager or Kubernetes Secrets.
34
+
- **SQL Engine**: Processes and stores data based on the above instructions.
35
+
- **Executor -> Scheduler**: A pull-only mechanism for obtaining work tasks.
36
+
- **Helm Chart**: For production environments, we provide a [Helm chart](../scheduler/hybrid_executors_helm.md) that includes robust configurability, secret management, and scaling options.
37
+
- **Docker Compose**: For simpler environments or testing, we offer a [Docker Compose setup](../scheduler/hybrid_executors_docker_compose.md) to quickly deploy executors on any machine with Docker.
38
+
39
+
40
+
41
+
## Internal Code Practices
42
+
43
+
We enforce coding standards throughout Tobiko to write, maintain, and collaborate on code effectively. These practices ensure consistency, maintainability, reliability, and most importantly, trust.
44
+
45
+
A few key components of our internal code requirements:
46
+
47
+
- We use signed Git commits, required approvers, and signed Docker artifacts.
48
+
- Each commit to a `main` branch must be approved by someone other than the author.
49
+
- We sign commits and register the key with GitHub ([GitHub Docs](https://docs.github.com/en/authentication/managing-commit-signature-verification/signing-commits)).
50
+
- Binaries are signed using Cosign and OIDC for keyless signing ([Signing docs](https://docs.sigstore.dev/cosign/signing/overview/)).
51
+
- Attestations are created to certify an image, enforced with GCP Binary Authorization ([Attestation docs](https://cloud.google.com/binary-authorization/docs/key-concepts#attestations)).
52
+
- Encryption is a key feature of our security posture and is enforced at each stage of access. For example, the state database automatically encrypts all data. Credentials are also securely encrypted and stored.
53
+
- We back up each state database nightly and before upgrades. These backups are stored for 14 days.
54
+
55
+
## Penetration Testing
56
+
57
+
At least once a year, Tobiko engages a third-party security firm to perform a penetration test. This test evaluates our systems by identifying and attempting to exploit known vulnerabilities, focusing on critical external and/or internal assets. A detailed report is available upon request.
58
+
59
+
60
+
## Asset and Access Management
61
+
62
+
### How do we protect PGP keys?
63
+
64
+
If an employee loses their laptop, we don't need to get the old PGP key back because we can invalidate the key directly.
65
+
66
+
We use GitHub to sign code commits. At the time the code was committed, the PGP key was valid. When an employee loses their laptop, we invalidate the key, and the employee generates a new key to use in future commits. The old commits are still valid because the PGP key was valid at the time each commit was made.
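The rule above (a signature stays trusted if the key was valid when the commit was made) can be written as a small check. This is a sketch of the stated policy only, not of how GitHub verifies signatures. The `SigningKey` class, its fields, and `signature_is_trusted` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class SigningKey:
    """Hypothetical record of a signing key and when (if ever) it was invalidated."""

    key_id: str
    revoked_at: Optional[datetime] = None


def signature_is_trusted(key: SigningKey, signed_at: datetime) -> bool:
    """Trust a signature only if it was made before the key was invalidated."""
    return key.revoked_at is None or signed_at < key.revoked_at
```

A check like this relies on a trustworthy record of when each commit was made, which is an assumption of this sketch.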
67
+
68
+
### How do we invalidate a PGP key if someone stole it and could potentially use it?
69
+
70
+
We would revoke access for the GitHub user account associated with the compromised key and not restore access until the old PGP key is deprecated and a new key is issued.
71
+
72
+
### If someone steals a laptop, what's our continuity plan in protecting code?
73
+
74
+
- All employee devices are monitored for proper encryption and password policies.
75
+
- Laptop protection is enforced through file encryption via Vanta.
76
+
- Mandatory lock screen after a timeout.
77
+
- We follow a formal IT asset disposal procedure to prevent key compromise through improper hardware disposal.
78
+
- See above for PGP key protection.
79
+
- Binaries are signed using Cosign and OIDC for keyless signing.
docs/concepts/macros/macro_variables.md
+13 -3 (13 additions, 3 deletions)
@@ -130,11 +130,21 @@ SQLMesh provides additional predefined variables used to modify model behavior b
130
130
* 'auditing' - The audit is being run.
131
131
* 'testing' - The model query logic is being evaluated in the context of a unit test.
132
132
* @gateway - A string value containing the name of the current [gateway](../../guides/connections.md).
133
-
* @this_model - A string value containing the name of the physical table the model view selects from. Typically used to create [generic audits](../audits.md#generic-audits). In the case of [on_virtual_update statements](../models/sql_models.md#optional-on-virtual-update-statements) it contains the qualified view name instead.
134
-
* Can be used in model definitions when SQLGlot cannot fully parse a statement and you need to reference the model's underlying physical table directly.
135
-
* Can be passed as an argument to macros that access or interact with the underlying physical table.
133
+
* @this_model - The physical table name that the model's view selects from. Typically used to create [generic audits](../audits.md#generic-audits). When used in [on_virtual_update statements](../models/sql_models.md#optional-on-virtual-update-statements), it contains the qualified view name instead.
136
134
* @model_kind_name - A string value containing the name of the current model kind. Intended to be used in scenarios where you need to control the [physical properties in model defaults](../../reference/model_configuration.md#model-defaults).
137
135
136
+
!!! note "Embedding variables in strings"
137
+
138
+
Macro variable references sometimes use the curly brace syntax `@{variable}`, which serves a different purpose than the regular `@variable` syntax.
139
+
140
+
The curly brace syntax tells SQLMesh that the rendered string should be treated as an identifier, instead of simply replacing the macro variable value.
141
+
142
+
For example, if `variable` is defined as `@DEF(variable, foo.bar)`, then `@variable` produces `foo.bar`, while `@{variable}` produces `"foo.bar"`. This is because SQLMesh converts `foo.bar` into an identifier, using double quotes to correctly include the `.` character in the identifier name.
143
+
144
+
In practice, `@{variable}` is most commonly used to interpolate a value within an identifier, e.g., `@{variable}_suffix`, whereas `@variable` is used to do plain substitutions for string literals.
145
+
146
+
Learn more [above](#embedding-variables-in-strings).
147
+
138
148
#### Before all and after all variables
139
149
140
150
The following variables are also available in [`before_all` and `after_all` statements](../../guides/configuration.md#before_all-and-after_all-statements), as well as in macros invoked within them.
docs/concepts/macros/sqlmesh_macros.md
+65 -2 (65 additions, 2 deletions)
@@ -38,6 +38,59 @@ It uses the following five step approach to accomplish this:
38
38
39
39
5. Modify the semantic representation of the SQL query with the substituted variable values from (3) and functions from (4).
40
40
41
+
### Embedding variables in strings
42
+
43
+
SQLMesh always incorporates macro variable values into the semantic representation of a SQL query (step 5 above). To do that, it infers the role each macro variable value plays in the query.
44
+
45
+
For context, two commonly used types of string in SQL are:
46
+
47
+
- String literals, which represent text values and are surrounded by single quotes, such as `'the_string'`
48
+
- Identifiers, which reference database objects like column, table, alias, and function names
49
+
- They may be unquoted or quoted with double quotes, backticks, or brackets, depending on the SQL dialect
50
+
51
+
In a normal query, SQLMesh can easily determine which role a given string is playing. However, it is more difficult if a macro variable is embedded directly into a string - especially if the string is in the `MODEL` block (and not the query itself).
52
+
53
+
For example, consider a project that defines a [gateway variable](#gateway-variables) named `gateway_var`. The project includes a model that references `@gateway_var` as part of the schema in the model's `name`, which is a SQL *identifier*.
54
+
55
+
This is how we might try to write the model:
56
+
57
+
```sql title="Incorrectly rendered to string literal"
58
+
MODEL (
59
+
name the_@gateway_var_schema.table
60
+
);
61
+
```
62
+
63
+
From SQLMesh's perspective, the model schema is the combination of three sub-strings: `the_`, the value of `@gateway_var`, and `_schema`.
64
+
65
+
SQLMesh will concatenate those strings, but it does not have the context to know that it is building a SQL identifier and will return a string literal.
66
+
67
+
To provide the context SQLMesh needs, you must add curly braces to the macro variable reference: `@{gateway_var}` instead of `@gateway_var`:
68
+
69
+
```sql title="Correctly rendered to identifier"
70
+
MODEL (
71
+
name the_@{gateway_var}_schema.table
72
+
);
73
+
```
74
+
75
+
The curly braces let SQLMesh know that it should treat the string as a SQL identifier, which it will then quote based on the SQL dialect's quoting rules.
76
+
77
+
Although the most common use of the curly brace syntax is embedding macro variables into strings, it can also be used to differentiate string literals and identifiers in SQL queries. For example, consider a macro variable `my_variable` whose value is `col`.
78
+
79
+
If we `SELECT` this value with regular macro syntax, it will render to a string literal:
80
+
81
+
```sql
82
+
SELECT @my_variable AS the_column; -- renders to SELECT 'col' AS the_column
83
+
```
84
+
85
+
`'col'` is surrounded with single quotes, and the SQL engine will use that string as the column's data value.
86
+
87
+
If we use curly braces, SQLMesh will know that we want to use the rendered string as an identifier:
88
+
89
+
```sql
90
+
SELECT @{my_variable} AS the_column; -- renders to SELECT col AS the_column
91
+
```
92
+
93
+
`col` is not surrounded with single quotes, and the SQL engine will determine that the query is referencing a column or other object named `col`.
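The difference between the two renderings above can be mimicked with a toy Python function. This is a deliberately simplified sketch, not SQLMesh's implementation. Real SQLMesh works on the semantic representation of the query and infers each value's role, whereas this hypothetical `render` function does plain text substitution and always treats `@name` as a string literal.

```python
import re

# Matches the curly brace form @{name} first, then the regular form @name.
_MACRO = re.compile(r"@\{(\w+)\}|@(\w+)")


def render(sql: str, variables: dict) -> str:
    """Toy renderer: @{name} becomes an identifier, @name a string literal."""

    def substitute(match):
        braced, plain = match.group(1), match.group(2)
        if braced is not None:
            # Identifier: emitted without single quotes. Dialect-specific
            # identifier quoting is omitted from this sketch.
            return str(variables[braced])
        # String literal: wrapped in single quotes, embedded quotes doubled.
        escaped = str(variables[plain]).replace("'", "''")
        return f"'{escaped}'"

    return _MACRO.sub(substitute, sql)
```

With `{"my_variable": "col"}`, this toy function reproduces the two renderings shown in the SQL comments above.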
41
94
42
95
## User-defined variables
43
96
@@ -174,6 +227,8 @@ SELECT
174
227
FROM @customer.some_source
175
228
```
176
229
230
+
Note the use of both regular `@field_a` and curly brace syntax `@{field_b}` macro variable references in the model query. Learn more [above](#embedding-variables-in-strings).
231
+
177
232
Blueprint variables can be accessed using the syntax shown above, or through the `@BLUEPRINT_VAR()` macro function, which also supports specifying default values in case the variable is undefined (similar to `@VAR()`).
178
233
179
234
### Local variables
@@ -448,7 +503,13 @@ FROM table
448
503
449
504
This syntax works regardless of whether the array values are quoted or not.
450
505
451
-
NOTE: SQLMesh macros support placing macro values at the end of a column name simply using `column_@x`. However if you wish to substitute the variable anywhere else in the identifier, you need to use the more explicit substitution syntax `@{}`. This avoids ambiguity. These are valid uses: `@{x}_column` or `my_@{x}_column`.
506
+
!!! note "Embedding macros in strings"
507
+
508
+
SQLMesh macros support placing macro values at the end of a column name using `column_@x`.
509
+
510
+
However, if you wish to substitute the variable anywhere else in the identifier, you need to use the more explicit curly brace syntax `@{}` to avoid ambiguity. For example, these are valid uses: `@{x}_column` or `my_@{x}_column`.
511
+
512
+
Learn more about embedding macros in strings [above](#embedding-variables-in-strings).
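One way to see the ambiguity is to ask where a variable name ends. The sketch below assumes a greedy reading in which `@` is followed by as many word characters as possible. That is an assumption made for illustration, and SQLMesh's actual tokenizer may behave differently. `referenced_names` is a hypothetical helper.

```python
import re


def referenced_names(text: str) -> list:
    """Return the variable names a greedy reader would find in `text`."""
    matches = re.findall(r"@\{(\w+)\}|@(\w+)", text)
    # Each match is a (braced, plain) pair; exactly one of the two is non-empty.
    return [braced or plain for braced, plain in matches]
```

Under this assumption, `column_@x` and `@{x}_column` both reference `x`, while `@x_column` would be read as a variable named `x_column`. The curly braces mark the end of the name explicitly.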
452
513
453
514
### @IF
454
515
@@ -1087,7 +1148,9 @@ The `template` can contain the following placeholders that will be substituted:
1087
1148
- `@{schema_name}` - The name of the physical schema that SQLMesh is using for the model version table, e.g. `sqlmesh__landing`
1088
1149
- `@{table_name}` - The name of the physical table that SQLMesh is using for the model version, e.g. `landing__customers__2517971505`
1089
1150
1090
-
It can be used in a `MODEL` block:
1151
+
Note the use of the curly brace syntax `@{}` in the template placeholders - learn more [above](#embedding-variables-in-strings).
1152
+
1153
+
The `@resolve_template` macro can be used in a `MODEL` block:
docs/concepts/models/external_models.md
+3 -1 (3 additions, 1 deletion)
@@ -70,7 +70,9 @@ FROM
70
70
@{gateway}_db.external_table;
71
71
```
72
72
73
-
This table will be named differently depending on which `--gateway` SQLMesh is run with. For example:
73
+
This table will be named differently depending on which `--gateway` SQLMesh is run with (learn more about the curly brace `@{gateway}` syntax [here](../../concepts/macros/sqlmesh_macros.md#embedding-variables-in-strings)).
74
+
75
+
For example:
74
76
75
77
- `sqlmesh --gateway dev plan` - SQLMesh will try to query `dev_db.external_table`
76
78
- `sqlmesh --gateway prod plan` - SQLMesh will try to query `prod_db.external_table`