Commit 92dfc8d

committed
add external model
1 parent 01d3b88 commit 92dfc8d

1 file changed

Lines changed: 124 additions & 3 deletions

docs/examples/sqlmesh_cli_crash_course.md

@@ -325,12 +325,12 @@ Run data diff against prod. This is a good way to verify the changes are behavin
325325

326326
## **Enhanced Testing Workflow**
327327

328-
You'll use these commands ad hoc to validate your changes are behaving as expected. Audits (data tests) are a great first step, and you'll want to evolve into to feel confident about the changes. The workflow is as follows:
328+
You'll use these commands to validate your changes are behaving as expected. Audits (data tests) are a great first step, and you'll want to grow from there to feel confident about your changes. The workflow is as follows:
329329

330-
1. Create external models outside of SQLMesh's control (ex: data loaded in by Fivetran, Airbyte, etc.)
330+
1. Create and audit external models outside of SQLMesh's control (ex: data loaded in by Fivetran, Airbyte, etc.)
331331
2. Automatically generate unit tests
332332
3. Ad hoc query the data directly in the CLI
333-
333+
4. Add linting
334334

335335
=== "SQLMesh"
336336

@@ -344,6 +344,127 @@ You'll use these commands ad hoc to validate your changes are behaving as expect
344344
tcloud sqlmesh create_external_models
345345
```
346346

347+
??? "Example Output"
348+
Note: this is an example from a separate Tobiko Cloud project, so you can't follow along in the GitHub repo above.
349+
350+
- Generated external models from the `bigquery-public-data` dataset.
351+
- Added an audit to the external model to ensure `event_date` is not null.
352+
- Viewed a plan preview of the changes that will be made to the external model.
353+
354+
```sql
355+
-- models/external_model_example.sql
356+
MODEL (
357+
name tcloud_demo.external_model
358+
);
359+
360+
SELECT
361+
event_date,
362+
event_timestamp,
363+
event_name,
364+
event_params,
365+
event_previous_timestamp,
366+
event_value_in_usd,
367+
event_bundle_sequence_id,
368+
event_server_timestamp_offset,
369+
user_id,
370+
user_pseudo_id,
371+
privacy_info,
372+
user_properties,
373+
user_first_touch_timestamp,
374+
user_ltv,
375+
device,
376+
geo,
377+
app_info,
378+
traffic_source,
379+
stream_id,
380+
platform,
381+
event_dimensions,
382+
ecommerce
383+
/* items */
384+
FROM bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_20210131 -- I fully qualified the external table name and sqlmesh will automatically create the external model
385+
```
386+
387+
```yaml
388+
# external_models.yaml
389+
- name: '`bigquery-public-data`.`ga4_obfuscated_sample_ecommerce`.`events_20210131`'
390+
audits: # I added this audit manually to the external model
391+
- name: not_null
392+
columns: "[event_date]"
393+
columns:
394+
event_date: STRING
395+
event_timestamp: INT64
396+
event_name: STRING
397+
event_params: ARRAY<STRUCT<key STRING, value STRUCT<string_value STRING, int_value
398+
INT64, float_value FLOAT64, double_value FLOAT64>>>
399+
event_previous_timestamp: INT64
400+
event_value_in_usd: FLOAT64
401+
event_bundle_sequence_id: INT64
402+
event_server_timestamp_offset: INT64
403+
user_id: STRING
404+
user_pseudo_id: STRING
405+
privacy_info: STRUCT<analytics_storage INT64, ads_storage INT64, uses_transient_token
406+
STRING>
407+
user_properties: ARRAY<STRUCT<key INT64, value STRUCT<string_value INT64, int_value
408+
INT64, float_value INT64, double_value INT64, set_timestamp_micros INT64>>>
409+
user_first_touch_timestamp: INT64
410+
user_ltv: STRUCT<revenue FLOAT64, currency STRING>
411+
device: STRUCT<category STRING, mobile_brand_name STRING, mobile_model_name STRING,
412+
mobile_marketing_name STRING, mobile_os_hardware_model INT64, operating_system
413+
STRING, operating_system_version STRING, vendor_id INT64, advertising_id INT64,
414+
language STRING, is_limited_ad_tracking STRING, time_zone_offset_seconds INT64,
415+
web_info STRUCT<browser STRING, browser_version STRING>>
416+
geo: STRUCT<continent STRING, sub_continent STRING, country STRING, region STRING,
417+
city STRING, metro STRING>
418+
app_info: STRUCT<id STRING, version STRING, install_store STRING, firebase_app_id
419+
STRING, install_source STRING>
420+
traffic_source: STRUCT<medium STRING, name STRING, source STRING>
421+
stream_id: INT64
422+
platform: STRING
423+
event_dimensions: STRUCT<hostname STRING>
424+
ecommerce: STRUCT<total_item_quantity INT64, purchase_revenue_in_usd FLOAT64,
425+
purchase_revenue FLOAT64, refund_value_in_usd FLOAT64, refund_value FLOAT64,
426+
shipping_value_in_usd FLOAT64, shipping_value FLOAT64, tax_value_in_usd FLOAT64,
427+
tax_value FLOAT64, unique_items INT64, transaction_id STRING>
428+
items: ARRAY<STRUCT<item_id STRING, item_name STRING, item_brand STRING, item_variant
429+
STRING, item_category STRING, item_category2 STRING, item_category3 STRING,
430+
item_category4 STRING, item_category5 STRING, price_in_usd FLOAT64, price FLOAT64,
431+
quantity INT64, item_revenue_in_usd FLOAT64, item_revenue FLOAT64, item_refund_in_usd
432+
FLOAT64, item_refund FLOAT64, coupon STRING, affiliation STRING, location_id
433+
STRING, item_list_id STRING, item_list_name STRING, item_list_index STRING,
434+
promotion_id STRING, promotion_name STRING, creative_name STRING, creative_slot
435+
STRING>>
436+
gateway: public-demo
437+
```
438+
439+
```bash
440+
Differences from the `dev_sung` environment:
441+
442+
Models:
443+
└── Metadata Updated:
444+
└── "bigquery-public-data".ga4_obfuscated_sample_ecommerce__dev_sung.events_20210131
445+
446+
---
447+
448+
+++
449+
450+
@@ -29,5 +29,6 @@
451+
452+
ecommerce STRUCT<total_item_quantity INT64, purchase_revenue_in_usd FLOAT64, purchase_revenue FLOAT64, refund_value_in_usd FLOAT64, refund_value FLOAT64, shipping_value_in_usd FLOAT64, shipping_value FLOAT64,
453+
tax_value_in_usd FLOAT64, tax_value FLOAT64, unique_items INT64, transaction_id STRING>,
454+
items ARRAY<STRUCT<item_id STRING, item_name STRING, item_brand STRING, item_variant STRING, item_category STRING, item_category2 STRING, item_category3 STRING, item_category4 STRING, item_category5 STRING,
455+
price_in_usd FLOAT64, price FLOAT64, quantity INT64, item_revenue_in_usd FLOAT64, item_revenue FLOAT64, item_refund_in_usd FLOAT64, item_refund FLOAT64, coupon STRING, affiliation STRING, location_id STRING,
456+
item_list_id STRING, item_list_name STRING, item_list_index STRING, promotion_id STRING, promotion_name STRING, creative_name STRING, creative_slot STRING>>
457+
),
458+
+ audits (not_null('columns' = [event_date])),
459+
gateway `public-demo`
460+
)
461+
462+
Metadata Updated: "bigquery-public-data".ga4_obfuscated_sample_ecommerce__dev_sung.events_20210131
463+
Models needing backfill:
464+
└── "bigquery-public-data".ga4_obfuscated_sample_ecommerce__dev_sung.events_20210131: [full refresh]
465+
Apply - Backfill Tables [y/n]:
466+
```
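
The `not_null` audit added to the external model above can be pictured as a query that looks for rows where the column is null and fails if it finds any. The snippet below is a minimal sketch of that idea only, not SQLMesh's implementation: it uses an in-memory SQLite table instead of BigQuery, and the `events` table, its rows, and the `not_null_violations` helper are all hypothetical.

```python
import sqlite3


def not_null_violations(conn, table, columns):
    """Count rows where any of the given columns is NULL (hypothetical helper)."""
    predicate = " OR ".join(f'"{col}" IS NULL' for col in columns)
    query = f'SELECT COUNT(*) FROM "{table}" WHERE {predicate}'
    return conn.execute(query).fetchone()[0]


# Stand-in for the external table; the rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (event_date TEXT, event_name TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [("20210131", "page_view"), (None, "purchase"), ("20210131", None)],
)

# A non-zero count means the audit would flag the table.
violations = not_null_violations(conn, "events", ["event_date"])
print(f"not_null audit on event_date: {violations} violating row(s)")
```

Any row with a null `event_date` counts as a violation here, which is the condition the audit in `external_models.yaml` is meant to guard against.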
467+
347468
## **Debugging Workflow**
348469

349470
You'll use these commands ad hoc to validate your changes are behaving as expected. Audits (data tests) are a great first step, and you'll want to grow from there to feel confident about the changes. The workflow is as follows:
