.agents/skills/optimize-storage-costs/SKILL.md
16 additions & 10 deletions
@@ -13,12 +13,13 @@ Identify and remove BigQuery tables that contribute to storage costs but have no
Masthead Data uses lineage analysis to identify candidate tables, but lineage only covers pipeline references that Masthead can see. Modification timestamps are therefore a critical cross-check:

| Type | Definition | Indicators | Watch for |
|------|------------|------------|-----------|
| **Dead-end** | Regularly updated, no downstream consumption | Updated but never read in 30+ days | External writers outside the lineage graph (manual jobs, independent pipelines) |
| **Unused** | No upstream or downstream activity | No reads/writes in 30+ days | A recent `lastModifiedTime` despite the "Unused" flag suggests an external writer; **do not drop without verification** |

### Key Signal

If a table is flagged `Unused` **and** has a recent modification timestamp, something outside Masthead's visibility is writing to it. This always warrants investigation before dropping the table.
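One way to act on this signal is to compare a table's `lastModifiedTime` against the same 30-day window used for the `Unused` label. The sketch below is illustrative only: the helper name `is_recently_modified` is hypothetical, and it assumes the timestamp is in epoch milliseconds, which is how `bq show --format=json` reports `lastModifiedTime`.

```bash
# Sketch: decide whether a table's lastModifiedTime falls inside a recent window.
# Assumption: the timestamp is epoch milliseconds (the `lastModifiedTime` field
# in the JSON printed by `bq show --format=json`).

# is_recently_modified <epoch_ms> [window_days] [now_epoch_s]
# Exit status 0 if the modification is inside the window, 1 otherwise.
is_recently_modified() {
  modified_ms="$1"
  window_days="${2:-30}"          # default matches the 30-day Unused threshold
  now_s="${3:-$(date +%s)}"       # injectable "now" keeps the check testable
  age_s=$(( now_s - modified_ms / 1000 ))
  [ "$age_s" -lt $(( window_days * 86400 )) ]
}

# Usage (hypothetical table name; assumes bq and jq are installed):
#   ms=$(bq show --format=json my-project:my_dataset.my_table | jq -r '.lastModifiedTime')
#   if is_recently_modified "$ms"; then
#     echo "recent write detected: investigate before dropping"
#   fi
```

A zero exit status for a table flagged `Unused` means something wrote to it inside the window, so the table should be investigated rather than dropped.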
## When to Use
@@ -60,18 +61,21 @@ ORDER BY savings_usd_30d DESC" > storage_waste.csv
**Note:** Sorting by `savings_usd_30d` instead of `total_tib` prioritizes high-impact targets for review.

**Alternative: Use Masthead UI**

- Navigate to [Dictionary page](https://app.mastheadata.com/dictionary?tab=Tables&deadEnd=true)
- Filter by `Dead-end` or `Unused` labels
- Export table list for review

### Step 2: Review and Decide

Review `storage_waste.csv` and add a `status` column with values:

- `keep` - Table is needed
- `to drop` - Safe to remove
- `investigate` - Needs further analysis

**Review criteria:**

- Is this a backup or archive table? (consider alternative storage)
- Is there a downstream dependency not captured in lineage?
- Is this table part of an active experiment or migration?
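Once every row has a status, a quick summary helps confirm that no row was left blank or mistyped before anything is dropped. The sketch below is one possible check, not part of the original workflow: the function name `summarize_status` is hypothetical, and it assumes the CSV has a header row containing `status` and `savings_usd_30d` and that no field contains an embedded comma.

```bash
# Sketch: summarize a reviewed storage_waste.csv by status.
# Assumptions: header row names the `status` and `savings_usd_30d` columns;
# fields contain no embedded commas (plain awk does not parse quoted CSV).

# summarize_status <csv_file>
# Prints "<status>,<table_count>,<total savings_usd_30d>" for each status and
# exits non-zero if a row has a status other than keep / to drop / investigate.
summarize_status() {
  awk -F',' '
    { sub(/\r$/, "") }                       # tolerate CRLF line endings
    NR == 1 {
      for (i = 1; i <= NF; i++) {
        if ($i == "status") s = i
        if ($i == "savings_usd_30d") v = i
      }
      if (!s || !v) {
        print "missing status or savings_usd_30d column" > "/dev/stderr"
        exit
      }
      next
    }
    {
      if ($s != "keep" && $s != "to drop" && $s != "investigate") {
        printf "line %d: unexpected status \"%s\"\n", NR, $s > "/dev/stderr"
        bad = 1
        next
      }
      count[$s]++
      total[$s] += $v
    }
    END {
      if (bad || !s || !v) exit 1
      n = split("keep,to drop,investigate", order, ",")
      for (k = 1; k <= n; k++) {
        st = order[k]
        printf "%s,%d,%.2f\n", st, count[st], total[st]
      }
    }
  ' "$1"
}

# Usage: summarize_status storage_waste.csv
```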
@@ -94,6 +98,7 @@ bash drop_tables.sh
```
**Safe mode (dry-run first):**
```bash
# bq rm does not document a dry-run flag, so print each command instead of running it
sed 's/bq rm/echo bq rm/' drop_tables.sh > drop_tables_dryrun.sh
@@ -103,17 +108,18 @@ bash drop_tables_dryrun.sh
### Step 4: Verify Savings

After 24-48 hours, check storage reduction in Masthead: