Interactive Exercise: dbt Incremental Materialization

Time Machine Factory – dbt Incremental Materialization
Time Machine Factory

Master dbt incremental materialization by processing data through time!

🏭Your Mission: Build an Efficient Data Pipeline
You’re the lead data engineer at Time Machine Factory, processing millions of product events daily. Full table refreshes take 6 hours! Your mission: implement incremental materialization to process only new and changed records.

Learn: Incremental strategies, unique keys, surrogate keys, Jinja macros, and optimization techniques.
How to use this exercise
  1. Select a day on the timeline (use Jan 4 or Jan 5 for data).
  2. Choose a strategy (Append / Merge / Delete+Insert / Insert Overwrite).
  3. (Optional) Switch the tabs (Model / Config / Compiled) to view the code.
  4. Click Run Incremental.
  5. (Optional) Click Compare Full Refresh or Best Practices.

Already ran all days? Use Reset Exercise to start over.

📊Data Stream Timeline
Jan 1
10K
Initial Load
Jan 2
2.5K
Processed
Jan 3
3.1K
Processed
Jan 4
2.8K
Ready
Jan 5
4.2K
New Data!
Append
Add new records only
🔄
Merge
Upsert with unique key
♻️
Delete+Insert
Replace date partitions
📝
Insert Overwrite
Partition replacement
📝dbt Model Code
— models/marts/product_events.sql {{ config( materialized = ‘incremental’, incremental_strategy = ‘append’, on_schema_change = ‘fail’ ) }} WITH source_data AS ( SELECT product_id, event_type, quantity, price, quantity * price AS total_value, created_at, CURRENT_TIMESTAMP() AS updated_at, — Generate surrogate key using dbt_utils {{ dbt_utils.generate_surrogate_key([ ‘product_id’, ‘created_at’ ]) }} AS surrogate_key FROM {{ ref(‘staging_product_events’) }} {% if is_incremental() %} — Only process new records since last run WHERE created_at > ( SELECT MAX(created_at) FROM {{ this }} ) {% endif %} ) SELECT * FROM source_data
📈 Processing Results
15,600
Total Rows
0
New Rows
0
Updated
0s
Process Time
💡 Append Strategy: Only adds new records based on a timestamp. Perfect for immutable event data. No unique_key needed, fastest performance.
product_id event_type quantity price created_at updated_at surrogate_key
Click “Run Incremental” to process data