Topic hub

Data Modeling

Star and snowflake schemas, slowly-changing dimensions, fact and dimension tables, and the modeling patterns analytics engineers ship in dbt.

Data modeling is the part of analytics engineering that distinguishes a senior engineer from a junior one. Anyone can write a query. Designing the layered set of tables a query *should* run against — staging, intermediate, marts, with the right grain and the right keys — is the actual craft.

This hub covers the modeling patterns analytics engineers use in dbt every day: star schemas, slowly-changing dimensions, fact and dimension tables, normalization vs denormalization, and the trade-offs that come up in real warehouse work. The articles are written for people who write SQL and want to write better SQL — not academic dimensional modeling theory.

Pair this hub with the dbt and SQL hubs, then build the capstone, where you'll design a star-schema mart layer for a marketing-attribution dataset in BigQuery.

What you'll learn

By the end of this path you can…

Design star schemas and know when to denormalize
Implement Type 2 slowly-changing dimensions in dbt with snapshots
Pick the right grain for a fact table
Choose between surrogate and natural keys with eyes open
Model a marketing-attribution or e-commerce dataset end-to-end
Answer modeling questions in technical screens

The learning path

From beginner to job-ready.

01 · Fundamentals
Tables, rows, keys, relationships — the vocabulary the rest of the curriculum is built on.
02 · Normalization
1NF, 2NF, 3NF in practice — and why analytics teams denormalize on purpose.
03 · Star + snowflake schemas
Fact and dimension tables, conformed dimensions, the grain conversation.
04 · Slowly-changing dimensions
Type 1, 2, 3 — and SCD2 with dbt snapshots in production.
05 · Keys
Surrogate vs natural; composite vs single-column; warehouse-specific patterns.
06 · Modeling in dbt
How the staging / intermediate / mart pattern maps to the modeling concepts above.

Articles

Read the playbook.

Practice

Drill the patterns.

Data Modeling Quiz (100 Questions)

Portfolio projects

Show, don't just claim.

In the course

Data Modeling and Architecture

12 lessons in this module

Common questions

Common questions about this topic.

Star schema or snowflake schema?

Star, almost always, for analytics workloads. Snowflaking normalizes dimensions into sub-dimensions, which adds joins to every query; modern columnar warehouses (BigQuery, Snowflake, Databricks) handle wide denormalized tables fine. The exception is a very large dimension table where storage matters more than query speed.
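
A minimal sketch of the star shape, using Python's built-in sqlite3 as a stand-in for a warehouse. The table and column names (dim_customer, fct_order_items, region) are made up for illustration and are not taken from the course material.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension: one row per customer, with region kept inline (denormalized).
    CREATE TABLE dim_customer (
        customer_key  INTEGER PRIMARY KEY,
        customer_name TEXT,
        region        TEXT
    );
    -- Fact: one row per order item, pointing at the dimension by key.
    CREATE TABLE fct_order_items (
        order_item_id INTEGER PRIMARY KEY,
        customer_key  INTEGER REFERENCES dim_customer (customer_key),
        amount        REAL
    );
    INSERT INTO dim_customer VALUES (1, 'Acme', 'EMEA'), (2, 'Globex', 'APAC');
    INSERT INTO fct_order_items VALUES (10, 1, 40.0), (11, 1, 60.0), (12, 2, 25.0);
""")

# A star query needs exactly one join per dimension it touches.
revenue_by_region = conn.execute("""
    SELECT d.region, SUM(f.amount) AS revenue
    FROM fct_order_items AS f
    JOIN dim_customer AS d ON d.customer_key = f.customer_key
    GROUP BY d.region
    ORDER BY d.region
""").fetchall()

print(revenue_by_region)  # [('APAC', 25.0), ('EMEA', 100.0)]
```

In a snowflaked version of the same model, region would live in its own table keyed from dim_customer, and this query would need a second join to reach it.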

When do I need SCD2?

Whenever a dimension's history affects analysis: customer subscription tier, employee department, product price tier. If reporting only ever cares about 'as of now', SCD1 (overwrite) is fine. dbt snapshots make SCD2 straightforward to implement: each captured version of a row is stamped with dbt_valid_from and dbt_valid_to.
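
To make the SCD2 mechanics concrete, here is a hand-rolled sketch in plain Python. It is not how dbt implements snapshots; the function names (apply_scd2, tier_as_of) and the valid_from / valid_to columns are assumptions chosen for illustration. The idea is only that a change closes the current row and appends a new one, so history stays queryable.

```python
from datetime import date


def apply_scd2(rows, customer_id, new_tier, change_date):
    """Record a tier change: close the current row and append a new current row."""
    current = next(
        (r for r in rows if r["customer_id"] == customer_id and r["valid_to"] is None),
        None,
    )
    if current is not None and current["tier"] == new_tier:
        return rows  # value unchanged, so no new version is written
    if current is not None:
        current["valid_to"] = change_date  # close out the previous version
    rows.append(
        {
            "customer_id": customer_id,
            "tier": new_tier,
            "valid_from": change_date,
            "valid_to": None,  # None marks the current version
        }
    )
    return rows


def tier_as_of(rows, customer_id, as_of):
    """Return the tier that was in effect on the given date, or None."""
    for r in rows:
        starts_before = r["valid_from"] <= as_of
        still_open = r["valid_to"] is None or as_of < r["valid_to"]
        if r["customer_id"] == customer_id and starts_before and still_open:
            return r["tier"]
    return None


history = []
apply_scd2(history, 42, "free", date(2024, 1, 1))
apply_scd2(history, 42, "pro", date(2024, 6, 1))

print(tier_as_of(history, 42, date(2024, 3, 15)))  # free
print(tier_as_of(history, 42, date(2024, 7, 1)))   # pro
```

With SCD1 the second call would simply overwrite "free" with "pro", and the March lookup could no longer be answered.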

What's the right grain for a fact table?

The most atomic event the business actually wants to analyze. Orders → one row per order_item, not one row per order, because line-item questions come up sooner or later. Web events → one row per event. Aggregate in downstream models: atomic rows can always be rolled up, but detail that was never stored cannot be recovered.
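
A small sketch of why the atomic grain is the safer choice, again using sqlite3 as a stand-in warehouse. The fct_order_items table and its sample rows are hypothetical. Both rollups below come from the same line-item table; had the table been stored at one row per order, the per-product question could not be answered at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Grain: one row per order item (line item), not one row per order.
    CREATE TABLE fct_order_items (
        order_id      INTEGER,
        order_item_id INTEGER PRIMARY KEY,
        product       TEXT,
        amount        REAL
    );
    INSERT INTO fct_order_items VALUES
        (1, 101, 'racket', 120.0),
        (1, 102, 'balls',   15.0),
        (2, 103, 'balls',   15.0);
""")

# Rolling up to order grain is a simple aggregate.
order_totals = conn.execute("""
    SELECT order_id, SUM(amount)
    FROM fct_order_items
    GROUP BY order_id
    ORDER BY order_id
""").fetchall()

# Product-level analysis is only possible because line items were kept.
product_totals = conn.execute("""
    SELECT product, SUM(amount)
    FROM fct_order_items
    GROUP BY product
    ORDER BY product
""").fetchall()

print(order_totals)    # [(1, 135.0), (2, 15.0)]
print(product_totals)  # [('balls', 30.0), ('racket', 120.0)]
```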

Surrogate or natural keys?

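Surrogate keys for warehouse-side joins (cheap to generate with dbt_utils.generate_surrogate_key, which hashes the columns you pass it). Keep the natural keys as columns too: they tie each row back to the source system and keep reloads idempotent, because the same source record always maps to the same row. The article on the topic walks through both with examples.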
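
The idea behind a hash-based surrogate key can be sketched in a few lines of Python. This illustrates the general technique only and does not reproduce the exact output of dbt_utils.generate_surrogate_key; the surrogate_key function, the separator, and the null placeholder are all assumptions made for this example.

```python
import hashlib


def surrogate_key(*natural_key_parts, sep="-", null_token="_null_"):
    """Build a deterministic key by hashing the natural key columns.

    None is replaced with a placeholder so that a missing value and an
    empty string do not hash to the same key.
    """
    normalized = [null_token if part is None else str(part) for part in natural_key_parts]
    return hashlib.md5(sep.join(normalized).encode("utf-8")).hexdigest()


# The same natural key always yields the same surrogate key, so reruns are stable.
key_a = surrogate_key("shopify", 1001)
key_b = surrogate_key("shopify", 1001)
key_c = surrogate_key("shopify", 1002)

print(key_a == key_b)  # True
print(key_a == key_c)  # False
```

Because the key is derived from the natural key rather than from a sequence, rebuilding the table from scratch produces the same keys, and fact tables can compute the matching foreign key without looking anything up in the dimension.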

Continue your path

Adjacent topics.

Start practicing this topic.

Graded exercises with hints, worked solutions, and a GPT tutor. Free to start, no credit card.

Read the playbook.

Data Modeling Basics: Star Schema vs. Snowflake Schema Explained

Explaining Fact and Dimension Tables for Beginners: Essential Concepts in Data Warehousing

Slowly Changing Dimensions Type 2 Explained: Complete Guide

Top Data Modeling Best Practices for Efficient Analytics Engineering

Understanding Database Normalization and Denormalization: Concepts, Forms, and Applications

Surrogate vs Natural Keys: Choosing the Right Primary Key for Databases

Show, don't just claim.

Sports Equipment Pro Shop

Data Forge: The Lost Metrics

Champion Fantasy League

Adjacent topics.

dbt

SQL

BigQuery

Interview Prep
