Analytics Engineering
Module 7 Quiz
1. A Jupyter Notebook is best described as an environment that lets you combine:
A) Only SQL queries
B) Code, results, and Markdown in one place
C) Web dashboards only
D) Only terminal commands
2. A key reason analytics engineers use notebooks is:
A) Production orchestration only
B) Rapid prototyping and exploratory data analysis
C) GPU model training only
D) Schema migration only
3. Why is Python popular for analytics engineering?
A) Small community, low support
B) Limited libraries for data
C) Large ecosystem (pandas, NumPy, scikit-learn) and strong community
D) Only works on Windows
4. A recommended way to install Python & Jupyter is:
A) Copying binaries by hand
B) Using Conda/Anaconda or pip with virtual environments
C) Editing the system registry
D) Browser plug-ins
5. Python blocks are defined primarily by:
A) Curly braces
B) Semicolons
C) Indentation (typically 4 spaces)
D) Tabs only
6. The Python feature for embedding variables into strings is:
A) strcat
B) f-strings
C) format_map only
D) Byte strings
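Reference sketch for question 6: a minimal illustration of embedding variables in a string with an f-string. The variable names and values are hypothetical and chosen only for the example.

```python
# Hypothetical values, used only for illustration.
table_name = "orders"
row_count = 1250
null_rate = 0.25

# Expressions inside {} are evaluated in place; the part after ':' is a
# format spec (thousands separator, percentage with one decimal).
summary = f"{table_name}: {row_count:,} rows, {null_rate:.1%} nulls"
print(summary)
```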
7. Which data structure is ordered and mutable?
A) List
B) Tuple
C) Set
D) Dictionary keys
8. A tuple is ideal when you need:
A) A mutable sequence
B) An immutable, fixed collection (e.g., coordinates)
C) Automatic uniqueness
D) Key-value storage
9. A dictionary is best described as:
A) Ordered and immutable
B) Unordered with unique-only values
C) Key-value pairs with common methods like items(), keys(), values()
D) A 2D table for analytics
10. A set is useful when you need:
A) Duplicate-friendly ordered lists
B) An unordered collection of unique elements
C) Immutable key-value pairs
D) Strict indexing by position
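Reference sketch for questions 7 to 10: a small comparison of Python's four built-in collection types. All names and values are hypothetical; the snippet only shows how each type behaves.

```python
# A list is ordered and mutable: items can be appended or reassigned.
stops = ["pickup", "dropoff"]
stops.append("return")
stops[0] = "start"

# A tuple is ordered but immutable, which suits fixed records such as coordinates.
coordinates = (40.7128, -74.0060)
tuple_is_immutable = False
try:
    coordinates[0] = 0.0  # item assignment is not allowed on a tuple
except TypeError:
    tuple_is_immutable = True

# A dictionary stores key-value pairs; items() yields (key, value) tuples.
order = {"id": 17, "status": "delivered"}
pairs = list(order.items())

# A set keeps only unique elements and has no positional index.
cities = {"Austin", "Boston", "Austin"}

print(stops, tuple_is_immutable, pairs, len(cities))
```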
11. In pandas, the primary data structures are:
A) RDD and DataSet
B) Series (1D) and DataFrame (2D)
C) Table and View
D) Graph and Node
12. Pandas can read/write which formats out of the box?
A) Only CSV
B) CSV, Excel, SQL, JSON (among others)
C) Images only
D) Audio files only
13. To quickly inspect a DataFrame you would use:
A) head(), tail(), info(), describe()
B) compile(), explain(), optimize()
C) render(), plot3D()
D) unique(), ndim() only
14. Filtering rows in pandas commonly uses:
A) Boolean indexing (e.g., df[df["col"] > 0])
B) SQL cursors
C) XML selectors
D) Shell pipelines
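Reference sketch for question 14: one way row filtering with a boolean mask might look in pandas. The `orders` DataFrame and its columns are made up for illustration and assume pandas is installed.

```python
import pandas as pd

# Hypothetical sample data.
orders = pd.DataFrame(
    {
        "order_id": [1, 2, 3, 4],
        "city": ["Austin", "Boston", "Austin", "Boston"],
        "total": [12.5, 0.0, 30.0, 18.0],
    }
)

# A comparison on a column yields a boolean Series (the mask);
# indexing the DataFrame with it keeps only the rows where it is True.
mask = orders["total"] > 10
large_orders = orders[mask]

# Conditions are combined with & (and) or | (or), each wrapped in parentheses.
austin_large = orders[(orders["total"] > 10) & (orders["city"] == "Austin")]

print(large_orders)
print(austin_large)
```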
15. Typical data-cleaning methods in pandas include:
A) dropna(), fillna(), rename(), astype(), and string .str methods
B) clear(), reset() only
C) compress(), encrypt()
D) mkdir(), chmod()
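Reference sketch for question 15: a short cleaning chain, assuming pandas is installed. The `raw` DataFrame, its column names, and the choice to fill missing amounts with zero are assumptions made for the example, not a recommended policy.

```python
import pandas as pd

# Hypothetical messy input: inconsistent names, missing values, numbers stored as text.
raw = pd.DataFrame(
    {
        "Customer Name": ["  ana ", "BEN", None, "cy"],
        "amount": ["10", None, "5", "7"],
    }
)

clean = (
    raw.rename(columns={"Customer Name": "customer_name"})  # consistent column name
    .dropna(subset=["customer_name"])  # drop rows with no customer
    .fillna({"amount": "0"})  # assumption: a missing amount means zero
    .astype({"amount": "int64"})  # text to integer
)

# String methods are available through the .str accessor.
clean["customer_name"] = clean["customer_name"].str.strip().str.title()

print(clean)
```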
16. To summarize data by groups you would use:
A) groupby() with agg() (e.g., sum, mean, count)
B) over() window only
C) collect_set()
D) explode() exclusively
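Reference sketch for question 16: a grouped summary in pandas, assuming pandas is installed. The data and column names are hypothetical.

```python
import pandas as pd

# Hypothetical sample data.
orders = pd.DataFrame(
    {
        "city": ["Austin", "Austin", "Boston"],
        "total": [10.0, 30.0, 8.0],
    }
)

# Split rows by city, then compute several aggregates of `total` per group.
summary = orders.groupby("city")["total"].agg(["sum", "mean", "count"])

print(summary)
```

The result is indexed by `city`, with one column per aggregate.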
17. To create pivoted summaries in pandas you would use:
A) pivot_table()
B) reshape()
C) melt_only()
D) transpose_only()
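Reference sketch for question 17: a pivoted summary in pandas, assuming pandas is installed. The data, the column names, and the choice of `sum` as the aggregate are illustrative assumptions.

```python
import pandas as pd

# Hypothetical sample data.
orders = pd.DataFrame(
    {
        "city": ["Austin", "Austin", "Boston", "Boston"],
        "channel": ["app", "web", "app", "app"],
        "total": [10, 30, 8, 2],
    }
)

# One row per city, one column per channel, cells hold the summed totals.
# fill_value=0 replaces combinations that have no rows (Boston/web here).
pivot = orders.pivot_table(
    index="city",
    columns="channel",
    values="total",
    aggfunc="sum",
    fill_value=0,
)

print(pivot)
```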
18. For orchestration and reliability in real workflows, the module suggests:
A) Manual daily runs only
B) Scheduling with tools like Airflow, plus automated quality checks and alerts
C) Emailing CSVs by hand
D) Running notebooks once per year