Interactive Lesson: Python Food Delivery
Python & Pandas: Food Delivery Analytics
Master data manipulation through real-world delivery data analysis
FoodDeliveryAnalytics.ipynb
Python 3.9 | Ready
Markdown
Welcome to Food Delivery Analytics!
In this interactive notebook, you’ll analyze real-world food delivery data to uncover insights about restaurant performance, delivery times, and customer satisfaction.
Dataset Overview:
- 10,000+ food delivery orders
- Restaurant ratings and cuisines
- Delivery times and distances
- Customer feedback and tips
Learning Objectives:
- Master Python data structures (lists, dictionaries, tuples)
- Use Pandas for data manipulation and analysis
- Clean and transform messy data
- Perform grouping and aggregations
- Build a complete ETL pipeline
Code
In [1]:
# Exercise 1: Python Basics – Working with Restaurant Data
# Let's start by creating basic data structures for our food delivery system

# Create a list of popular cuisines
cuisines = ["Italian", "Chinese", "Mexican", "Indian", "Thai"]

# Create a dictionary for a restaurant
restaurant = {
    "name": "Pizza Paradise",
    "cuisine": "Italian",
    "rating": 4.5,
    "delivery_time": 30,  # minutes
    "is_open": True
}

# Use f-strings to create a description
description = f"{restaurant['name']} serves {restaurant['cuisine']} food with a {restaurant['rating']}★ rating"
print(description)

# Create a tuple for coordinates (immutable)
restaurant_location = (37.7749, -122.4194)  # San Francisco (latitude, longitude)
# The longitude is negative because it lies west of Greenwich, so print its
# absolute value next to the "W" suffix
print(f"Location: {restaurant_location[0]}°N, {abs(restaurant_location[1])}°W")
Code
In [2]:
EXERCISE
Hints:
- Use cuisines.append("Japanese") to add items
- Dictionary format: {"key": "value", "key2": value2}
- Sets use curly braces: {"Zone1", "Zone2", "Zone3"}
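The three hints can be tried together in one short cell. This is only a sketch: the `driver` dictionary and the `delivery_zones` set are hypothetical names chosen for illustration, not part of the exercise's expected solution.

```python
# Starter list copied from Exercise 1
cuisines = ["Italian", "Chinese", "Mexican", "Indian", "Thai"]

# Lists are ordered and mutable: append adds one item to the end
cuisines.append("Japanese")

# Dictionaries map keys to values; the values may have different types
# (hypothetical example record)
driver = {"name": "Alex", "vehicle": "bike", "deliveries": 128}
driver["deliveries"] += 1  # update the value stored under an existing key

# Sets keep only unique items, so the repeated "Zone1" is stored once
delivery_zones = {"Zone1", "Zone2", "Zone3", "Zone1"}

print(cuisines[-1])
print(driver["deliveries"])
print(len(delivery_zones))
```

A set is unordered, so printing `delivery_zones` directly may show the zones in any order.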
Code
In [3]:
# Exercise 2: Introduction to Pandas – Loading Delivery Data
import pandas as pd
import numpy as np

# Create sample delivery data
delivery_data = {
    'order_id': [1001, 1002, 1003, 1004, 1005],
    'restaurant': ['Pizza Paradise', 'Sushi Supreme', 'Taco Town', 'Burger Barn', 'Pizza Paradise'],
    'cuisine': ['Italian', 'Japanese', 'Mexican', 'American', 'Italian'],
    'order_value': [25.99, 45.50, 18.75, 22.00, 31.25],
    'delivery_time': [28, 35, 22, 25, 32],
    'rating': [5, 4, 5, 3, 4],
    'tip': [5.00, 8.00, 3.00, 2.00, 6.00]
}

# Create DataFrame
df = pd.DataFrame(delivery_data)

# Display basic information
print("Food Delivery Orders Dataset")
print("=" * 40)
print(df.head())
print(f"\nDataset shape: {df.shape}")
print(f"Total orders: {len(df)}")
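Once a DataFrame is loaded, a common next step is to check the column types and derive a new column with vectorized arithmetic. The sketch below reuses three columns from the sample data above; `tip_pct` is a hypothetical column name introduced here for illustration.

```python
import pandas as pd

# Subset of the Exercise 2 sample data
df = pd.DataFrame({
    "order_id": [1001, 1002, 1003, 1004, 1005],
    "order_value": [25.99, 45.50, 18.75, 22.00, 31.25],
    "tip": [5.00, 8.00, 3.00, 2.00, 6.00],
})

# Column types and summary statistics for the numeric columns
print(df.dtypes)
print(df.describe())

# Arithmetic on columns is applied row by row without an explicit loop.
# tip_pct (hypothetical name) is the tip as a percentage of the order value.
df["tip_pct"] = (df["tip"] / df["order_value"] * 100).round(1)
print(df[["order_id", "tip_pct"]])
```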
Code
In [4]:
EXERCISE
Code
In [5]:
# Exercise 4: Data Filtering & Selection
# Expand our dataset
np.random.seed(42)
expanded_data = {
    'order_id': range(2001, 2021),
    'restaurant': np.random.choice(['Pizza Paradise', 'Sushi Supreme', 'Taco Town',
                                    'Burger Barn', 'Thai Terrace'], 20),
    'order_value': np.random.uniform(15, 60, 20).round(2),
    'delivery_time': np.random.randint(15, 45, 20),
    'rating': np.random.choice([3, 4, 5], 20, p=[0.2, 0.3, 0.5]),
    'is_peak_hour': np.random.choice([True, False], 20)
}
df_orders = pd.DataFrame(expanded_data)

# Filter high-value orders
high_value = df_orders[df_orders['order_value'] > 40]
print(f"High-value orders (>$40): {len(high_value)}")
print(high_value[['restaurant', 'order_value']])

# Filter by multiple conditions (each condition needs its own parentheses)
fast_good = df_orders[
    (df_orders['delivery_time'] < 25) &
    (df_orders['rating'] >= 4)
]
print(f"\nFast & highly-rated deliveries: {len(fast_good)}")
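The same multi-condition filter can also be written with `.loc` or `.query`, which some readers find easier to scan. The sketch below uses a small hand-written stand-in for `df_orders` (the four rows are made up) so that the result is easy to verify by eye.

```python
import pandas as pd

# Hand-written stand-in for df_orders (values are illustrative only)
df_orders = pd.DataFrame({
    "restaurant": ["Taco Town", "Burger Barn", "Thai Terrace", "Sushi Supreme"],
    "order_value": [18.75, 42.10, 55.00, 31.25],
    "delivery_time": [22, 35, 19, 24],
    "rating": [5, 3, 4, 3],
})

# Boolean mask, built the same way as in the exercise
mask = (df_orders["delivery_time"] < 25) & (df_orders["rating"] >= 4)

# .loc selects rows by mask and columns by label in one step
fast_good = df_orders.loc[mask, ["restaurant", "delivery_time", "rating"]]

# .query expresses the same row filter as a string expression
fast_good_q = df_orders.query("delivery_time < 25 and rating >= 4")

print(fast_good)
```

Both forms select the same rows; `.query` returns all columns, while the `.loc` call also narrows the columns.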
Code
In [6]:
CHALLENGE
Code
In [7]:
# Exercise 6: Grouping & Aggregations – Restaurant Performance Analysis
# Create comprehensive dataset
np.random.seed(42)
n_orders = 100
restaurants = ['Pizza Paradise', 'Sushi Supreme', 'Taco Town', 'Burger Barn', 'Thai Terrace']
full_data = {
    'order_id': range(5001, 5001 + n_orders),
    'restaurant': np.random.choice(restaurants, n_orders),
    'order_value': np.random.uniform(10, 75, n_orders).round(2),
    'delivery_time': np.random.normal(30, 8, n_orders).round().astype(int),
    'rating': np.random.choice([1, 2, 3, 4, 5], n_orders, p=[0.05, 0.1, 0.2, 0.35, 0.3]),
    'day_of_week': np.random.choice(['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun'], n_orders)
}
df_full = pd.DataFrame(full_data)

# Group by restaurant and analyze
# (a dict of lists produces two-level column labels such as ('order_value', 'mean'))
restaurant_stats = df_full.groupby('restaurant').agg({
    'order_value': ['mean', 'sum', 'count'],
    'delivery_time': 'mean',
    'rating': 'mean'
}).round(2)
print("Restaurant Performance Dashboard")
print("=" * 50)
print(restaurant_stats)

# Find best performer: index label of the highest mean rating
best_restaurant = restaurant_stats[('rating', 'mean')].idxmax()
print(f"\nBest rated restaurant: {best_restaurant}")
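Passing a dictionary of lists to `.agg` yields two-level column labels, which is why the lookup above needs the tuple `('rating', 'mean')`. An alternative is pandas named aggregation, which produces flat column names. The sketch below uses a tiny made-up dataset, and the output column names are assumptions chosen for readability.

```python
import pandas as pd

# Tiny illustrative dataset (not the randomly generated df_full)
df_full = pd.DataFrame({
    "restaurant": ["Taco Town", "Taco Town", "Burger Barn", "Burger Barn"],
    "order_value": [20.0, 30.0, 15.0, 25.0],
    "delivery_time": [25, 35, 20, 30],
    "rating": [5, 4, 3, 4],
})

# Named aggregation: new_column=(source_column, aggregation_function)
restaurant_stats = (
    df_full.groupby("restaurant")
    .agg(
        avg_order_value=("order_value", "mean"),
        total_revenue=("order_value", "sum"),
        order_count=("order_value", "count"),
        avg_delivery_time=("delivery_time", "mean"),
        avg_rating=("rating", "mean"),
    )
    .round(2)
)

# With flat columns, a plain string is enough to select the rating column
best_restaurant = restaurant_stats["avg_rating"].idxmax()
print(restaurant_stats)
print(best_restaurant)
```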
Code
In [8]:
FINAL PROJECT
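The project brief itself is not shown here, so the following is not its solution. It is a minimal sketch of the extract, transform, load shape that the learning objectives refer to, under stated assumptions: the raw CSV text, the function names `extract`, `transform`, and `load`, and the cleaning rules are all hypothetical.

```python
import io
import pandas as pd

# Hypothetical raw export containing a duplicate row, inconsistent names,
# and missing values; a real pipeline would read a file or database instead
RAW_CSV = """order_id,restaurant,order_value,rating
1,Taco Town,18.50,5
2,taco town ,22.00,
2,taco town ,22.00,
3,Burger Barn,,4
4,Burger Barn,30.00,3
"""

def extract(source):
    """Read raw orders from a file path or file-like object."""
    return pd.read_csv(source)

def transform(raw):
    """Deduplicate, drop unusable rows, normalize names, then aggregate."""
    clean = (
        raw.drop_duplicates(subset="order_id")
        .dropna(subset=["order_value"])  # an order without a value is unusable
        .copy()
    )
    clean["restaurant"] = clean["restaurant"].str.strip().str.title()
    # Assumed rule: fill missing ratings with the median rating
    clean["rating"] = clean["rating"].fillna(clean["rating"].median())
    return clean.groupby("restaurant", as_index=False).agg(
        orders=("order_id", "count"),
        revenue=("order_value", "sum"),
        avg_rating=("rating", "mean"),
    )

def load(summary, target):
    """Write the summary as CSV to a file path or file-like object."""
    summary.to_csv(target, index=False)

target = io.StringIO()  # in-memory stand-in for an output file
summary = transform(extract(io.StringIO(RAW_CSV)))
load(summary, target)
print(target.getvalue())
```

Keeping each stage in its own function makes it possible to test the cleaning rules separately from the input and output code.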
Markdown
Knowledge Check Quiz
Test your understanding of Python and Pandas concepts!
Question 1: Which data structure would be best for storing unique customer IDs?
List
Set
Tuple
Dictionary
Question 2: What Pandas method would you use to find missing values?
.dropna()
.isna()
.fillna()
.notna()
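After answering, both questions can be checked by running a few lines. The sketch below is for review only; the customer IDs and the `feedback` table are made-up values.

```python
import numpy as np
import pandas as pd

# Question 1: a set stores each value at most once
customer_ids = [101, 102, 101, 103, 102]
unique_customers = set(customer_ids)
print(len(customer_ids), len(unique_customers))

# Question 2: .isna() flags missing values; it does not remove or fill them
feedback = pd.DataFrame({
    "order_id": [1, 2, 3],
    "rating": [5, np.nan, 4],
    "tip": [np.nan, np.nan, 2.5],
})
missing_mask = feedback.isna()              # True wherever a value is missing
missing_per_column = feedback.isna().sum()  # count of missing values per column
print(missing_per_column)
```

By contrast, `.dropna()` removes rows or columns with missing values, `.fillna()` replaces them, and `.notna()` is the inverse mask of `.isna()`.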
Markdown
Congratulations!
You’ve completed the Python & Pandas Food Delivery Analytics exercise!
Key Skills Practiced:
- Python data structures (lists, dicts, tuples, sets)
- Pandas DataFrames and Series
- Data cleaning and transformation
- Grouping and aggregations
- Building ETL pipelines
- Best practices for analytics engineering
Next Steps:
- Practice with larger datasets
- Explore advanced Pandas functions
- Learn about data visualization with matplotlib/seaborn
- Build automated reporting pipelines
- Integrate with SQL databases