By the end of this lab, you will be able to: - Understand and use both Simple and Searched CASE expressions - Apply CASE expressions in SELECT, WHERE, HAVING, and ORDER BY clauses - Use CASE with aggregate functions for conditional counting and summing - Transform and categorize data using CASE expressions - Apply CASE expressions to solve real-world business problems
The CASE expression is one of SQL's most powerful and versatile tools. It allows you to implement conditional logic directly within your SQL queries, making your code more readable and often more efficient.
-- Simple CASE (comparing against a single column)
CASE column_name
WHEN value1 THEN result1
WHEN value2 THEN result2
ELSE default_result
END
-- Searched CASE (more flexible with conditions)
CASE
WHEN condition1 THEN result1
WHEN condition2 THEN result2
ELSE default_result
END
We'll work with an app_events table that tracks user interactions in a mobile application. This realistic scenario demonstrates how CASE expressions can analyze user behavior patterns.
-- Create database
DROP DATABASE IF EXISTS case_lab;
CREATE DATABASE case_lab;
USE case_lab;
-- Create app_events table
CREATE TABLE app_events (
event_id INT AUTO_INCREMENT PRIMARY KEY,
session_id INT NOT NULL,
user_id INT NOT NULL,
event_type VARCHAR(20) NOT NULL,
event_timestamp TIMESTAMP NOT NULL,
INDEX idx_session (session_id),
INDEX idx_user (user_id),
INDEX idx_type (event_type)
);
-- Insert sample data with various event types
INSERT INTO app_events (session_id, user_id, event_type, event_timestamp) VALUES
-- Session 1: Active session with lots of scrolling, few clicks, no purchase
(1, 101, 'scroll', '2024-01-15 10:00:00'),
(1, 101, 'scroll', '2024-01-15 10:05:00'),
(1, 101, 'scroll', '2024-01-15 10:10:00'),
(1, 101, 'scroll', '2024-01-15 10:15:00'),
(1, 101, 'scroll', '2024-01-15 10:20:00'),
(1, 101, 'scroll', '2024-01-15 10:25:00'),
(1, 101, 'scroll', '2024-01-15 10:30:00'),
(1, 101, 'click', '2024-01-15 10:35:00'),
(1, 101, 'scroll', '2024-01-15 10:40:00'),
-- Session 2: Short session, doesn't meet criteria
(2, 102, 'scroll', '2024-01-15 11:00:00'),
(2, 102, 'scroll', '2024-01-15 11:05:00'),
(2, 102, 'click', '2024-01-15 11:08:00'),
-- Session 3: Session with purchase (should be excluded)
(3, 103, 'scroll', '2024-01-15 12:00:00'),
(3, 103, 'scroll', '2024-01-15 12:10:00'),
(3, 103, 'scroll', '2024-01-15 12:20:00'),
(3, 103, 'scroll', '2024-01-15 12:30:00'),
(3, 103, 'scroll', '2024-01-15 12:40:00'),
(3, 103, 'scroll', '2024-01-15 12:50:00'),
(3, 103, 'click', '2024-01-15 12:55:00'),
(3, 103, 'purchase', '2024-01-15 13:00:00'),
-- Session 4: Long session with many scrolls, few clicks, no purchase
(4, 104, 'scroll', '2024-01-15 14:00:00'),
(4, 104, 'scroll', '2024-01-15 14:08:00'),
(4, 104, 'scroll', '2024-01-15 14:16:00'),
(4, 104, 'scroll', '2024-01-15 14:24:00'),
(4, 104, 'scroll', '2024-01-15 14:32:00'),
(4, 104, 'scroll', '2024-01-15 14:40:00'),
(4, 104, 'scroll', '2024-01-15 14:48:00'),
(4, 104, 'scroll', '2024-01-15 14:56:00'),
(4, 104, 'scroll', '2024-01-15 15:04:00'),
(4, 104, 'click', '2024-01-15 15:12:00'),
-- Session 5: High scroll-to-click ratio session
(5, 105, 'scroll', '2024-01-15 16:00:00'),
(5, 105, 'scroll', '2024-01-15 16:07:00'),
(5, 105, 'scroll', '2024-01-15 16:14:00'),
(5, 105, 'scroll', '2024-01-15 16:21:00'),
(5, 105, 'scroll', '2024-01-15 16:28:00'),
(5, 105, 'scroll', '2024-01-15 16:35:00'),
(5, 105, 'scroll', '2024-01-15 16:42:00'),
(5, 105, 'scroll', '2024-01-15 16:49:00'),
(5, 105, 'scroll', '2024-01-15 16:56:00'),
(5, 105, 'scroll', '2024-01-15 17:03:00'),
(5, 105, 'click', '2024-01-15 17:10:00');
Business Goal: Identify user sessions that indicate a poor user experience - users who scroll a lot but don't click much and don't make purchases. These sessions may indicate: - Confusing UI/UX - Irrelevant content - Poor product recommendations - Potential bugs or performance issues
CASE expressions allow us to conditionally count and filter events directly within the query. Let's break down the solution step by step:
-- Write your MySQL query statement below
SELECT
session_id,
user_id,
TIMESTAMPDIFF(MINUTE, MIN(event_timestamp), MAX(event_timestamp)) AS session_duration_minutes,
COUNT(CASE WHEN event_type = 'scroll' THEN 1 END) AS scroll_count
FROM app_events
GROUP BY session_id, user_id
HAVING session_duration_minutes > 30
AND scroll_count >= 5
AND COUNT(CASE WHEN event_type = 'click' THEN 1 END) / scroll_count < 0.2
AND COUNT(CASE WHEN event_type = 'purchase' THEN 1 END) = 0
ORDER BY scroll_count DESC, session_id;
sql
COUNT(CASE WHEN event_type = 'scroll' THEN 1 END)This is more readable than filtering with WHERE
Session Duration Calculation:
sql
TIMESTAMPDIFF(MINUTE, MIN(event_timestamp), MAX(event_timestamp))
Calculates the difference between first and last event in minutes
Click-to-Scroll Ratio:
sql
COUNT(CASE WHEN event_type = 'click' THEN 1 END) / scroll_count < 0.2
Divides click count by scroll count to get the ratio
No Purchase Condition:
sql
COUNT(CASE WHEN event_type = 'purchase' THEN 1 END) = 0
๐ก Note: For students interested in comparing CASE expressions with alternative query approaches like CTEs (Common Table Expressions), see the Query Patterns and Optimization lab in the advanced section.
Write a query to categorize sessions as 'short' (< 15 min), 'medium' (15-45 min), or 'long' (> 45 min):
-- Your solution here
SELECT
session_id,
user_id,
TIMESTAMPDIFF(MINUTE, MIN(event_timestamp), MAX(event_timestamp)) AS duration,
CASE
WHEN TIMESTAMPDIFF(MINUTE, MIN(event_timestamp), MAX(event_timestamp)) < 15 THEN 'short'
WHEN TIMESTAMPDIFF(MINUTE, MIN(event_timestamp), MAX(event_timestamp)) <= 45 THEN 'medium'
ELSE 'long'
END AS session_category
FROM app_events
GROUP BY session_id, user_id;
Write a query that shows for each session:
- Total events
- Number of scrolls
- Number of clicks
- Number of purchases
- An engagement score (scrolls * 1 + clicks * 3 + purchases * 10)
-- Your solution here
SELECT
session_id,
user_id,
COUNT(*) AS total_events,
COUNT(CASE WHEN event_type = 'scroll' THEN 1 END) AS scroll_count,
COUNT(CASE WHEN event_type = 'click' THEN 1 END) AS click_count,
COUNT(CASE WHEN event_type = 'purchase' THEN 1 END) AS purchase_count,
(COUNT(CASE WHEN event_type = 'scroll' THEN 1 END) * 1 +
COUNT(CASE WHEN event_type = 'click' THEN 1 END) * 3 +
COUNT(CASE WHEN event_type = 'purchase' THEN 1 END) * 10) AS engagement_score
FROM app_events
GROUP BY session_id, user_id
ORDER BY engagement_score DESC;
Write a query to find sessions where: - If the session has any purchase, include it regardless of other criteria - If no purchase, only include if scroll_count >= 5 and duration > 20 minutes
-- Your solution here
SELECT
session_id,
user_id,
TIMESTAMPDIFF(MINUTE, MIN(event_timestamp), MAX(event_timestamp)) AS duration,
COUNT(CASE WHEN event_type = 'scroll' THEN 1 END) AS scroll_count,
COUNT(CASE WHEN event_type = 'purchase' THEN 1 END) AS purchase_count
FROM app_events
GROUP BY session_id, user_id
HAVING
CASE
WHEN COUNT(CASE WHEN event_type = 'purchase' THEN 1 END) > 0 THEN 1
WHEN COUNT(CASE WHEN event_type = 'scroll' THEN 1 END) >= 5
AND TIMESTAMPDIFF(MINUTE, MIN(event_timestamp), MAX(event_timestamp)) > 20 THEN 1
ELSE 0
END = 1;
Write a query to classify users based on their total activity across all sessions: - 'Highly Active': >= 30 total events - 'Moderately Active': 15-29 total events - 'Low Activity': < 15 total events
Include user_id, total_events, and activity_level in the output.
-- Your solution here
SELECT
user_id,
COUNT(*) AS total_events,
CASE
WHEN COUNT(*) >= 30 THEN 'Highly Active'
WHEN COUNT(*) >= 15 THEN 'Moderately Active'
ELSE 'Low Activity'
END AS activity_level
FROM app_events
GROUP BY user_id
ORDER BY total_events DESC;
-- Calculate engagement score with different weights
SELECT
session_id,
user_id,
SUM(CASE
WHEN event_type = 'scroll' THEN 1
WHEN event_type = 'click' THEN 3
WHEN event_type = 'purchase' THEN 10
ELSE 0
END) AS weighted_engagement
FROM app_events
GROUP BY session_id, user_id;
-- Order by purchase first, then by scroll count
SELECT
session_id,
user_id,
COUNT(CASE WHEN event_type = 'purchase' THEN 1 END) AS purchases,
COUNT(CASE WHEN event_type = 'scroll' THEN 1 END) AS scrolls
FROM app_events
GROUP BY session_id, user_id
ORDER BY
CASE WHEN COUNT(CASE WHEN event_type = 'purchase' THEN 1 END) > 0 THEN 0 ELSE 1 END,
COUNT(CASE WHEN event_type = 'scroll' THEN 1 END) DESC;
In this lab, you learned: - How to use CASE expressions for conditional logic - The difference between Simple and Searched CASE - How CASE expressions work with aggregate functions for conditional counting - Using CASE in different query clauses (SELECT, WHERE, HAVING, ORDER BY) - Real-world applications of CASE expressions in data analysis - Performance considerations when using CASE expressions
Keep practicing! CASE expressions are fundamental to becoming proficient in SQL data analysis.