By the end of this lab, you will be able to: - Understand different approaches to solving the same SQL problem - Compare CASE expressions with CTE (Common Table Expression) approaches - Analyze trade-offs between readability, performance, and maintainability - Choose the appropriate query pattern based on context and requirements - Debug complex queries by breaking them down into logical steps
In SQL, there are often multiple valid ways to solve the same problem. While all approaches may produce correct results, they can differ significantly in:
This lab explores these trade-offs by examining the same problem solved with different query patterns.
Note: This lab uses the same dataset from the CASE Expressions lab. If you've already created the `case_lab` database, you can skip the setup and jump to the comparisons.
-- You can use the same database from Lab 06
USE case_lab;
-- Verify data exists
SELECT COUNT(*) FROM app_events;
Find user sessions that indicate a poor user experience - sessions where users:
SELECT
session_id,
user_id,
TIMESTAMPDIFF(MINUTE, MIN(event_timestamp), MAX(event_timestamp)) AS session_duration_minutes,
COUNT(CASE WHEN event_type = 'scroll' THEN 1 END) AS scroll_count,
COUNT(CASE WHEN event_type = 'click' THEN 1 END) AS click_count
FROM app_events
GROUP BY session_id, user_id
HAVING session_duration_minutes > 30
AND scroll_count >= 5
AND click_count / scroll_count < 0.2
AND COUNT(CASE WHEN event_type = 'purchase' THEN 1 END) = 0
ORDER BY scroll_count DESC, session_id;
Strengths:
Weaknesses:
WITH A AS (
-- Step 1: Count events by type for each session
SELECT session_id, user_id, event_type, COUNT(*) AS c
FROM app_events
GROUP BY session_id, user_id, event_type
),
E AS (
-- Step 2: Ensure all sessions have a 'click' row (even if count is 0)
SELECT * FROM A
UNION
SELECT session_id, user_id, 'click', 0
FROM A
WHERE session_id NOT IN (
SELECT session_id FROM A WHERE event_type = 'click'
)
),
T AS (
-- Step 3: Calculate session duration in seconds
SELECT
session_id,
UNIX_TIMESTAMP(MAX(event_timestamp)) - UNIX_TIMESTAMP(MIN(event_timestamp)) AS d
FROM app_events
GROUP BY session_id
),
TT AS (
-- Step 4: Join everything together
SELECT
T.session_id,
E_c.user_id,
E_s.c AS scroll_count,
E_c.c AS clicks,
T.d AS duration
FROM E E_c
JOIN E E_s ON E_c.session_id = E_s.session_id
JOIN T ON E_c.session_id = T.session_id
WHERE E_s.event_type = 'scroll'
AND E_c.event_type = 'click'
)
-- Step 5: Apply filters and format output
SELECT
session_id,
user_id,
duration/60 AS session_duration_minutes,
scroll_count,
clicks AS click_count
FROM TT
WHERE session_id NOT IN (
SELECT session_id FROM E WHERE event_type = 'purchase'
)
AND duration > 30*60
AND scroll_count >= 5
AND clicks/scroll_count < 0.2
ORDER BY scroll_count DESC, session_id;
Strengths:
Weaknesses:
| Aspect | CASE Expression | CTE Approach |
|---|---|---|
| Lines of Code | ~10 lines | ~45 lines |
| Performance | โก Better - single scan | โ ๏ธ Potentially slower - multiple operations |
| Readability (Experienced developers) | โก High - compact and clear | โ Good - explicit steps |
| Readability (Beginners) | โ ๏ธ Requires CASE knowledge | โก Better - step-by-step logic |
| Debugging | โ ๏ธ Harder - can't inspect intermediate steps easily | โก Easier - query each CTE separately |
| Maintainability | โก Better - less code to maintain | โ ๏ธ More complex - multiple dependencies |
| Production Use | โก Preferred - efficient and concise | โ ๏ธ Use cautiously - verify performance |
| Learning/Teaching | โ Good for teaching conditional aggregation | โก Excellent for showing query decomposition |
Often, the best solution combines techniques strategically:
-- Use CTE for complex calculation, CASE for conditional aggregation
WITH session_stats AS (
SELECT
session_id,
user_id,
TIMESTAMPDIFF(MINUTE, MIN(event_timestamp), MAX(event_timestamp)) AS duration_minutes,
COUNT(CASE WHEN event_type = 'scroll' THEN 1 END) AS scrolls,
COUNT(CASE WHEN event_type = 'click' THEN 1 END) AS clicks,
COUNT(CASE WHEN event_type = 'purchase' THEN 1 END) AS purchases
FROM app_events
GROUP BY session_id, user_id
)
SELECT *
FROM session_stats
WHERE duration_minutes > 30
AND scrolls >= 5
AND clicks / scrolls < 0.2
AND purchases = 0
ORDER BY scrolls DESC;
Benefits:
For each scenario below, decide whether to use CASE, CTEs, or a hybrid approach. Justify your choice.
In this lab, you compared CASE expressions with CTE approaches for the same problem. You learned that:
Remember: The "best" query is the one that's correct, maintainable, and performant enough for your use case!