Before MySQL 8.0, comparing a row to its neighboring row required a self-join or a complex correlated subquery β both slow and hard to read. LEAD and LAG solve this in a single, clean pass.
LAG(column, offset, default): Returns the value from a column N rows behind the current row within the same window/partition. Looks "backward."LEAD(column, offset, default): Returns the value from a column N rows ahead of the current row. Looks "forward."
Parameters:
columnβ the column value to retrieveoffsetβ how many rows back/forward (default is 1)defaultβ value to return when there's no previous/next row (default is NULL)
Time-series and sequential analysis constantly requires comparing a value to its adjacent rows:
- "How much did revenue grow compared to yesterday?"
- "How long did this session last? (current logout - previous login)"
- "Did the stock price go up or down from the last trading day?"
- "What was the previous status of this order before it changed?"
Before LEAD/LAG, you'd do an ugly self-join like: ON t1.ID = t2.ID + 1 β which breaks whenever IDs have gaps.
LEAD and LAG don't perform a new search or join. They work on the already-sorted window partition in memory:
- MySQL sorts the rows within the partition (as specified by
ORDER BYinsideOVER()). - For each row, it moves a memory pointer N positions backward (LAG) or forward (LEAD).
- It reads the value at that pointer position.
Because it's just pointer arithmetic on an already-sorted result, LEAD and LAG are extremely fast β O(1) per row after the initial sort.
-- Basic LEAD and LAG
SELECT
Date,
Daily_Revenue,
LAG(Daily_Revenue) OVER(ORDER BY Date) AS Yesterday_Revenue,
LEAD(Daily_Revenue) OVER(ORDER BY Date) AS Tomorrow_Revenue
FROM Daily_Sales;
-- With explicit offset (look 2 rows back instead of 1)
SELECT
Quarter,
Revenue,
LAG(Revenue, 2) OVER(ORDER BY Quarter) AS Revenue_2_Quarters_Ago
FROM Quarterly_Revenue;
-- With default value (show 0 instead of NULL on first/last rows)
SELECT
Month,
Revenue,
LAG(Revenue, 1, 0) OVER(ORDER BY Month) AS Last_Month_Revenue,
Revenue - LAG(Revenue, 1, 0) OVER(ORDER BY Month) AS Growth_Amount
FROM Monthly_Sales;
-- With PARTITION BY: separate LAG per entity
SELECT
Store_ID,
Sale_Date,
Daily_Sales,
LAG(Daily_Sales) OVER(PARTITION BY Store_ID ORDER BY Sale_Date) AS Prev_Day_Sales,
Daily_Sales - LAG(Daily_Sales) OVER(PARTITION BY Store_ID ORDER BY Sale_Date) AS Day_Change
FROM Store_Sales;Example 1 β Month-over-Month Revenue Growth (Finance):
SELECT
Month_Year,
Revenue,
LAG(Revenue) OVER(ORDER BY Month_Year) AS Last_Month,
Revenue - LAG(Revenue) OVER(ORDER BY Month_Year) AS Dollar_Change,
ROUND(
(Revenue - LAG(Revenue) OVER(ORDER BY Month_Year)) /
NULLIF(LAG(Revenue) OVER(ORDER BY Month_Year), 0) * 100,
2) AS Pct_Change
FROM Monthly_Revenue
ORDER BY Month_Year;Example 2 β Stock Price Daily Change:
SELECT
Trade_Date,
Ticker,
Closing_Price,
LAG(Closing_Price) OVER(PARTITION BY Ticker ORDER BY Trade_Date) AS Prev_Close,
CASE
WHEN Closing_Price > LAG(Closing_Price) OVER(PARTITION BY Ticker ORDER BY Trade_Date) THEN 'π Up'
WHEN Closing_Price < LAG(Closing_Price) OVER(PARTITION BY Ticker ORDER BY Trade_Date) THEN 'π Down'
ELSE 'β‘οΈ Unchanged'
END AS Direction
FROM Stock_Prices;Example 3 β User Session Duration (LEAD for logout time):
SELECT
User_ID,
Event_Type,
Event_Time,
LEAD(Event_Time) OVER(PARTITION BY User_ID ORDER BY Event_Time) AS Next_Event_Time,
TIMESTAMPDIFF(MINUTE, Event_Time,
LEAD(Event_Time) OVER(PARTITION BY User_ID ORDER BY Event_Time)
) AS Minutes_Until_Next_Event
FROM User_Activity_Log
WHERE Event_Type IN ('Login', 'Logout');The first row has no "previous" row β LAG returns NULL.
The last row has no "next" row β LEAD returns NULL.
-- Without default: First row shows NULL for LAG
SELECT Date, Revenue, LAG(Revenue) OVER(ORDER BY Date) AS Prev;
-- Result: 2024-01-01 | 5000 | NULL β NULL because nothing before it
-- With default value 0:
SELECT Date, Revenue, LAG(Revenue, 1, 0) OVER(ORDER BY Date) AS Prev;
-- Result: 2024-01-01 | 5000 | 0 β 0 instead of NULL
-- Use COALESCE for custom formatting:
SELECT Date, Revenue, COALESCE(LAG(Revenue) OVER(ORDER BY Date), 'N/A (First Month)') AS Prev;- Missing PARTITION BY in multi-entity reports: Without
PARTITION BY Store_ID, LAG for Store B will look at the last row of Store A β completely incorrect data. Always partition when you have multiple entities. - Assuming ORDER BY and PARTITION match the outer query sort: The
ORDER BYinsideOVER()is independent of the query's finalORDER BY. They can be different. - Not handling NULLs from boundary rows: If you compute
Revenue - LAG(Revenue)and the first row returns NULL for LAG, the result is NULL (not 0). Use a default value or COALESCE.
- Index the column used in
OVER(ORDER BY ...)to avoid a full sort (FileSort) on large tables. - Use LEAD/LAG over Correlated Subqueries: The old way of "look at previous row" required
WHERE id = (SELECT MAX(id) FROM t WHERE id < current_id)β complex and O(NΒ²). LAG/LEAD is O(N log N) β dramatically faster. - Store LAG expressions in a CTE if you need to reuse the same value multiple times:
WITH Sales_With_Lag AS (
SELECT Date, Revenue, LAG(Revenue) OVER(ORDER BY Date) AS Prev_Revenue
FROM Monthly_Sales
)
SELECT Date, Revenue, Prev_Revenue,
Revenue - Prev_Revenue AS Change,
ROUND((Revenue - Prev_Revenue)/Prev_Revenue*100, 1) AS Pct
FROM Sales_With_Lag;- Task 1: Write a query that shows each
Order_DateandTotal_Amountfor a customer, alongside aPrevious_Order_Amountcolumn showing the amount from the immediately preceding order (for the SAME customer, usingPARTITION BY). - Task 2: What happens when
LAG()is used on the first row of a partition? How do you handle it gracefully? - Task 3: Why is
LAG(Revenue, 1, 0) OVER(...)better thanCOALESCE(LAG(Revenue) OVER(...), 0)in terms of readability? (They produce the same result β which is more idiomatic SQL?)