Reusable Window Functions
Learn how to define and reuse window specifications across multiple window functions using named windows in Databricks SQL and PySpark.
Summary
- Named window specifications allow you to define a window once and reference it in multiple window functions
- This improves code readability and reduces duplication when using the same window definition repeatedly
- Supported in both Databricks SQL and PySpark DataFrame API
Detail
Window functions are a powerful tool for performing calculations across a set of rows related to the current row. However, when you need to apply the same window definition to multiple functions, the query can quickly become verbose and hard to read.
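To see the duplication problem concretely, here is a minimal runnable sketch using Python's built-in `sqlite3` module, which supports standard SQL window functions (SQLite 3.25+). The table name and sample data are hypothetical, but the repeated `OVER (...)` clause is exactly the verbosity that named windows eliminate; the same pattern applies in Databricks SQL:

```python
import sqlite3

# In-memory database with hypothetical sample data
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales_data (product TEXT, date TEXT, sales INTEGER);
    INSERT INTO sales_data VALUES
        ('widget', '2024-01-01', 10),
        ('widget', '2024-01-02', 20),
        ('gadget', '2024-01-01', 5);
""")

# Without a named window, the identical spec is repeated in every OVER clause
rows = conn.execute("""
    SELECT product,
           SUM(sales) OVER (PARTITION BY product ORDER BY date) AS running_sales,
           AVG(sales) OVER (PARTITION BY product ORDER BY date) AS avg_sales
    FROM sales_data
    ORDER BY product, date
""").fetchall()
```

With a third or fourth window function, each one would repeat the same `PARTITION BY ... ORDER BY ...` text again.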
Named window specifications solve this by allowing you to declare the window spec once and reference it by name. Here’s how it works:
Syntax:
```sql
WINDOW { window_name AS window_spec } [, ...]
```
Where:
- `window_name`: an identifier used to reference the window spec
- `window_spec`: the window definition to share across functions
Example:
```sql
SELECT product,
       SUM(sales) OVER monthly_window AS monthly_sales,
       AVG(sales) OVER monthly_window AS avg_monthly_sales
FROM sales_data
WINDOW monthly_window AS (PARTITION BY product ORDER BY date)
```
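The named-window query can be exercised end to end using Python's built-in `sqlite3` module, which supports the same standard `WINDOW` clause (SQLite 3.25+). This is an illustrative sketch with hypothetical sample data, not Databricks itself:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales_data (product TEXT, date TEXT, sales INTEGER);
    INSERT INTO sales_data VALUES
        ('widget', '2024-01', 10),
        ('widget', '2024-02', 30);
""")

# monthly_window is declared once and referenced by both aggregates
rows = conn.execute("""
    SELECT product,
           SUM(sales) OVER monthly_window AS monthly_sales,
           AVG(sales) OVER monthly_window AS avg_monthly_sales
    FROM sales_data
    WINDOW monthly_window AS (PARTITION BY product ORDER BY date)
    ORDER BY date
""").fetchall()
```

Note the placement: the `WINDOW` clause comes after `FROM` (and any `WHERE`/`GROUP BY`/`HAVING`) but before `ORDER BY`.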
In PySpark, the same reuse is achieved by assigning the window specification to a variable and passing it to each window function:

```python
from pyspark.sql import Window
from pyspark.sql import functions as F

# Define the window specification once
window = Window.partitionBy("product").orderBy("date")

# Reference it in multiple window functions
sales_data = sales_data.select(
    "product",
    F.sum("sales").over(window).alias("monthly_sales"),
    F.avg("sales").over(window).alias("avg_monthly_sales"),
)
```
Benefits
Why use named windows?
- Makes queries with multiple window functions more concise and readable
- Allows easy reuse of complex window definitions
- Helps avoid errors from repeating the same window spec multiple times
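As a sketch of the "complex window definitions" point, here is a trailing two-row frame defined once and shared by three aggregates. It again uses Python's built-in `sqlite3` for a runnable illustration (the table, data, and window name are hypothetical); repeating a multi-line frame spec like this inline three times is exactly where copy-paste errors creep in:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales_data (product TEXT, date TEXT, sales INTEGER);
    INSERT INTO sales_data VALUES
        ('widget', '2024-01', 10),
        ('widget', '2024-02', 20),
        ('widget', '2024-03', 60);
""")

# One complex frame definition, reused by SUM, MIN, and MAX
rows = conn.execute("""
    SELECT date,
           SUM(sales) OVER trailing2 AS trailing_sum,
           MIN(sales) OVER trailing2 AS trailing_min,
           MAX(sales) OVER trailing2 AS trailing_max
    FROM sales_data
    WINDOW trailing2 AS (
        PARTITION BY product
        ORDER BY date
        ROWS BETWEEN 1 PRECEDING AND CURRENT ROW
    )
    ORDER BY date
""").fetchall()
```

Changing the frame (say, to a trailing three-row window) now means editing one place instead of three.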