Window Functions in SQL

Zachary Greenberg
3 min readSep 24, 2021

In beginner SQL, the majority of queries are centered around returning a single piece of information, or a single piece of information per group. As we get more and more advanced, it becomes clear that by doing this, some of the information can get lost and that there is much more to be seen. The way to access more information in SQL is through the utilization of window functions. We must open a window to get this data, no pun intended.

‘A window function performs a calculation across a set of table rows that are somehow related to the current row. This is comparable to the type of calculation that can be done with an aggregate function. But unlike regular aggregate functions, use of a window function does not cause rows to become grouped into a single output row — the rows retain their separate identities. Behind the scenes, the window function is able to access more than just the current row of the query result.’ — This is the explanation given from the PostgreSQL documentation.

So, like this explanation says, we can get more than just a single row as a result of our query. The prime examples of this involve running totals, rankings, and percentages.

The magic of a window function comes from the command: OVER().

I have created a simple sample table utilizing PostgreSQL for the examples. Here is the schema:

Screenshot of Table Schema created in PostgreSQL

Running Totals

A running total will give us a cumulative sum of a column. This will allow us to see how much a specific row can contribute to the total. In coding it works something like this:

SELECT flavor, SUM(cakes_sold) OVER (ORDER BY cakes_sold)
FROM cupcakes;

The code shown above will generate a running total of the number of cupcakes sold in the bakery. This will show you the contributions of the cake flavors to the overall sum of the cupcakes sold. Here is the output of that SELECT statement:

Rankings

Rankings will give us the rank of a particular group or column. Like a running total, we will be able to see the magnitude of each particular row. This will give us a ranking order of how the column’s contents performed. That explanation was probably a bit much, but lets see what this does in code:

SELECT flavor, RANK() OVER (ORDER BY cakes_sold)
FROM cupcakes;

To further clarify, here is what the output of this SELECT statement yields:

Because we ordered by cakes_sold, the top selling flavor will have the highest number in rank. This can be very useful when we have a lot a lot of data.

Percentiles

Percentiles are important because they will allow us to see a different part of the magnitude. We can see the percentiles that a row or group falls into.

SELECT flavor, CAST(PERCENT_RANK() OVER(ORDER BY cakes_sold) * 100 AS INTEGER) AS percent
FROM cupcakes;

This will show us which percentile a specific flavor’s sales falls into.

Based on this table, we can see the percentiles the number of sales fall into. We know that Red Velvet is in the highest percentile because it sold the most cupcakes, and Vanilla is in the lowest because it sold the least of the 4. Ideally this percentile would probably be better used for something like test scores.

To sum up, window functions in SQL help us to view more detailed information about our tables. Instead of getting a total number, we can get the contributions of all of the rows to the overall total number giving us more details to make our conclusions.

References

PostgreSQL documentation — https://www.postgresql.org/docs/9.1/tutorial-window.html

Window functions — https://mode.com/sql-tutorial/sql-window-functions/#row_number

--

--