SQL Window Functions in Django
Javier Ayres
November 8, 2016
What are Window Functions?
Window functions can be seen as close relatives to our well-known aggregate functions. They perform a calculation over a set of rows and return a single value. Unlike aggregate functions, they do not group the input rows into a single output row, thus keeping the original information intact.
Window functions are part of the standard since version SQL:2003.
Setting up a test environment
Start your favorite editor with SQL support (make sure your RDBMS supports window functions) and create a table with some random data so we can go through all the subjects with working examples. We'll be using a table that represents connections from users to our system. Create a new database with a table as follows:
CREATE TABLE connections (
id INT,
user_id INT,
source_ip VARCHAR(15),
connection_timestamp TIMESTAMP,
bytes_transferred INT
)
Now populate it with:
INSERT INTO connections VALUES
(1, 1, '192.168.1.100', '2016-10-01T13:30:00', 15720000),
(2, 1, '192.168.1.100', '2016-10-01T00:40:16', 19500000),
(3, 2, '192.168.1.100', '2016-10-01T19:09:48', 9840000),
(4, 3, '192.168.1.200', '2016-10-01T22:40:58', 10000),
(5, 3, '192.168.1.200', '2016-10-01T13:15:03', 20310000),
(6, 3, '192.168.1.200', '2016-10-01T10:10:48', 2010000),
(7, 4, '192.168.1.200', '2016-10-01T23:25:21', 20310000),
(8, 4, '192.168.1.200', '2016-10-01T04:06:49', 810000),
(9, 1, '192.168.1.200', '2016-10-01T16:07:10', 91280000)
How to use them
Window functions are declared just as an aggregate function followed by an OVER
clause, which indicates exactly how rows are being grouped, plus a few other details that we'll dive into later.
We could query the average of bytes transferred per source IP across the connections from our table like this:
SELECT *, AVG(bytes_transferred) OVER (PARTITION BY source_ip) FROM connections
...which results in
id | user_id | source_ip | connection_timestamp | bytes_transferred | avg |
---|---|---|---|---|---|
1 | 1 | 192.168.1.100 | 2016-10-01 13:30:00.000000 | 15720000 | 15020000 |
2 | 1 | 192.168.1.100 | 2016-10-01 00:40:16.000000 | 19500000 | 15020000 |
3 | 2 | 192.168.1.100 | 2016-10-01 19:09:48.000000 | 9840000 | 15020000 |
4 | 3 | 192.168.1.200 | 2016-10-01 22:40:58.000000 | 10000 | 22455000 |
5 | 3 | 192.168.1.200 | 2016-10-01 13:15:03.000000 | 20310000 | 22455000 |
6 | 3 | 192.168.1.200 | 2016-10-01 10:10:48.000000 | 2010000 | 22455000 |
7 | 4 | 192.168.1.200 | 2016-10-01 23:25:21.000000 | 20310000 | 22455000 |
8 | 4 | 192.168.1.200 | 2016-10-01 04:06:49.000000 | 810000 | 22455000 |
9 | 1 | 192.168.1.200 | 2016-10-01 16:07:10.000000 | 91280000 | 22455000 |
Notice that we still have the original 9 rows with the result added as a new column. Our window function didn't alter the original input at all.
Also notice that we are using the good old AVG
function here. This is because all existing aggregate functions can be used as window functions. There are, though, a few new functions that can only be used in the latter form.
PARTITION
ing vs GROUP
ing
The group of rows onto which the window function is applied is called "partition". In its basic form, a partition is no different than a normal aggregate function group: simply a set of rows considered "equal" by some criteria, and the function will be applied over all these rows to return a single result. However, a window function is not executed once for each partition, but rather for each row, and the result may vary from row to row...
Enter the "window frame"
If my row belongs to a partition that doesn't change and the window function executes across all the rows within the partition, how can the result vary?
It can vary because the first part isn't totally true; the window function actually executes over a subset of the partition called "window frame". In our previous example the window frame was equal to the partition so this behavior was not exhibited, but we can alter the window frame like this:
SELECT *, AVG(bytes_transferred) OVER (PARTITION BY source_ip ORDER BY bytes_transferred) FROM connections
id | user_id | source_ip | connection_timestamp | bytes_transferred | avg |
---|---|---|---|---|---|
3 | 2 | 192.168.1.100 | 2016-10-01 19:09:48.000000 | 9840000 | 9840000 |
1 | 1 | 192.168.1.100 | 2016-10-01 13:30:00.000000 | 15720000 | 12780000 |
2 | 1 | 192.168.1.100 | 2016-10-01 00:40:16.000000 | 19500000 | 15020000 |
4 | 3 | 192.168.1.200 | 2016-10-01 22:40:58.000000 | 10000 | 10000 |
8 | 4 | 192.168.1.200 | 2016-10-01 04:06:49.000000 | 810000 | 410000 |
6 | 3 | 192.168.1.200 | 2016-10-01 10:10:48.000000 | 2010000 | 943333.333333333333 |
5 | 3 | 192.168.1.200 | 2016-10-01 13:15:03.000000 | 20310000 | 8690000 |
7 | 4 | 192.168.1.200 | 2016-10-01 23:25:21.000000 | 20310000 | 8690000 |
9 | 1 | 192.168.1.200 | 2016-10-01 16:07:10.000000 | 91280000 | 22455000 |
Now the avg
column shows different results for rows of the same partition because the window frame is different for each row. When using ORDER BY
, the window frame contains all the rows from the start of the partition to our current row, including all subsequent rows that are equal to the current one according to the ORDER BY
clause. Remember this behavior, else you will probably end up with results that don't seem to make sense. Of course, once you grasp the concept of the window frame you'll realize it's not a "strange behavior" but a powerful feature that will allow you to precisely define what rows a window function can see.
For example, let's look at the first 2 rows: we know the partition is defined by the source_ip
field, so the first 3 rows belong to it. In the first row, the window frame consists of only that row, so the average is equal to that row's bytes_transferred
. In the second row, the window frame goes from the first row to the current row, so the average is calculated with both their values.
Now take a look at row #7: the window frame here goes from row #4 (the first of this partition) to row #8, which is past our current row, because that row's bytes_transferred
is equal to the current row's according to the ORDER BY
criteria. This results in rows #7 and #8 having the same window frame, and thus the same avg
.
The window frame can be defined with a "frame clause" as well, but we won't dive into its details here. Postgresql documentation has a great explanation of how it works. If neither this nor the ORDER BY
clause is used, the window frame is the whole partition.
Window functions execution order
Window functions execute after WHERE
, HAVING
and GROUP BY
clauses, meaning that we can't use the results of window functions in any of these (without resorting to sub-queries) and also that these constructs will affect the input rows that end up forming our partitions.
Aggregate functions are also executed before window functions, which means that we can't use the results of a window function inside an aggregate function but we can do it the other way around.
Window functions vs typical SQL queries
Say we want to fetch the last connection from each user including all the columns of the table, we have a few alternatives before resorting to window functions. Let's review them:
Alternative 1: sub-query with aggregate function
Here we use a sub-query which returns the maximum connection_timestamp
from the current user and compares it against the current row's connection_timestamp
to check if they're equal. The drawback is that the sub-query is executed once per row, causing considerable performance issues when dealing with larger numbers of rows (~20s
in my PostgreSQL setup with 13k rows vs ~35ms
using the solution at the end with window functions).
SELECT * FROM connections WHERE connection_timestamp = (
SELECT max(connection_timestamp) FROM connections AS sub
WHERE sub.user_id = connections.user_id
)
Alternative 2: joining the table with itself
Now we join the table against itself on the condition that the user_id
is the same on both rows and that the row "on the left" has a smaller connection_timestamp
than the one "on the right". This query returns all connections for which another more recent connection by the same user exists. Finally, we select all the connections that don't exist in that set, and we have what we want.
The downside with this approach is readability (you really have to look into it to figure what it does) and the inability to fetch the "N" latest connections dynamically, as it would require to join the table against itself as many times as "latest connections" we want.
SELECT * FROM connections WHERE id NOT IN (
SELECT c1.id
FROM connections AS c1
JOIN connections AS c2 USING (user_id)
WHERE c1.connection_timestamp < c2.connection_timestamp
)
With window functions
Finally, with the help of the row_number
function and a sub-query, we can achieve our goal like this:
SELECT * FROM (
SELECT *, row_number() OVER (PARTITION BY user_id ORDER BY connection_timestamp DESC)
FROM connections
) AS sub WHERE row_number = 1
The row_number
window function simply returns the index of the current row within its partition starting from 1, so we are asking for all the connections that are first when ordered by the larger connection_timestamp
, partitioning by user_id
. Since we can't use this row_number
inside the WHERE
clause, we wrap the whole thing in a sub-query and do the filtering in the outer-query.
Django is in the title but you don't talk about it
Right! Well, window functions have the advantage that their syntax is contained in a single statement that goes in the SELECT
list where any other column would go (or in the ORDER BY
clause, where they are also allowed), so we don't need any special support to use them with Django's ORM.
A simple annotate
call with a RawSQL
expression is enough to make use of window functions.
from django.db.models.expressions import RawSQL
from connections.models import Connection
Connection.objects.annotate(row_number=RawSQL(
'row_number() OVER (PARTITION BY user_id ORDER BY connection_timestamp DESC)',
[]
))
Further reading
Gulp and commit hooks (gilp)
When creating a validation pipeline for your staged files, you don't have too many choices ending up with a huge bash script with countless functions for every linter, rule, or standard you want to automate.
Combining QuerySets in Django
In this article we'll demonstrate several approaches to merging QuerySets either with the same or different models in Django without relying on raw queries.
Photo by Cristopher Gower.
Categorized under sql / django / talks.Join our team
If you're passionate about building quality software and our values resonate with you, get in touch with us!