This content originally appeared on DEV Community and was authored by Marat Latypov
Usually piece of data in SQL is represented as a row in a table. Often it's convenient to represent it as a cell in pivot table. Crosstab
could help us in it.
Let's see how it works.
Preparing data
First of all let's prepare data. Here I created tables a
and b
, and so-called join table c
, which is related to a
and b
.
create sequence myseq start 1;
create table a
(
id bigint default nextval('myseq'::regclass) not null primary key,
name text NOT NULL
);
create table b
(
id bigint default nextval('myseq'::regclass) not null primary key,
name text NOT NULL
);
create table c
(
a_id bigint not null references a(id),
b_id bigint not null references b(id),
value text
);
Now let's seed tables with data.
insert into a (name) values ('a1'), ('a2'), ('a3');
-- Result:
-- +--+----+
-- |id|name|
-- +--+----+
-- |1 |a1 |
-- |2 |a2 |
-- |3 |a3 |
-- +--+----+
insert into b (name) values ('b1'), ('b2'), ('b3');
-- Result:
-- +--+----+
-- |id|name|
-- +--+----+
-- |4 |b1 |
-- |5 |b2 |
-- |6 |b3 |
-- +--+----+
I used one shared increment sequence for both tables, which is why the id
in table b
starts from 4 instead of 1.
Just keep this in mind.
Next let's populate the join table c
.
insert into c (a_id, b_id, value) values
(1, 4, 'v14'),
(1, 5, 'v15'),
(1, 6, 'v16'),
(2, 4, 'v24'),
(2, 5, 'v25'),
(2, 6, 'v26'),
(3, 4, 'v34'),
(3, 5, 'v35'),
(3, 6, 'v36');
Case #0 Cartesian product
Let's make a simple inner join.
select
a.name as a_name,
b.name as b_name,
c.value
from c
inner join a on a.id = c.a_id
inner join b on b.id = c.b_id
order by 1, 2;
-- Result:
-- +------+------+-----+
-- |a_name|b_name|value|
-- +------+------+-----+
-- |a1 |b1 |v14 |
-- |a1 |b2 |v15 |
-- |a1 |b3 |v16 |
-- |a2 |b1 |v24 |
-- |a2 |b2 |v25 |
-- |a2 |b3 |v26 |
-- |a3 |b1 |v34 |
-- |a3 |b2 |v35 |
-- |a3 |b3 |v36 |
-- +------+------+-----+
In the output result we see all the data in rows. Each row is a piece of information.
There is nothing special here
Case #1 Our pivot table using crosstab
Now let's make a pivot table from it
select *
from crosstab(
'select
a.name as a_name,
b.name as b_name,
c.value
from c
inner join a on a.id = c.a_id
inner join b on b.id = c.b_id
order by 1',
'select name from b order by 1'
) as crosstab_alias (a_name text, b1 text, b2 text, b3 text);
-- Result:
-- +------+---+---+---+
-- |a_name|b1 |b2 |b3 |
-- +------+---+---+---+
-- |a1 |v14|v15|v16|
-- |a2 |v24|v25|v26|
-- |a3 |v34|v35|v36|
-- +------+---+---+---+
Wow! This result looks much better!
Let's look at the query
crosstab(text source_sql, text category_sql)
Crosstab
function produces table and have 2 parameters : source_sql
and category_sql
.
-
source_sql
- SQL used to retrieve the source data -
category_sql
- SQL used to retrieve categories
Source data has some restrictions:
- The first column must be the row identifier (in our example
a_name
). It should be unique for each row in pivot table. - After row identifier column could be some extra fields. In our example we don't have them.
- The next mandatory column is the category identifier (in our example
b_name
). It should be unique for each column in pivot table. - The last mandatory column is the value (in our example
value
). It can be any data type.
One more important thing is table alias and column aliases in our example it's the line
as crosstab_alias (a_name text, b1 text, b2 text, b3 text)
The alias crosstab_alias
is quite common thing in SELECT
queries. It's nothing special — aliases are often assigned to tables in queries for easier reference later on.
But in our case it's vital, because table produced by crosstab
should have some name!
The next mandatory thing is column description with types. So it should contain:
- row identifier (for ex.
a_name text
) - optional extra fields, if you have some
- categories columns (for ex.
b1 text, b2 text, b3 text
)
That's it!
Case #2 Not existent data in pivot table
Now let's play with the data. What if we don't have full table?
delete from c; --delete previous data
insert into c (a_id, b_id, value) values
(1, 4, 'v14'),
-- (1, 5, 'v15'), removed line
(1, 6, 'v16'),
(2, 4, 'v24'),
(2, 5, 'v25'),
-- (2, 6, 'v26'), removed line
-- (3, 4, 'v34'), removed line
(3, 5, 'v35'),
(3, 6, 'v36');
When we run our previous crosstab
query on such data, we will get:
-- Result:
-- +------+----+----+----+
-- |a_name|b1 |b2 |b3 |
-- +------+----+----+----+
-- |a1 |v14 |null|v16 |
-- |a2 |v24 |v25 |null|
-- |a3 |null|v35 |v36 |
-- +------+----+----+----+
So we see now empty slots with no data are replaced with nulls!
Looks great for me!
Case #3 Duplicated data in pivot table
delete from c; --delete previous data if you need
insert into c (a_id, b_id, value) values
(1, 4, 'v14'),
(1, 5, 'v15'),
(1, 6, 'v16'),
(1, 6, 'v16 duplicate'),
(2, 4, 'v24'),
(2, 5, 'v25'),
(2, 6, 'v26'),
(3, 4, 'v34'),
(3, 5, 'v35'),
(3, 6, 'v36');
What does mean this duplicated data entry in our case, it looks like a nonsense.
But assume we have something like that and let's see what will happen.
Our crosstab
query will return result:
-- Result:
-- +------+---+---+-------------+
-- |a_name|b1 |b2 |b3 |
-- +------+---+---+-------------+
-- |a1 |v14|v15|v16 duplicate|
-- |a2 |v24|v25|v26 |
-- |a3 |v34|v35|v36 |
-- +------+---+---+-------------+
Ouch! From two data entries for relation a1
-b3
only the latest is shown. Why the latest? No idea, I don't know, it's on PostgreSQL implementation. Maybe it uses the latest value, maybe chooses randomly.
Anyway it sounds bad.
You shouldn't have any duplicated entries in you table. Maybe entries should be aggregated somehow.
Links
Here is a link to the code used in this article:
This content originally appeared on DEV Community and was authored by Marat Latypov

Marat Latypov | Sciencx (2025-07-13T16:45:20+00:00) Understanding PostgreSQL crosstab. Retrieved from https://www.scien.cx/2025/07/13/understanding-postgresql-crosstab/
Please log in to upload a file.
There are no updates yet.
Click the Upload button above to add an update.