Understanding PostgreSQL crosstab

Usually piece of data in SQL is represented as a row in a table. Often it’s convenient to represent it as a cell in pivot table. Crosstab could help us in it.

Let’s see how it works.

Preparing data

First of all let’s prepare data. Here I c…


This content originally appeared on DEV Community and was authored by Marat Latypov

Usually piece of data in SQL is represented as a row in a table. Often it's convenient to represent it as a cell in pivot table. Crosstab could help us in it.

Let's see how it works.

Preparing data

First of all let's prepare data. Here I created tables a and b, and so-called join table c, which is related to a and b.

create sequence myseq start 1;
create table a
(
    id   bigint default nextval('myseq'::regclass) not null primary key,
    name text NOT NULL
);

create table b
(
    id   bigint default nextval('myseq'::regclass) not null primary key,
    name text NOT NULL
);

create table c
(
    a_id bigint not null references a(id),
    b_id bigint not null references b(id),
    value text
);

Now let's seed tables with data.

insert into a (name) values ('a1'), ('a2'), ('a3');
-- Result:
-- +--+----+
-- |id|name|
-- +--+----+
-- |1 |a1  |
-- |2 |a2  |
-- |3 |a3  |
-- +--+----+

insert into b (name) values ('b1'), ('b2'), ('b3');
-- Result:
-- +--+----+
-- |id|name|
-- +--+----+
-- |4 |b1  |
-- |5 |b2  |
-- |6 |b3  |
-- +--+----+

I used one shared increment sequence for both tables, which is why the id in table b starts from 4 instead of 1.

Just keep this in mind.

Next let's populate the join table c.

insert into c (a_id, b_id, value) values
(1, 4, 'v14'),
(1, 5, 'v15'),
(1, 6, 'v16'),

(2, 4, 'v24'),
(2, 5, 'v25'),
(2, 6, 'v26'),

(3, 4, 'v34'),
(3, 5, 'v35'),
(3, 6, 'v36');

Case #0 Cartesian product

Let's make a simple inner join.

select 
    a.name as a_name, 
    b.name as b_name, 
    c.value
from c
inner join a on a.id = c.a_id
inner join b on b.id = c.b_id
order by 1, 2;
-- Result:
-- +------+------+-----+
-- |a_name|b_name|value|
-- +------+------+-----+
-- |a1    |b1    |v14  |
-- |a1    |b2    |v15  |
-- |a1    |b3    |v16  |
-- |a2    |b1    |v24  |
-- |a2    |b2    |v25  |
-- |a2    |b3    |v26  |
-- |a3    |b1    |v34  |
-- |a3    |b2    |v35  |
-- |a3    |b3    |v36  |
-- +------+------+-----+

In the output result we see all the data in rows. Each row is a piece of information.

There is nothing special here

Case #1 Our pivot table using crosstab

Now let's make a pivot table from it

select *
from crosstab(
    'select
        a.name as a_name,
        b.name as b_name,
        c.value
    from c
    inner join a on a.id = c.a_id
    inner join b on b.id = c.b_id
    order by 1',
   'select name from b order by 1'
) as crosstab_alias (a_name text, b1 text, b2 text, b3 text);

-- Result:
-- +------+---+---+---+
-- |a_name|b1 |b2 |b3 |
-- +------+---+---+---+
-- |a1    |v14|v15|v16|
-- |a2    |v24|v25|v26|
-- |a3    |v34|v35|v36|
-- +------+---+---+---+

Wow! This result looks much better!

Let's look at the query

crosstab(text source_sql, text category_sql)

Crosstab function produces table and have 2 parameters : source_sql and category_sql.

  • source_sql - SQL used to retrieve the source data
  • category_sql - SQL used to retrieve categories

Source data has some restrictions:

  • The first column must be the row identifier (in our example a_name). It should be unique for each row in pivot table.
  • After row identifier column could be some extra fields. In our example we don't have them.
  • The next mandatory column is the category identifier (in our example b_name). It should be unique for each column in pivot table.
  • The last mandatory column is the value (in our example value). It can be any data type.

One more important thing is table alias and column aliases in our example it's the line

as crosstab_alias (a_name text, b1 text, b2 text, b3 text)

The alias crosstab_alias is quite common thing in SELECT queries. It's nothing special — aliases are often assigned to tables in queries for easier reference later on.

But in our case it's vital, because table produced by crosstab should have some name!

The next mandatory thing is column description with types. So it should contain:

  • row identifier (for ex. a_name text)
  • optional extra fields, if you have some
  • categories columns (for ex. b1 text, b2 text, b3 text)

That's it!

Case #2 Not existent data in pivot table

Now let's play with the data. What if we don't have full table?

delete from c; --delete previous data
insert into c (a_id, b_id, value) values 
   (1, 4, 'v14'),
-- (1, 5, 'v15'), removed line
   (1, 6, 'v16'),
   (2, 4, 'v24'),
   (2, 5, 'v25'),
-- (2, 6, 'v26'), removed line
-- (3, 4, 'v34'), removed line
   (3, 5, 'v35'),
   (3, 6, 'v36');

When we run our previous crosstab query on such data, we will get:

-- Result:
-- +------+----+----+----+
-- |a_name|b1  |b2  |b3  |
-- +------+----+----+----+
-- |a1    |v14 |null|v16 |
-- |a2    |v24 |v25 |null|
-- |a3    |null|v35 |v36 |
-- +------+----+----+----+

So we see now empty slots with no data are replaced with nulls!

Looks great for me!

Case #3 Duplicated data in pivot table

delete from c; --delete previous data if you need
insert into c (a_id, b_id, value) values
       (1, 4, 'v14'),
       (1, 5, 'v15'),
       (1, 6, 'v16'),
       (1, 6, 'v16 duplicate'),
       (2, 4, 'v24'),
       (2, 5, 'v25'),
       (2, 6, 'v26'),
       (3, 4, 'v34'),
       (3, 5, 'v35'),
       (3, 6, 'v36');

What does mean this duplicated data entry in our case, it looks like a nonsense.

But assume we have something like that and let's see what will happen.

Our crosstab query will return result:

-- Result:
-- +------+---+---+-------------+
-- |a_name|b1 |b2 |b3           |
-- +------+---+---+-------------+
-- |a1    |v14|v15|v16 duplicate|
-- |a2    |v24|v25|v26          |
-- |a3    |v34|v35|v36          |
-- +------+---+---+-------------+

Ouch! From two data entries for relation a1-b3 only the latest is shown. Why the latest? No idea, I don't know, it's on PostgreSQL implementation. Maybe it uses the latest value, maybe chooses randomly.
Anyway it sounds bad.

You shouldn't have any duplicated entries in you table. Maybe entries should be aggregated somehow.

Links

Here is a link to the code used in this article:


This content originally appeared on DEV Community and was authored by Marat Latypov


Print Share Comment Cite Upload Translate Updates
APA

Marat Latypov | Sciencx (2025-07-13T16:45:20+00:00) Understanding PostgreSQL crosstab. Retrieved from https://www.scien.cx/2025/07/13/understanding-postgresql-crosstab/

MLA
" » Understanding PostgreSQL crosstab." Marat Latypov | Sciencx - Sunday July 13, 2025, https://www.scien.cx/2025/07/13/understanding-postgresql-crosstab/
HARVARD
Marat Latypov | Sciencx Sunday July 13, 2025 » Understanding PostgreSQL crosstab., viewed ,<https://www.scien.cx/2025/07/13/understanding-postgresql-crosstab/>
VANCOUVER
Marat Latypov | Sciencx - » Understanding PostgreSQL crosstab. [Internet]. [Accessed ]. Available from: https://www.scien.cx/2025/07/13/understanding-postgresql-crosstab/
CHICAGO
" » Understanding PostgreSQL crosstab." Marat Latypov | Sciencx - Accessed . https://www.scien.cx/2025/07/13/understanding-postgresql-crosstab/
IEEE
" » Understanding PostgreSQL crosstab." Marat Latypov | Sciencx [Online]. Available: https://www.scien.cx/2025/07/13/understanding-postgresql-crosstab/. [Accessed: ]
rf:citation
» Understanding PostgreSQL crosstab | Marat Latypov | Sciencx | https://www.scien.cx/2025/07/13/understanding-postgresql-crosstab/ |

Please log in to upload a file.




There are no updates yet.
Click the Upload button above to add an update.

You must be logged in to translate posts. Please log in or register.