How to implement Partitioned Bitmaps in PostgreSQL?

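Partitioned Bitmaps are a technique for speeding up queries on large tables by splitting the data into smaller pieces, or partitions, that can be searched independently, and using bitmap scans to locate the matching rows within each partition. PostgreSQL has no on-disk bitmap index type. Instead, it supports this approach by combining declarative table partitioning with ordinary indexes (such as B-tree), from which the planner builds in-memory bitmaps at query time.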

Here are the steps to implement Partitioned Bitmaps in PostgreSQL:

  1. Partition the table: Divide the large table into smaller partitions using a partitioning strategy that makes sense for the data being stored. PostgreSQL supports various partitioning strategies such as range, list, and hash partitioning.
  2. Create indexes on the partitioned table: Create indexes on each partition of the table to speed up search operations. PostgreSQL has no dedicated bitmap index type; ordinary indexes such as B-tree are enough, because the planner can turn the matches from one or more indexes into in-memory bitmaps and combine them at query time.
  3. Use bitmap scans on the partitions: Let the planner search each partition with a Bitmap Index Scan followed by a Bitmap Heap Scan. The bitmap records which heap pages and rows in that partition match the condition, so the heap is read in physical order and each page is visited at most once. Partition pruning, not the bitmap, determines which partitions need to be searched at all.
  4. Analyze the query performance: Analyze the performance of the queries to determine whether the partitioning and indexing strategy is effective. Adjust the partitioning and indexing strategy as necessary to improve query performance.
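
As a rough sketch of step 1, the three partitioning strategies use the declarative syntax below. The table and column names (events_by_day, events_by_region, events_by_user and their partitions) are hypothetical and chosen only for illustration; hash partitioning assumes PostgreSQL 11 or later.

```sql
-- Hypothetical tables illustrating the three declarative partitioning strategies.

-- Range: each partition holds a contiguous interval of the key.
CREATE TABLE events_by_day (
    event_id   bigint NOT NULL,
    created_on date   NOT NULL
) PARTITION BY RANGE (created_on);

CREATE TABLE events_2022 PARTITION OF events_by_day
    FOR VALUES FROM ('2022-01-01') TO ('2023-01-01');  -- upper bound is exclusive

-- List: each partition holds an explicit set of key values.
CREATE TABLE events_by_region (
    event_id bigint NOT NULL,
    region   text   NOT NULL
) PARTITION BY LIST (region);

CREATE TABLE events_emea PARTITION OF events_by_region
    FOR VALUES IN ('Europe', 'Africa');

-- Hash: rows are spread across partitions by a hash of the key.
CREATE TABLE events_by_user (
    event_id bigint NOT NULL,
    user_id  bigint NOT NULL
) PARTITION BY HASH (user_id);

-- One of four hash partitions; remainders 1 to 3 would be created the same way.
CREATE TABLE events_by_user_p0 PARTITION OF events_by_user
    FOR VALUES WITH (MODULUS 4, REMAINDER 0);
```

Range partitioning tends to suit time-based data, list partitioning suits a small set of known categories, and hash partitioning suits keys with no natural grouping.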

Partitioned Bitmaps can significantly improve query performance on large tables, especially for queries that filter on the partition key together with one or more indexed columns. By breaking the data into smaller partitions and using bitmap scans, PostgreSQL can skip irrelevant partitions and read only the matching pages of the remaining ones.

Example for Partitioned Bitmaps in PostgreSQL with output and explain plan

Here is an example of how to use Partitioned Bitmaps in PostgreSQL:

Assume we have a large table, sales, that stores sales data for different regions across the world. We want to retrieve the total sales across all regions for a specific year. The sales table is list-partitioned by region, and each partition has an index on year that the planner can use for bitmap scans.

The schema for the sales table is as follows:

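CREATE TABLE sales (
id SERIAL,
region TEXT NOT NULL,
year INT,
sales_amt NUMERIC(10, 2),
-- The primary key of a partitioned table must include the partition key.
PRIMARY KEY (id, region)
) PARTITION BY LIST (region); -- required before partitions can be attached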

We partition the sales table by region by creating one partition per region with the following commands:

CREATE TABLE sales_usa PARTITION OF sales FOR VALUES IN ('USA');
CREATE TABLE sales_europe PARTITION OF sales FOR VALUES IN ('Europe');
CREATE TABLE sales_asia PARTITION OF sales FOR VALUES IN ('Asia');

We then create a B-tree index on year in each partition of the sales table. PostgreSQL has no bitmap index access method, so the planner builds its bitmaps from these ordinary indexes at query time:

CREATE INDEX sales_usa_bitmap ON sales_usa USING btree (year);
CREATE INDEX sales_europe_bitmap ON sales_europe USING btree (year);
CREATE INDEX sales_asia_bitmap ON sales_asia USING btree (year);
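
As an alternative to indexing each partition by hand, an index can be declared once on the partitioned parent. This is a minimal sketch that assumes PostgreSQL 11 or later and that sales is declared with PARTITION BY LIST (region); the index name sales_year_idx is hypothetical.

```sql
-- Declaring the index on the parent creates a matching index on every
-- existing partition and on partitions attached later.
CREATE INDEX sales_year_idx ON sales (year);

-- List the indexes that now exist on the parent and its partitions.
SELECT tablename, indexname
FROM pg_indexes
WHERE tablename LIKE 'sales%'
ORDER BY tablename, indexname;
```

The per-partition form is still useful when only some partitions are queried heavily enough to justify an index.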

To retrieve the total sales for all regions for a specific year, we can use the following query:

EXPLAIN SELECT year, sum(sales_amt)
FROM sales
WHERE year = 2022
GROUP BY year;

The output of the EXPLAIN command shows how PostgreSQL plans to read each partition, including a bitmap scan on sales_europe:

HashAggregate (cost=14.02..14.03 rows=1 width=12)
  Group Key: year
  -> Append (cost=0.00..12.85 rows=3 width=12)
        -> Seq Scan on sales_usa (cost=0.00..4.28 rows=1 width=12)
              Filter: (year = 2022)
        -> Bitmap Heap Scan on sales_europe (cost=2.45..4.57 rows=1 width=12)
              Recheck Cond: (year = 2022)
              -> Bitmap Index Scan on sales_europe_bitmap (cost=0.00..2.45 rows=1 width=0)
                    Index Cond: (year = 2022)
        -> Seq Scan on sales_asia (cost=0.00..4.00 rows=1 width=12)
              Filter: (year = 2022)

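The output shows that PostgreSQL scans each partition separately and picks an access method per partition: a Bitmap Index Scan feeding a Bitmap Heap Scan on sales_europe, and sequential scans on sales_usa and sales_asia, where the planner estimated a full scan to be cheaper. Because the query does not filter on region, no partition is pruned. The Append node concatenates the per-partition results, which are then combined using a hash aggregation.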
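
To see partition pruning in addition to the bitmap scan, the partition key can be added to the filter. This is only a sketch against the sales table above; the plan the planner actually chooses depends on table size and statistics, so no output is shown.

```sql
-- Filtering on the partition key (region) lets the planner skip partitions
-- that cannot contain matching rows.
EXPLAIN (COSTS OFF)
SELECT year, sum(sales_amt)
FROM sales
WHERE region = 'Europe'
  AND year = 2022
GROUP BY year;
```

With pruning enabled (the enable_partition_pruning setting, on by default), the resulting plan should reference only sales_europe.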

Partitioned Bitmaps allow PostgreSQL to retrieve data more efficiently by scanning only the relevant partitions and using bitmap scans to quickly locate the matching rows within them. This can significantly improve query performance on large tables.

How do Partitioned Bitmaps help PostgreSQL performance?

Partitioned Bitmaps are a technique in PostgreSQL that can help improve query performance by allowing the database to efficiently filter and retrieve data from large tables that have been partitioned.

Partitioned Bitmaps work by indexing each partition of a table and letting PostgreSQL build bitmaps from those indexes at query time. A Bitmap Index Scan reads the index and sets a bit for every row location that matches the condition. A Bitmap Heap Scan then visits the marked heap pages in physical order and fetches the matching rows. When a query filters on several indexed columns, the bitmaps from the individual indexes can be combined with BitmapAnd or BitmapOr before the heap is touched, so the database identifies the matching rows without scanning the entire partition.
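
The bitmap mechanism can be observed on a small test table. The sketch below uses a hypothetical table, bitmap_demo, with made-up columns a and b; whether the planner actually chooses BitmapAnd or BitmapOr depends on statistics and cost settings, so treat the comments as what may appear rather than guaranteed output.

```sql
-- Hypothetical demo table; names and data are illustrative only.
CREATE TABLE bitmap_demo AS
SELECT g AS id, g % 100 AS a, g % 250 AS b
FROM generate_series(1, 1000000) AS g;

CREATE INDEX bitmap_demo_a_idx ON bitmap_demo (a);
CREATE INDEX bitmap_demo_b_idx ON bitmap_demo (b);
ANALYZE bitmap_demo;

-- With conditions on two separately indexed columns, the planner may build
-- one bitmap per index and intersect them (BitmapAnd).
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*) FROM bitmap_demo WHERE a = 7 AND b = 7;

-- An OR across the two columns may instead produce a union (BitmapOr).
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*) FROM bitmap_demo WHERE a = 7 OR b = 7;
```

In the EXPLAIN ANALYZE output, the Heap Blocks line of a Bitmap Heap Scan reports exact and lossy page counts; lossy pages indicate that the bitmap did not fit in work_mem and rows had to be rechecked.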

When a query is executed on a partitioned table, PostgreSQL can use partition pruning to determine which partitions can contain the relevant data, and only scan those partitions. Bitmap scans on the per-partition indexes can then be used to quickly filter the data in each remaining partition, further improving query performance.

Partitioned Bitmaps are particularly useful for large tables that are frequently queried for a specific set of values in a column. By indexing that column in each partition, queries can be answered with bitmap scans over a few partitions instead of a scan of the whole table, resulting in faster query response times.

It is important to note that the per-partition indexes come with some overhead, particularly during inserts, updates, and deletes. Every index must be maintained when data changes, which can increase write times. The indexes also consume disk space in every partition, while the bitmaps themselves are built in memory and become lossy (tracking whole pages rather than individual rows) when they exceed work_mem, so it is important to carefully consider the trade-offs before adding indexes.
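
One way to gauge the space side of this trade-off is to compare table and index sizes per partition. This is a minimal sketch that assumes the partitioned sales table from the example above.

```sql
-- Table size (including TOAST) versus total index size for each
-- direct partition of sales.
SELECT c.relname                              AS partition_name,
       pg_size_pretty(pg_table_size(c.oid))   AS table_size,
       pg_size_pretty(pg_indexes_size(c.oid)) AS indexes_size
FROM pg_inherits i
JOIN pg_class c ON c.oid = i.inhrelid
WHERE i.inhparent = 'sales'::regclass
ORDER BY c.relname;
```

A partition whose indexes are larger than its data, yet are rarely scanned, is a reasonable candidate for review.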

Overall, Partitioned Bitmaps can be a useful tool for improving the performance of large PostgreSQL databases, particularly when queries frequently filter data based on a specific column value.

Monitoring Partitioned Bitmaps in PostgreSQL

SELECT s.relname, s.indexrelname, s.indexrelid::regclass, t.heap_blks_read, io.idx_blks_read,
io.idx_blks_hit, io.idx_blks_read::numeric / nullif(io.idx_blks_read + io.idx_blks_hit, 0) AS read_ratio,
s.idx_scan, s.idx_tup_read, s.idx_tup_fetch,
s.idx_tup_read::numeric / nullif(s.idx_scan, 0) AS rows_per_scan,
pg_relation_size(s.indexrelid) AS index_size
FROM pg_index i
JOIN pg_class c ON c.oid = i.indexrelid
JOIN pg_stat_all_indexes s ON s.indexrelid = i.indexrelid
-- block-level I/O counters live in the pg_statio_* views
JOIN pg_statio_all_indexes io ON io.indexrelid = i.indexrelid
JOIN pg_statio_all_tables t ON t.relid = i.indrelid
WHERE i.indisclustered = false -- only consider non-clustered indexes
AND c.relkind = 'i' -- only consider regular (non-partitioned) indexes
AND c.relname NOT LIKE 'pg\_%' -- exclude system indexes; the underscore is escaped because it is a LIKE wildcard
ORDER BY s.relname, s.indexrelname;

This script retrieves statistics for every non-system index in a PostgreSQL database, including the indexes on individual partitions: the name of the indexed table, the name of the index, the number of disk blocks read from the table and the index, the fraction of index block requests that had to be read from outside shared buffers, the number of index scans, the number of index tuples read and fetched, and the size of the index. PostgreSQL does not keep a separate counter for bitmap scans; they are counted in idx_scan and idx_tup_read, while idx_tup_fetch in this view counts only rows fetched by plain index scans.

You can run this script in a PostgreSQL client such as psql or pgAdmin to view information about the indexes that back bitmap scans on your partitions. This can be useful for monitoring the performance and efficiency of your database queries, as well as identifying unused or oversized indexes.
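
As a complementary check, table-level statistics show how often each partition is read sequentially versus through an index. This sketch assumes the partitions are named with the sales prefix, as in the example above.

```sql
-- Sequential versus index-driven access per partition.
SELECT relname, seq_scan, seq_tup_read, idx_scan, idx_tup_fetch
FROM pg_stat_user_tables
WHERE relname LIKE 'sales%'
ORDER BY relname;
```

A partition with a high seq_scan count and a low idx_scan count may indicate that its index is not selective enough for the planner to use, or that the table is small enough for a sequential scan to be cheaper.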

About the author: Shiv Iyer