Index Wildcard in PostgreSQL

Index wildcard, as used here, means building a partial index in PostgreSQL: an index that covers only the rows matching a condition, such as a column value starting with a given substring, rather than every row in the table. This can improve query performance when searches repeatedly target that subset of rows.

For example, suppose you have a table customers with columns id, first_name, and last_name, and you frequently need to search for customers by their last name. In this case, you can create a partial index on the last_name column using the index wildcard technique.

To create an index using the index wildcard technique, add a WHERE clause to CREATE INDEX that uses the LIKE operator with a wildcard character (%) to match any string starting with a given substring. For example, to create an index on the last_name column for all values starting with “Smi”, you can use the following SQL statement:

CREATE INDEX idx_customers_last_name ON customers (last_name) WHERE last_name LIKE 'Smi%';

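A query that filters on last_name LIKE 'Smi%' can then produce a plan along these lines:

QUERY PLAN
-----------------------------------------------------------------------------------
Bitmap Heap Scan on customers (cost=10.00..160.00 rows=50 width=10)
  Recheck Cond: ((last_name)::text ~~ 'Smi%'::text)
  -> Bitmap Index Scan on idx_customers_last_name (cost=0.00..10.00 rows=50 width=0)
       Index Cond: ((last_name)::text ~~ 'Smi%'::text)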

This statement indexes only the rows whose last_name starts with “Smi”. A query can use the index when its WHERE clause implies the index predicate. The planner proves only simple implications, so in practice the query should contain the same last_name LIKE 'Smi%' condition. For such queries the smaller index can improve performance.
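One way to check whether a query can use the partial index is to compare plans with EXPLAIN. The sketch below is illustrative only: the table definition and the generated sample rows are assumptions, and only the index definition comes from the text above. Actual plans depend on table size, statistics, and configuration, so no expected output is shown.

```sql
-- Hypothetical table with the columns named in the article.
CREATE TABLE IF NOT EXISTS customers (
    id         bigserial PRIMARY KEY,
    first_name text,
    last_name  text
);

-- Made-up sample rows: 1 in 100 has the last name 'Smith'.
INSERT INTO customers (first_name, last_name)
SELECT 'First' || g::text,
       CASE WHEN g % 100 = 0 THEN 'Smith' ELSE 'Name' || g::text END
FROM generate_series(1, 100000) AS g;

-- Partial index from the article: only rows whose last_name starts with 'Smi'.
CREATE INDEX IF NOT EXISTS idx_customers_last_name
    ON customers (last_name)
    WHERE last_name LIKE 'Smi%';

-- Refresh planner statistics after the bulk insert.
ANALYZE customers;

-- The WHERE clause repeats the index predicate exactly,
-- so the planner can consider idx_customers_last_name.
EXPLAIN SELECT * FROM customers WHERE last_name LIKE 'Smi%';

-- A different pattern on its own is not recognized as implying the predicate,
-- so the partial index is not eligible here.
EXPLAIN SELECT * FROM customers WHERE last_name LIKE 'Smith%';

-- Repeating the index predicate next to the narrower condition
-- keeps the partial index eligible.
EXPLAIN SELECT * FROM customers
WHERE last_name LIKE 'Smi%' AND last_name = 'Smith';
```

If the first and third plans mention idx_customers_last_name and the second does not, the predicate matching is behaving as described.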

Another example use case for index wildcard is a table that contains log messages with timestamps. You might want to search for log messages within a specific time range. In this case, you can create a partial index on the timestamp column for log messages within that range.

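CREATE INDEX idx_logs_timestamp ON logs (timestamp) WHERE timestamp >= '2022-07-01' AND timestamp < '2022-08-01';

The predicate is written as a half-open range so that all of July 31 is included; an upper bound of '2022-07-31' would stop at midnight at the start of that day. CREATE INDEX is a utility statement and does not return a plan of its own. A query restricted to the same range can produce a plan along these lines:

QUERY PLAN
-----------------------------------------------------------------------------------
Index Scan using idx_logs_timestamp on logs (cost=0.00..22.95 rows=1075 width=0)
  Index Cond: ((timestamp >= '2022-07-01 00:00:00'::timestamp without time zone) AND (timestamp < '2022-08-01 00:00:00'::timestamp without time zone))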

This statement indexes only log messages with timestamps in July 2022. Queries whose time range falls inside that month can use the index, which can improve query performance. Queries that reach outside the indexed range cannot use it.
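The range behaviour can be explored with a small experiment. This is a sketch under stated assumptions: the logs table layout, the message column, and the generated rows are all hypothetical, and the index here uses a half-open range covering July 2022. The column name is double-quoted only to keep it visually distinct from the timestamp type. Plans vary by installation, so no output is shown.

```sql
-- Hypothetical logs table; only the "timestamp" column name comes from the article.
CREATE TABLE IF NOT EXISTS logs (
    id          bigserial PRIMARY KEY,
    "timestamp" timestamp NOT NULL,
    message     text
);

-- Made-up sample data: one row per minute from June through August 2022.
INSERT INTO logs ("timestamp", message)
SELECT ts, 'sample message'
FROM generate_series('2022-06-01'::timestamp,
                     '2022-08-31'::timestamp,
                     interval '1 minute') AS ts;

-- Partial index limited to July 2022, written as a half-open range.
CREATE INDEX IF NOT EXISTS idx_logs_timestamp
    ON logs ("timestamp")
    WHERE "timestamp" >= '2022-07-01' AND "timestamp" < '2022-08-01';

-- Refresh planner statistics after the bulk insert.
ANALYZE logs;

-- A range inside July implies the index predicate,
-- so the planner can consider idx_logs_timestamp.
EXPLAIN SELECT * FROM logs
WHERE "timestamp" >= '2022-07-10' AND "timestamp" < '2022-07-11';

-- A range in August falls outside the predicate,
-- so the partial index cannot be used for this query.
EXPLAIN SELECT * FROM logs
WHERE "timestamp" >= '2022-08-05' AND "timestamp" < '2022-08-06';
```

The planner can prove simple inequality implications like these, which is why a sub-range of July qualifies even though it does not repeat the index predicate word for word.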

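It’s important to note that a partial index, like any index, takes up storage and adds work to write operations that touch the indexed rows. Because it covers only the rows matching its predicate, it is smaller and cheaper to maintain than a full index on the same column. The trade-off is that the predicate is fixed when the index is created, and queries outside it cannot use the index. Therefore, it’s recommended to use this technique for a well-defined subset of rows that is searched frequently.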
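To see the storage cost in practice, the partial index can be compared with a full index on the same column. The sketch below is an illustration only: the table definition and the idx_customers_last_name_full index are hypothetical and exist purely for the comparison. The sizes and scan counts reported depend entirely on the data, so none are shown.

```sql
-- Hypothetical table and partial index, created only if they are not already present.
CREATE TABLE IF NOT EXISTS customers (
    id         bigserial PRIMARY KEY,
    first_name text,
    last_name  text
);

CREATE INDEX IF NOT EXISTS idx_customers_last_name
    ON customers (last_name)
    WHERE last_name LIKE 'Smi%';

-- Hypothetical full index on the same column, built only for comparison.
CREATE INDEX IF NOT EXISTS idx_customers_last_name_full
    ON customers (last_name);

-- On-disk size and number of scans recorded for each index.
SELECT indexrelname                                 AS index_name,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,
       idx_scan                                     AS scans
FROM pg_stat_user_indexes
WHERE relname = 'customers'
  AND indexrelname IN ('idx_customers_last_name',
                       'idx_customers_last_name_full')
ORDER BY indexrelname;
```

An index that stays at zero scans over a representative period is a candidate for removal, since it still adds write overhead.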

In summary, index wildcard is a useful technique for optimizing query performance in PostgreSQL by creating partial indexes that cover only the rows matching a condition, such as a substring prefix or a time range. Typical uses include searching for customers by the start of their last name and searching for log messages within a specific time range.

About Shiv Iyer
Open Source Database Systems Engineer with a deep understanding of Optimizer Internals, Performance Engineering, Scalability and Data SRE. Shiv is currently the Founder, Investor, Board Member and CEO of multiple Database Systems Infrastructure Operations companies in the Transaction Processing Computing and ColumnStores ecosystem. He is also a frequent speaker at open source software conferences globally.