What do we look for in a PostgreSQL execution plan?

PostgreSQL execution plans provide information on how the database intends to execute a given query. Understanding execution plans is crucial to identify performance issues and optimize query performance. Here are some of the things to look for in a PostgreSQL execution plan:

  1. Scan types: The execution plan shows what type of scan the database is performing, such as a sequential scan, index scan, or bitmap index scan. Sequential scans can be slow for large tables, while index scans are usually faster.
  2. Index usage: The execution plan shows whether the database is using an index to perform the query. If not, it may be worth adding an index to improve query performance.
  3. Join types: The execution plan shows the type of join being used, such as nested loop, hash join, or merge join. Different join types have different performance characteristics, and it’s important to choose the right join type for the query.
  4. Sorts and Hashes: The execution plan shows whether the database is using sort or hash operations to perform the query. Sort operations can be expensive for large datasets, so it’s important to minimize them where possible.
  5. Parallelism: The execution plan shows whether the database is using parallel query processing to perform the query. Parallelism can speed up queries on multi-core machines, but it also has overhead and can be inefficient for small queries.
  6. Cost estimates: The execution plan shows cost estimates for each operation, which can help identify performance bottlenecks and optimize the query.

Here’s an example of an execution plan and what to look for:

EXPLAIN ANALYZE SELECT * FROM orders WHERE order_date >= ‘2022-01-01’;
Seq Scan on orders (cost=0.00..41.84 rows=211 width=31) (actual time=0.015..0.017 rows=20 loops=1)
Filter: (order_date >= ‘2022-01-01’::date)
Rows Removed by Filter: 980
Planning time: 0.080 ms
Execution time: 0.038 ms

In this execution plan, the database is performing a sequential scan on the orders table to retrieve rows where the order_date is greater than or equal to ‘2022-01-01’. The query is expected to return 211 rows based on cost estimates, but only 20 rows were returned in actual execution. The Filter line shows the filter condition, which can be useful for identifying queries that are not using indexes efficiently. The Planning time and Execution time lines show the time taken to plan and execute the query, respectively.

By analyzing the execution plan, you can identify potential performance issues and optimize queries accordingly.

About Shiv Iyer 452 Articles
Open Source Database Systems Engineer with a deep understanding of Optimizer Internals, Performance Engineering, Scalability and Data SRE. Shiv currently is the Founder, Investor, Board Member and CEO of multiple Database Systems Infrastructure Operations companies in the Transaction Processing Computing and ColumnStores ecosystem. He is also a frequent speaker in open source software conferences globally.