Adaptive Joins in PostgreSQL 16 (mimicked feature)

Introduction

PostgreSQL does not natively support "Adaptive Hash Joins" as a built-in feature. In PostgreSQL, the query planner chooses join methods such as nested loop, hash join, and merge join based on the statistics and cost estimates available when the query plan is generated. While PostgreSQL's planner is quite advanced in selecting the most efficient join method based on the available data, it does not adaptively change the join strategy during query execution based on real-time data distribution or size, as the concept of an "Adaptive Hash Join" might suggest.

For more advanced join behaviours or to mimic adaptive join strategies, users often resort to custom solutions, like dynamic SQL in PL/pgSQL functions or application-side logic, to estimate data characteristics at runtime and choose the most appropriate join strategy.

Mimicking Adaptive Joins in PostgreSQL 16

To mimic adaptive joins in PostgreSQL 16, you can create a PL/pgSQL function that dynamically selects the join strategy based on the data size. Here’s a simplified example:

Suppose you have two tables, orders and customers, and you want to join them using either a nested loop or a hash join based on the number of rows.

1. Create a Function

2. Use the Function

Conclusion

In this example, the function adaptive_join first estimates the number of rows in each table. If either table has more than 1000 rows, it uses a regular JOIN (which might use a hash join depending on the planner's decision); otherwise, it uses a comma-separated JOIN syntax, which tends to favour nested loop joins. This approach is a rudimentary form of an adaptive join, where the decision is based on row count thresholds.

About Shiv Iyer 446 Articles
Open Source Database Systems Engineer with a deep understanding of Optimizer Internals, Performance Engineering, Scalability and Data SRE. Shiv currently is the Founder, Investor, Board Member and CEO of multiple Database Systems Infrastructure Operations companies in the Transaction Processing Computing and ColumnStores ecosystem. He is also a frequent speaker in open source software conferences globally.