Adaptive Joins in PostgreSQL 16: Exploring the Mimicked Feature

Table of Contents

Introduction

PostgreSQL does not natively offer "Adaptive Hash Joins" as a built-in feature for query execution.
Instead, the query planner selects join methods, such as nested loop, hash join, or merge join, during plan generation.
Additionally, it bases these selections on statistics and cost estimates available at the query planning stage.
Although PostgreSQL's planner efficiently selects join methods, it does not change join strategies adaptively during query execution. Consequently, it cannot adjust to real-time data distribution or size changes, as "Adaptive Hash Joins" might imply.

For more advanced join behaviours or to mimic adaptive join strategies, users often resort to custom solutions, like dynamic SQL in PL/pgSQL functions or application-side logic, to estimate data characteristics at runtime and choose the most appropriate join strategy.

Mimicking Adaptive Joins in PostgreSQL 16

To mimic adaptive joins in PostgreSQL 16, you can create a PL/pgSQL function that dynamically selects the join strategy based on the data size. Here’s a simplified example:

Suppose you have two tables, orders and customers, and you want to join them using either a nested loop or a hash join based on the number of rows.

1. Create a Function

CREATE OR REPLACE FUNCTION adaptive_join() RETURNS SETOF RECORD AS $
DECLARE
    row_count_orders INTEGER;
    row_count_customers INTEGER;
    query TEXT;
BEGIN
    -- Estimate row counts
    SELECT count(*) INTO row_count_orders FROM orders;
    SELECT count(*) INTO row_count_customers FROM customers;

    -- Choose join strategy based on row count
    IF row_count_orders > 1000 OR row_count_customers > 1000 THEN
        query := 'SELECT * FROM orders JOIN customers ON orders.customer_id = customers.id';
    ELSE
        query := 'SELECT * FROM orders, customers WHERE orders.customer_id = customers.id';
    END IF;

    -- Execute the chosen query
    RETURN QUERY EXECUTE query;
END;
$ LANGUAGE plpgsql;

2. Use the Function

SELECT * FROM adaptive_join() AS t(column_definitions);

Conclusion

In this example, the function adaptive_join first estimates the number of rows in each table. If either table has more than 1000 rows, it uses a regular JOIN (which might use a hash join depending on the planner's decision); otherwise, it uses a comma-separated JOIN syntax, which tends to favour nested loop joins. This approach is a rudimentary form of an adaptive join, where the decision is based on row count thresholds.

The WebScale Database Infrastructure Architecture, Engineering and Operations Company

Full-Stack Database Engineering & Cloud DBaaS Solutions for PostgreSQL, MySQL, MongoDB & More | Performance, Scalability, High Availability, Security & Analytics Experts

Adaptive Joins in PostgreSQL 16 (mimicked feature)

Introduction

Mimicking Adaptive Joins in PostgreSQL 16

1. Create a Function

2. Use the Function

Conclusion

Introduction

Mimicking Adaptive Joins in PostgreSQL 16

1. Create a Function

2. Use the Function

Conclusion

Related Articles

Optimizing PostgreSQL Capacity Planning and Sizing with Default Value Settings

Understanding PostgreSQL I/O Details

“PostgreSQL rocks, except when it blocks”: Understanding Locks in PostgreSQL