Understanding PostgreSQL LATERAL JOINs

Understanding PostgreSQL LATERAL JOINs: A Deep Dive



Introduction

PostgreSQL’s LATERAL JOIN is a powerful feature that extends the capabilities of standard SQL joins. While traditional joins evaluate all rows independently, LATERAL joins allow subqueries to reference columns from preceding tables in the FROM clause, enabling row-by-row evaluation similar to correlated subqueries but with more flexibility and cleaner syntax.1

In this blog post, we’ll explore what LATERAL joins are, when to use them, and how they can simplify complex queries.

What is a LATERAL JOIN?

A LATERAL JOIN allows a subquery in the FROM clause to reference columns from tables that appear earlier in the same FROM clause. This creates a dependency where the subquery is evaluated once for each row of the preceding table, much like a correlated subquery or a for-each loop.2

Key Characteristics

  • Row-by-row evaluation: The subquery executes separately for each row from the preceding table
  • Column references: Can access columns from tables listed before it in the FROM clause
  • Flexibility: More versatile than correlated subqueries in many scenarios24

Basic Syntax

SELECT *
FROM table1
LEFT JOIN LATERAL (
    SELECT *
    FROM table2
    WHERE table2.foreign_key = table1.primary_key
    ORDER BY some_column
    LIMIT 5
) AS subquery ON true;

Practical Examples

Example 1: Top N Records Per Group

One of the most common use cases for LATERAL joins is retrieving the top N records for each group.

Scenario: You have an orders table and want to find the 3 most recent orders for each customer.

-- Sample tables
CREATE TABLE customers (
    customer_id INT PRIMARY KEY,
    customer_name VARCHAR(100)
);

CREATE TABLE orders (
    order_id INT PRIMARY KEY,
    customer_id INT,
    order_date DATE,
    amount DECIMAL(10,2)
);

-- Query using LATERAL JOIN
SELECT 
    c.customer_name,
    o.order_id,
    o.order_date,
    o.amount
FROM customers c
LEFT JOIN LATERAL (
    SELECT order_id, order_date, amount
    FROM orders
    WHERE orders.customer_id = c.customer_id
    ORDER BY order_date DESC
    LIMIT 3
) o ON true
ORDER BY c.customer_name, o.order_date DESC;

This query efficiently retrieves the 3 most recent orders for each customer without requiring complex window functions or self-joins.

Example 2: Reusing Calculations

LATERAL joins excel at making queries more readable by allowing you to reuse calculated values.3

-- Without LATERAL (repetitive calculations)
SELECT 
    product_id,
    price * quantity AS total_value,
    (price * quantity) * 0.1 AS tax,
    (price * quantity) + ((price * quantity) * 0.1) AS final_price
FROM order_items;

-- With LATERAL (clean and reusable)
SELECT 
    product_id,
    calc.total_value,
    calc.tax,
    calc.final_price
FROM order_items
CROSS JOIN LATERAL (
    SELECT 
        price * quantity AS total_value,
        (price * quantity) * 0.1 AS tax,
        (price * quantity) + ((price * quantity) * 0.1) AS final_price
) calc;

Example 3: Complex Filtering with Dependencies

Scenario: Find employees and their most recent performance review that meets certain criteria.

SELECT 
    e.employee_id,
    e.employee_name,
    e.department,
    r.review_date,
    r.rating,
    r.comments
FROM employees e
LEFT JOIN LATERAL (
    SELECT 
        review_date,
        rating,
        comments
    FROM performance_reviews pr
    WHERE pr.employee_id = e.employee_id
        AND pr.rating >= 4
        AND pr.review_date >= e.hire_date + INTERVAL '6 months'
    ORDER BY review_date DESC
    LIMIT 1
) r ON true
WHERE e.status = 'active';

When to Use LATERAL JOINs

Ideal Use Cases

  1. Top-N queries per group: Retrieving a limited number of records for each parent record
  2. Complex calculations: When you need to reference calculated columns in subsequent calculations3
  3. Dependent subqueries: When a subquery needs to reference columns from the outer query
  4. Set-returning functions: When working with functions that return multiple rows

Performance Considerations

While LATERAL joins are powerful, they come with performance trade-offs:

  • Per-row evaluation: The subquery executes once for each row, which can be slower than set-based operations5
  • Index dependency: Performance heavily depends on proper indexing of the referenced columns6
  • Simple cases: For basic scenarios, correlated subqueries might be slightly faster4

Best practices:

  • Ensure proper indexes on join columns
  • Use LIMIT clauses when appropriate to reduce result sets
  • Consider alternatives like window functions for simple ranking scenarios
  • Test query plans with EXPLAIN ANALYZE to verify performance

LATERAL vs. Correlated Subqueries

LATERAL joins and correlated subqueries are similar but have key differences:

-- Correlated subquery
SELECT 
    c.customer_name,
    (SELECT COUNT(*) 
     FROM orders o 
     WHERE o.customer_id = c.customer_id) AS order_count
FROM customers c;

-- LATERAL JOIN (more flexible)
SELECT 
    c.customer_name,
    o.order_count,
    o.total_amount
FROM customers c
LEFT JOIN LATERAL (
    SELECT 
        COUNT(*) AS order_count,
        SUM(amount) AS total_amount
    FROM orders
    WHERE orders.customer_id = c.customer_id
) o ON true;

LATERAL joins allow you to return multiple columns and rows, making them more versatile than scalar correlated subqueries.2

Conclusion

PostgreSQL’s LATERAL JOIN is a valuable tool for handling complex query scenarios that require row-by-row evaluation. It shines in situations where you need to:

  • Retrieve top N records per group
  • Reference calculated values in subsequent calculations
  • Create more readable and maintainable queries

While LATERAL joins can impact performance due to their per-row evaluation nature, proper indexing and thoughtful query design can mitigate these concerns. Understanding when and how to use LATERAL joins will expand your PostgreSQL query toolkit and help you write more elegant solutions to complex data retrieval problems.31

Key Takeaway: LATERAL joins bridge the gap between the flexibility of procedural programming and the power of set-based SQL operations, offering a cleaner syntax for scenarios that would otherwise require complex subqueries or multiple query executions.


References

  1. Lateral joins – what are your thoughts? : r/SQL
  2. sql – Postgresql LATERAL vs INNER JOIN – Stack

Further Reading

  1. Next-Gen Data Management
  2. Understanding Database Locking
  3. Tuning Linux Dirty Data Parameters for Vertica
  4. Back Up and Restore a Set of Collections in MongoDB Atlas
  5. Indexing Materialized Views in PostgreSQL

About MinervaDB Corporation 179 Articles
Full-stack Database Infrastructure Architecture, Engineering and Operations Consultative Support(24*7) Provider for PostgreSQL, MySQL, MariaDB, MongoDB, ClickHouse, Trino, SQL Server, Cassandra, CockroachDB, Yugabyte, Couchbase, Redis, Valkey, NoSQL, NewSQL, Databricks, Amazon Resdhift, Amazon Aurora, CloudSQL, Snowflake and AzureSQL with core expertize in Performance, Scalability, High Availability, Database Reliability Engineering, Database Upgrades/Migration, and Data Security.