Implementing the Materialized Path Model in PostgreSQL: A Step-by-Step Guide

The Materialized Path model is a strategy for representing hierarchical data within a relational database system like PostgreSQL. It involves storing the entire path to a node within a tree as a single column value in each row. This method is particularly effective for representing data with a hierarchical structure, such as categories, organizational structures, or file systems, within a flat table. The key advantage of the Materialized Path model is its simplicity and efficiency in querying hierarchical data with a single query, especially for retrieving ancestors or descendants of a node.

Implementation Steps

Step 1: Designing the Table

First, create a table to hold the hierarchical data. For this example, let’s model a simple category structure for products:


CREATE TABLE categories (
    category_id SERIAL PRIMARY KEY,
    category_name VARCHAR(255) NOT NULL,
    path TEXT NOT NULL,
    CONSTRAINT path_unique UNIQUE (path)
);

Here, path stores the materialized path for each category. A common convention is to use a delimiter, such as ‘/’, to separate IDs in the path string.

Step 2: Inserting Data

Insert data into the table, constructing the path value to represent each category’s position in the hierarchy. For a root category, the path might simply be its own category_id, while subcategories would append their category_id to their parent’s path.


-- Insert a root category
INSERT INTO categories (category_name, path) VALUES ('Electronics', '1/');

-- Insert a subcategory under 'Electronics'
INSERT INTO categories (category_name, path) VALUES ('Laptops', '1/2/');

Step 3: Querying the Data

  • Fetching All Ancestors of a Node

To get all ancestors of a category, you can use the LIKE operator:


SELECT * FROM categories WHERE '1/2/' LIKE path || '%';
  • Fetching All Descendants of a Node

To get all descendants (children, grandchildren, etc.) of a category:


SELECT * FROM categories WHERE path LIKE '1/%';

Example: Representing a Simple Product Hierarchy

Consider a hierarchy where “Electronics” is a root category, with “Laptops” and “Smartphones” as subcategories. Under “Laptops”, there might be “Gaming” and “Ultrabooks” as further subcategories.


-- Root category
INSERT INTO categories (category_name, path) VALUES ('Electronics', '1/');

-- Direct child categories
INSERT INTO categories (category_name, path) VALUES ('Laptops', '1/2/');
INSERT INTO categories (category_name, path) VALUES ('Smartphones', '1/3/');

-- Subcategories under 'Laptops'
INSERT INTO categories (category_name, path) VALUES ('Gaming', '1/2/4/');
INSERT INTO categories (category_name, path) VALUES ('Ultrabooks', '1/2/5/');

Benefits and Considerations

  • Performance: The Materialized Path model allows for efficient reads, as both descendants and ancestors can be fetched with simple queries. However, write operations, particularly updates to the hierarchy, can be more complex and may require updating the paths of all descendants.
  • Indexing: Creating indexes on the path column can significantly improve query performance, especially for tables with large hierarchies.
  • Handling Depth: While fetching descendants and ancestors is straightforward, calculating the depth of each node directly from the path can be challenging and may require additional logic or stored procedures.

The Materialized Path model offers a straightforward and efficient method for managing hierarchical data in PostgreSQL, making it a suitable choice for various applications requiring hierarchical data representation.

How Materialised Views work in PostgreSQL?

Understanding the Internal Locking Hierarchy and Mechanisms in PostgreSQL

How to implement optimally recursive queries and hierarchical data in PostgreSQL?

How Many Access Paths are Available to the PostgreSQL Optimizer?

How Many Access Paths are Available to the PostgreSQL Optimizer?

 

About Shiv Iyer 499 Articles
Open Source Database Systems Engineer with a deep understanding of Optimizer Internals, Performance Engineering, Scalability and Data SRE. Shiv currently is the Founder, Investor, Board Member and CEO of multiple Database Systems Infrastructure Operations companies in the Transaction Processing Computing and ColumnStores ecosystem. He is also a frequent speaker in open source software conferences globally.