Understanding Shared Buffers Implementation in PostgreSQL

In PostgreSQL, shared buffers serve as the shared memory pool used to cache data pages in memory to improve read performance. Whenever a query needs to access data, it checks if the required pages are already in the shared buffers. If they are, the data is returned from the buffer cache without any disk access, which significantly improves the query performance.

The shared buffers are implemented as a fixed-size array of memory pages, where each page is typically 8KB in size. When a new server process starts, it allocates a certain number of shared buffers in memory based on the shared_buffers configuration parameter. This parameter specifies the amount of memory that should be used for the shared buffer cache.

Whenever a query accesses a table or index, PostgreSQL first checks if the required pages are already in the shared buffers. If they are not, PostgreSQL reads the pages from disk and adds them to the shared buffer cache. If there is not enough space in the buffer cache, the least recently used pages are evicted from the buffer cache to make room for new pages.

To minimize the number of disk reads and writes, PostgreSQL uses a write-ahead logging (WAL) system to log all changes made to the database. The WAL records are written to disk sequentially, and changes are applied to the database asynchronously in the background. This ensures that even if the server crashes or loses power, it can recover to a consistent state by replaying the WAL records.

In addition to the shared buffer cache, PostgreSQL also has a separate cache for the execution plan of queries, called the “plan cache”. This cache is used to store the query plans generated by the query planner for faster execution in subsequent queries.

The plan cache is implemented using the pg_stat_plans view, which stores statistics about the execution of each query plan. Whenever a query is executed, PostgreSQL checks if a cached plan exists for the query. If a cached plan exists and is still valid, PostgreSQL reuses the plan instead of generating a new one, which can significantly improve query performance.Overall, the shared buffer cache and plan cache are critical components of PostgreSQL’s performance optimization. By caching frequently accessed data pages and query plans in memory, PostgreSQL can avoid costly disk I/O and provide faster query performance.

About Shiv Iyer 446 Articles
Open Source Database Systems Engineer with a deep understanding of Optimizer Internals, Performance Engineering, Scalability and Data SRE. Shiv currently is the Founder, Investor, Board Member and CEO of multiple Database Systems Infrastructure Operations companies in the Transaction Processing Computing and ColumnStores ecosystem. He is also a frequent speaker in open source software conferences globally.