How is Cache Eviction implemented in InnoDB?

Understanding InnoDB Buffer Pool Management

InnoDB implements cache eviction using an algorithm similar to the Least Recently Used (LRU) algorithm. However, it’s a bit more complex than a simple LRU due to a feature called the “midpoint insertion strategy,” which aims to prevent large, infrequently accessed data sets from flushing out smaller, but frequently accessed, data sets from the buffer pool.

The InnoDB Buffer Pool is divided into two segments, the “old” segment and the “new” segment. This division is controlled by the innodb_old_blocks_pct configuration parameter, which defines the percentage of the total buffer pool to use for the old block list. The remaining part of the buffer pool is used for the new block list.

  1. New Pages: When a page is first loaded into the buffer pool, it is placed at the midpoint (which separates the old and new segments), not at the head which is the most recently used end. This means new pages need to be accessed again to be promoted to the “new” segment.
  2. Old Pages: If a page is loaded in the “old” segment and it is accessed again, it will be moved to the “new” segment. If it is not accessed again, it eventually falls off the end of the “old” list and is evicted.
  3. Midpoint Insertion Strategy: This strategy allows InnoDB to manage the “new” list for very frequently accessed pages, and the “old” list for less frequently accessed pages. The idea is to prevent a large scan (which touches a lot of pages only once) from completely flushing out the buffer pool with pages that won’t be needed again soon.
  4. Configuration Parameters for InnoDB Buffer Pool:

InnoDB provides several configuration parameters that control the behaviour of the buffer pool:

  • innodb_buffer_pool_size: This is the size of the memory buffer InnoDB uses to cache data and indexes of its tables. The larger you set this value, the less disk I/O is needed to access data in tables. On a dedicated database server, you might set this to up to 80% of the machine’s physical memory size.
  • innodb_buffer_pool_instances: For systems with buffer pools in the multi-gigabyte range, dividing the buffer pool into separate instances can improve concurrency, by reducing contention as different threads read and write to cached pages. Each buffer pool manages its own free lists, flush lists, LRUs, and all other data structures connected with it, and is protected by its own buffer pool mutex.
  • innodb_old_blocks_pct: This parameter, as mentioned earlier, determines the percentage of the buffer pool to be reserved for the old block list.
  • innodb_old_blocks_time: This parameter sets the time period in milliseconds that a block inserted into the old sublist must stay there after its first access before it can be moved to the new sublist.

Monitoring and Tuning the Buffer Pool

Monitoring the status of the buffer pool can provide valuable insight into the efficiency of the buffer pool and the database as a whole. You can use the SHOW ENGINE INNODB STATUS command to get statistics about the buffer pool, including the number of pages currently dirty and the rate at which pages are being made dirty and cleaned.

Based on these statistics, you might decide to tune the configuration parameters mentioned above. For example, if you see a high rate of page evictions (meaning that pages are frequently being loaded into the buffer pool only to be evicted shortly thereafter), you might decide to increase the size of the buffer pool, if your system’s RAM capacity allows it.

Remember, InnoDB’s buffer pool management is designed to optimize database performance by minimizing disk I/O, but the optimal configuration can depend heavily on the specific workload and hardware, so it’s important to monitor performance and adjust settings as necessary. So, in essence, InnoDB uses a tweaked LRU algorithm for its cache eviction policy. This approach is designed to optimize the balance between frequently accessed pages and less frequently accessed pages, leading to more efficient use of the buffer pool and overall better performance.

About Shiv Iyer 485 Articles
Open Source Database Systems Engineer with a deep understanding of Optimizer Internals, Performance Engineering, Scalability and Data SRE. Shiv currently is the Founder, Investor, Board Member and CEO of multiple Database Systems Infrastructure Operations companies in the Transaction Processing Computing and ColumnStores ecosystem. He is also a frequent speaker in open source software conferences globally.