How is checkpointing implemented in MySQL and PostgreSQL?


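Checkpointing is the process of periodically writing dirty (modified) pages from the buffer pool to disk so that the data files catch up with the write-ahead (redo) log. This bounds crash-recovery time, because recovery only has to replay log records written after the last checkpoint, and it allows older log space to be reused. MySQL and PostgreSQL implement checkpointing quite differently.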
 
In MySQL, checkpointing is implemented as part of the InnoDB storage engine. InnoDB uses fuzzy checkpointing: rather than flushing the entire buffer pool at once, background page cleaner threads (configured with innodb_page_cleaners) flush dirty pages in small batches, generally starting with the pages whose oldest modification lies furthest back in the redo log. As those pages reach disk, the checkpoint LSN (log sequence number) advances, and the redo log space before it can be reused. Flushed pages pass through the doublewrite buffer, which protects against torn pages; this is part of page flushing rather than a separate checkpoint step. A full (sharp) checkpoint, in which all dirty pages are flushed, happens only in special cases such as a clean shutdown.
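To illustrate the oldest-first flushing idea, here is a toy model in Python. It is a sketch for intuition only and does not reflect InnoDB's actual data structures; the class name ToyBufferPool and its methods are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ToyBufferPool:
    """Toy model of oldest-first flushing; not InnoDB's real implementation."""

    current_lsn: int = 0
    # page_id -> LSN at which the oldest unflushed change to that page starts
    dirty: dict = field(default_factory=dict)

    def modify(self, page_id: int, log_bytes: int) -> None:
        """Record a change that writes `log_bytes` of redo and dirties a page."""
        start_lsn = self.current_lsn
        self.current_lsn += log_bytes
        # If the page is already dirty, keep its older modification LSN.
        self.dirty.setdefault(page_id, start_lsn)

    def flush_oldest(self, batch: int) -> list:
        """Flush up to `batch` pages, oldest modification first."""
        victims = sorted(self.dirty, key=self.dirty.get)[:batch]
        for page_id in victims:
            del self.dirty[page_id]
        return victims

    def checkpoint_lsn(self) -> int:
        """All changes logged before this LSN are already in the data files."""
        return min(self.dirty.values(), default=self.current_lsn)

    def checkpoint_age(self) -> int:
        """Bytes of redo that crash recovery would have to replay."""
        return self.current_lsn - self.checkpoint_lsn()
```

In this model, flushing a small batch of the oldest pages is enough to move the checkpoint LSN forward and shrink the checkpoint age, without ever flushing the whole pool at once.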
 
Here is an example of a Python script that uses the MySQL Connector library to monitor the checkpointing process in MySQL:
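A minimal sketch of such a script follows, assuming the mysql-connector-python package is installed. The host, user and password values are placeholders, and the function names are illustrative rather than part of any library.

```python
import re


def parse_checkpoint_age(status_text):
    """Return the checkpoint age in bytes, or None if the LOG lines are missing.

    The age is derived from the LOG section of SHOW ENGINE INNODB STATUS as
    "Log sequence number" minus "Last checkpoint at".
    """
    lsn = re.search(r"Log sequence number\s+(\d+)", status_text)
    checkpoint = re.search(r"Last checkpoint at\s+(\d+)", status_text)
    if lsn is None or checkpoint is None:
        return None
    return int(lsn.group(1)) - int(checkpoint.group(1))


def fetch_innodb_status(**connect_args):
    """Run SHOW ENGINE INNODB STATUS and return the status text."""
    # Imported here so the parser above can be used without the driver.
    import mysql.connector

    conn = mysql.connector.connect(**connect_args)
    try:
        cursor = conn.cursor()
        cursor.execute("SHOW ENGINE INNODB STATUS")
        row = cursor.fetchone()  # columns: Type, Name, Status
        cursor.close()
    finally:
        conn.close()
    status = row[2]
    return status.decode() if isinstance(status, bytes) else status


def main():
    # Placeholder credentials: replace with an account that may run the query.
    status = fetch_innodb_status(host="localhost", user="monitor", password="secret")
    age = parse_checkpoint_age(status)
    if age is None:
        print("Could not find the checkpoint lines in the InnoDB status output")
    else:
        print(f"Checkpoint age: {age} bytes")
```

Calling main() prints the current checkpoint age; parse_checkpoint_age() can also be used on its own against saved status output.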
 
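This script connects to the MySQL server, retrieves the output of SHOW ENGINE INNODB STATUS, and reads the LOG section of that output. Stock MySQL does not print a “Checkpoint age” line there (some variants, such as Percona Server, do), so the age is computed as “Log sequence number” minus “Last checkpoint at”. The checkpoint age is therefore measured in bytes of redo log, not in seconds: it is the amount of redo that crash recovery would have to scan. Lower values mean less recovery work, but the number matters mainly in relation to the redo log capacity.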

 

You can schedule the script to run at regular intervals with a scheduler such as cron. If the checkpoint age stays close to the redo log capacity, consider increasing the redo log size (innodb_redo_log_capacity in MySQL 8.0.30 and later, innodb_log_file_size in earlier versions), raising innodb_io_capacity so that dirty pages are flushed faster, or moving to faster storage.

 

Note that this script is for illustrative purposes only and should be adapted to your specific MySQL configuration and needs.

 

 
In PostgreSQL, checkpointing is implemented in the core server. A background process called the “checkpointer” periodically writes all dirty pages from shared buffers to disk and then records the checkpoint in the WAL (Write-Ahead Log). A checkpoint is triggered in one of three main ways. A timed checkpoint starts when checkpoint_timeout has elapsed since the previous one. A requested checkpoint starts when the amount of WAL written since the previous checkpoint approaches max_wal_size, or when an administrator runs the CHECKPOINT command. Checkpoints are not triggered by the number of dirty pages or by memory pressure. To avoid an I/O spike, the checkpointer spreads its writes over a fraction of the checkpoint interval, controlled by checkpoint_completion_target.
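The trigger conditions can be sketched as a small decision function. This is a deliberately simplified model, not PostgreSQL's actual logic: the real server derives its WAL threshold from max_wal_size together with checkpoint_completion_target, and the function name and return values below are hypothetical. The defaults mirror the documented defaults of checkpoint_timeout (5 minutes) and max_wal_size (1 GB).

```python
def checkpoint_trigger(
    seconds_since_last,
    wal_bytes_since_last,
    checkpoint_timeout=300,      # seconds, mirrors the 5min default
    max_wal_size=1024 ** 3,      # bytes, mirrors the 1GB default
    requested=False,             # True if someone ran CHECKPOINT
):
    """Return why a checkpoint would start now, or None if none is due."""
    if requested:
        return "requested"       # explicit CHECKPOINT command
    if wal_bytes_since_last >= max_wal_size:
        return "wal"             # too much WAL since the last checkpoint
    if seconds_since_last >= checkpoint_timeout:
        return "timed"           # checkpoint_timeout has elapsed
    return None
```

A workload that keeps returning "wal" in this model corresponds to a server whose checkpoints are being forced by write volume rather than by the timer.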
 
Here is an example of a Python script that uses the psycopg2 library to monitor the checkpointing process in PostgreSQL:
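A minimal sketch of such a script follows, assuming psycopg2 is installed and the server is PostgreSQL 9.6 or later, where pg_control_checkpoint() is available. The connection string is a placeholder and the function names are illustrative.

```python
# Seconds elapsed since the checkpoint recorded in the control file.
CHECKPOINT_AGE_SQL = (
    "SELECT EXTRACT(EPOCH FROM now() - checkpoint_time) "
    "FROM pg_control_checkpoint()"
)


def checkpoint_age_seconds(conn):
    """Return the time since the last checkpoint, in seconds, as a float."""
    cursor = conn.cursor()
    try:
        cursor.execute(CHECKPOINT_AGE_SQL)
        (age,) = cursor.fetchone()
        # Depending on the server version the value may arrive as Decimal.
        return float(age)
    finally:
        cursor.close()


def main():
    # Imported here so checkpoint_age_seconds() works with any DB-API connection.
    import psycopg2

    # Placeholder connection string: adjust database, user and host as needed.
    conn = psycopg2.connect("dbname=postgres user=monitor")
    try:
        print(f"Checkpoint age: {checkpoint_age_seconds(conn):.0f} seconds")
    finally:
        conn.close()
```

Calling main() prints the current checkpoint age. Keeping the query in a separate function makes it easy to reuse with an existing connection.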
 
This script connects to the PostgreSQL server and queries the pg_control_checkpoint() function, which reports information about the most recent checkpoint, including its checkpoint_time. Note that flushing the WAL to disk is not the same as a checkpoint: functions such as pg_current_wal_flush_lsn() return a WAL position, not the time of the last checkpoint.

The query subtracts checkpoint_time from now() to calculate the time passed since the last checkpoint, and returns the result in seconds.
 
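The script prints the checkpoint age. You can schedule it to run at regular intervals with a scheduler such as cron and compare the result against checkpoint_timeout. On a busy server, checkpoints that arrive much more often than checkpoint_timeout are usually being forced by WAL volume, and increasing max_wal_size reduces their frequency. checkpoint_completion_target controls how much of the checkpoint interval the writes are spread across, which smooths the I/O load. An idle server may skip checkpoints when no WAL has been written, so a large age is not a problem by itself.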
 
Note that this script is for illustrative purposes only and should be adapted to your specific PostgreSQL configuration and needs.

 

Conclusion

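In summary, checkpointing in MySQL is implemented by the InnoDB storage engine as fuzzy checkpointing: page cleaner threads flush dirty pages in small batches and the checkpoint position advances continuously, with a full (sharp) checkpoint only in special cases such as a clean shutdown. In PostgreSQL it is implemented by a dedicated checkpointer process, which starts a checkpoint when checkpoint_timeout elapses, when WAL volume approaches max_wal_size, or when an administrator runs CHECKPOINT.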
