How is the transaction model implemented in PostgreSQL?

How is the transaction model implemented in PostgreSQL?


In PostgreSQL, transactions are implemented using the Multi-Version Concurrency Control (MVCC) model. This means that each transaction is given a snapshot of the database at the start of the transaction, and all the changes made by the transaction are only visible to other transactions after the transaction has been committed. This allows multiple transactions to run concurrently without interfering with each other. PostgreSQL also supports standard SQL-compliant transaction control statements like BEGIN, COMMIT, and ROLLBACK, as well as advanced features like savepoints and two-phase commit.

Transaction Isolation Levels – PostgreSQL 

PostgreSQL supports the following transaction isolation modes:

  1. Read Committed: This is the default isolation level in PostgreSQL. It guarantees that a transaction will only see data that has been committed by other transactions before the current transaction began.
  2. Repeatable Read: This isolation level ensures that a transaction will only see data that has been committed by other transactions before the current transaction began, and that any data read by the current transaction will not be changed by other transactions until the current transaction is committed.
  3. Serializable: This isolation level guarantees that a transaction will only see data that has been committed by other transactions before the current transaction began, and that any data read or modified by the current transaction will not be changed by other transactions until the current transaction is committed.
  4. Read Uncommitted: This isolation level is not recommended, as it allows a transaction to see data that has been modified by other transactions but not yet committed. This can lead to dirty reads and other inconsistencies.

PostgreSQL Transaction Isolation Levels explained with an example

The READ UNCOMMITTED isolation level in PostgreSQL allows a transaction to read rows that have been modified by other transactions, but not yet committed. This means that the transaction can read dirty data, which can lead to non-repeatable reads and phantom reads.

Here’s an example of how READ UNCOMMITTED isolation level can be used in PostgreSQL:

In this example, the transaction starts with the BEGIN statement, sets the isolation level to READ UNCOMMITTED, then selects all rows from the “mytable” table, and finally commits the transaction.

It is important to note that in PostgreSQL, the default isolation level is READ COMMITTED, and it is generally not recommended to use READ UNCOMMITTED in production environments as it can lead to data inconsistencies and other issues.

The READ COMMITTED isolation level in PostgreSQL ensures that a transaction can only read data that has been committed by other transactions. This means that a transaction cannot read data that is still in the process of being modified by other transactions.

Here’s an example of how the READ COMMITTED isolation level works:

  1. Transaction 1 begins and updates the value of a certain row in a table.
  2. Transaction 2 begins and wants to read the value of the same row, but the value has not been committed by Transaction 1 yet. Therefore, Transaction 2 is blocked until Transaction 1 commits its changes.
  3. Transaction 1 commits its changes, and Transaction 2 is able to read the updated value of the row.
  4. Transaction 2 continues with its operations, reading the values that have been committed by other transactions.

In this example, Transaction 2 only reads data that is already committed, ensuring data consistency and preventing dirty reads.

You can set the isolation level for a session using the command SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL READ COMMITTED;

The REPEATABLE READ isolation level in PostgreSQL ensures that all data read by a transaction is consistent with a single point in time, and that any data written by other transactions is not visible to the current transaction. This is achieved by acquiring shared locks on all rows read by the transaction.

Here is an example of using the Repeatable Read isolation level in PostgreSQL:

In this example, the first SELECT statement acquires shared locks on all rows in the “mytable” table, ensuring that the data read is consistent with a single point in time. The UPDATE statement then modifies one of the rows, but the changes are not visible to the second SELECT statement because it is still in the same transaction and the shared locks prevent other transactions from modifying the data. The COMMIT statement releases the shared locks.

The SERIALIZABLE isolation level in PostgreSQL is the highest level of isolation and guarantees that a transaction will see a snapshot of the database as it existed at the start of the transaction. This means that a transaction will not be able to see any changes made by other transactions until it has committed. The following is an example of how to set the isolation level to Serializable in PostgreSQL:

In this example, we start a new transaction using the BEGIN statement, set the isolation level to Serializable using the SET TRANSACTION ISOLATION LEVEL SERIALIZABLE statement, perform some queries, and then commit the transaction using the COMMIT statement.

It is important to note that the Serializable isolation level can cause performance issues, such as deadlocks, as it prevents other transactions from making changes to the same data until the current transaction has committed. It is recommended to use this isolation level only when it is necessary for the application’s requirements.

Conclusion

Understanding PostgreSQL’s Multi-Version Concurrency Control (MVCC) is important because it is a key feature that enables PostgreSQL to provide high levels of concurrency and consistency. It allows multiple transactions to access the same data simultaneously without causing conflicts. In MVCC, each transaction sees a snapshot of the data as it existed at the start of the transaction, and any changes made by the transaction are only visible to that transaction until it is committed. This allows for high concurrency and consistency, as well as the ability to rollback or recover from errors. It also allows for the implementation of isolation levels such as READ COMMITTED and SERIALIZABLE, which are essential for ensuring data consistency in a concurrent environment. Understanding MVCC is important for developers, database administrators, and anyone working with PostgreSQL to ensure that their transactions are running correctly and efficiently.

About Shiv Iyer 466 Articles
Open Source Database Systems Engineer with a deep understanding of Optimizer Internals, Performance Engineering, Scalability and Data SRE. Shiv currently is the Founder, Investor, Board Member and CEO of multiple Database Systems Infrastructure Operations companies in the Transaction Processing Computing and ColumnStores ecosystem. He is also a frequent speaker in open source software conferences globally.