PostgreSQL Locks Explained – Understanding PostgreSQL MVCC

PostgreSQL supports the four SQL-standard transaction isolation levels, which control the visibility of changes made by concurrent transactions.

  1. Read Uncommitted: In the SQL standard, this level permits dirty reads, meaning a transaction may read data modified by a concurrent transaction that has not yet committed. PostgreSQL accepts the Read Uncommitted syntax but treats it exactly like Read Committed, so dirty reads never actually occur in PostgreSQL.
  2. Read Committed: This isolation level allows only committed data to be read, which means a transaction will not see changes made by another concurrent transaction until they have been committed. This is the default isolation level in PostgreSQL.
  3. Repeatable Read: This isolation level guarantees that the same query executed twice within a transaction will return the same results, even if another concurrent transaction modifies the data between the two queries. If the transaction tries to update a row that a concurrent transaction has already changed and committed, PostgreSQL aborts it with a serialization failure.
  4. Serializable: This isolation level provides the strongest isolation guarantee, which ensures that the execution of concurrent transactions is equivalent to some serial order of their execution.
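The level is chosen per transaction with SET TRANSACTION ISOLATION LEVEL. As a minimal sketch, a small helper (hypothetical, for illustration only) that renders the statement for each of the four levels above:

```python
# The four isolation level names PostgreSQL accepts in SET TRANSACTION.
ISOLATION_LEVELS = (
    "READ UNCOMMITTED",  # accepted, but behaves as READ COMMITTED in PostgreSQL
    "READ COMMITTED",    # the default
    "REPEATABLE READ",
    "SERIALIZABLE",
)

def set_isolation_sql(level: str) -> str:
    """Render the SET TRANSACTION statement for a given isolation level."""
    level = level.upper()
    if level not in ISOLATION_LEVELS:
        raise ValueError(f"unknown isolation level: {level}")
    return f"SET TRANSACTION ISOLATION LEVEL {level};"
```

The statement must be issued inside a transaction block, before any query has run.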

Example:

Suppose two transactions are running concurrently: transaction A updates a record in a table, and transaction B queries the same record. Under Read Uncommitted, which PostgreSQL silently treats as Read Committed, B still does not see A's uncommitted changes. Under Read Committed, B sees A's changes only after A commits, and each new statement in B takes a fresh snapshot. Under Repeatable Read, B continues to see the data as it was when B's first query ran, even after A commits. Under Serializable, B is not blocked by A's update, because readers never block writers under MVCC; instead, if the two transactions' behavior cannot be reconciled with any serial execution order, one of them is aborted with a serialization failure and must be retried.
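The visibility rules in this example can be sketched with a toy snapshot model in pure Python (a teaching aid, not PostgreSQL's actual implementation): Read Committed takes a fresh snapshot per statement, while Repeatable Read reuses the snapshot taken by its first statement.

```python
class ToyMVCC:
    """Toy model: versions are (commit_id, value) pairs; a snapshot sees
    the latest version committed at or before the snapshot point."""

    def __init__(self):
        self.versions = [(0, "v0")]  # committed history of one row
        self.commit_counter = 0

    def commit(self, value):
        self.commit_counter += 1
        self.versions.append((self.commit_counter, value))

    def read(self, snapshot):
        # Latest version whose commit id is visible to this snapshot.
        return max((v for v in self.versions if v[0] <= snapshot),
                   key=lambda v: v[0])[1]

db = ToyMVCC()

# READ COMMITTED: each statement takes a fresh snapshot.
first_read = db.read(db.commit_counter)   # sees "v0"
db.commit("v1")                           # transaction A commits an update
second_read = db.read(db.commit_counter)  # fresh snapshot now sees "v1"

# REPEATABLE READ: the snapshot from the first statement is reused.
rr_snapshot = 0                 # snapshot taken before any concurrent commit
rr_first = db.read(rr_snapshot)   # sees "v0"
db.commit("v2")                   # another concurrent commit
rr_second = db.read(rr_snapshot)  # still sees "v0"
```

The model ignores rollbacks, in-progress transactions, and write conflicts, but it captures why the two levels answer the same query differently.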

PostgreSQL Locks:

PostgreSQL uses various types of locks to ensure data consistency in concurrent environments. Locks can influence performance by blocking other transactions that need access to the same data.

PostgreSQL locks operate at two main granularities: row-level locks, which lock individual rows, and table-level locks, which lock entire tables.
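Table-level locks come in eight modes, and whether two transactions block each other depends on whether their modes conflict. A small sketch of that compatibility check, with the conflict sets transcribed from the conflict table in the PostgreSQL documentation:

```python
# PostgreSQL's eight table-level lock modes, mapped to the set of modes
# each one conflicts with (per the PostgreSQL documentation's conflict table).
CONFLICTS = {
    "ACCESS SHARE":           {"ACCESS EXCLUSIVE"},
    "ROW SHARE":              {"EXCLUSIVE", "ACCESS EXCLUSIVE"},
    "ROW EXCLUSIVE":          {"SHARE", "SHARE ROW EXCLUSIVE",
                               "EXCLUSIVE", "ACCESS EXCLUSIVE"},
    "SHARE UPDATE EXCLUSIVE": {"SHARE UPDATE EXCLUSIVE", "SHARE",
                               "SHARE ROW EXCLUSIVE", "EXCLUSIVE",
                               "ACCESS EXCLUSIVE"},
    "SHARE":                  {"ROW EXCLUSIVE", "SHARE UPDATE EXCLUSIVE",
                               "SHARE ROW EXCLUSIVE", "EXCLUSIVE",
                               "ACCESS EXCLUSIVE"},
    "SHARE ROW EXCLUSIVE":    {"ROW EXCLUSIVE", "SHARE UPDATE EXCLUSIVE",
                               "SHARE", "SHARE ROW EXCLUSIVE",
                               "EXCLUSIVE", "ACCESS EXCLUSIVE"},
    "EXCLUSIVE":              {"ROW SHARE", "ROW EXCLUSIVE",
                               "SHARE UPDATE EXCLUSIVE", "SHARE",
                               "SHARE ROW EXCLUSIVE", "EXCLUSIVE",
                               "ACCESS EXCLUSIVE"},
    "ACCESS EXCLUSIVE":       set(),  # filled in below: conflicts with all
}
CONFLICTS["ACCESS EXCLUSIVE"] = set(CONFLICTS)  # every mode, including itself

def conflicts(held: str, requested: str) -> bool:
    """True if a transaction holding `held` blocks one requesting `requested`."""
    return requested in CONFLICTS[held]
```

For example, a plain SELECT (ACCESS SHARE) coexists with an UPDATE (ROW EXCLUSIVE), but an ALTER TABLE (ACCESS EXCLUSIVE) blocks even plain SELECTs.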

Monitoring PostgreSQL Locks:

PostgreSQL provides a view called pg_locks that can be used to monitor locks. The pg_locks view lists every lock currently held or awaited by active server processes. By analyzing the locks held by a transaction, it is possible to identify performance bottlenecks caused by locking.

Here’s an example query that can be used to monitor locks:

SELECT
    pg_stat_activity.pid                 AS pid,
    pg_stat_activity.query               AS query,
    pg_locks.mode                        AS lock_mode,
    pg_locks.relation::regclass          AS table_name,
    pg_stat_activity.state               AS state,
    now() - pg_stat_activity.query_start AS query_duration
FROM pg_stat_activity
JOIN pg_locks ON pg_stat_activity.pid = pg_locks.pid
WHERE pg_locks.granted
  AND pg_locks.relation IS NOT NULL
  AND pg_stat_activity.pid <> pg_backend_pid()
ORDER BY query_duration DESC;

This query retrieves the process ID (PID), query, lock mode, table name, state, and query duration for all transactions currently holding locks, except for the current transaction. The results can be used to identify transactions that are blocking other transactions and causing performance bottlenecks.
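When a bottleneck is suspected, PostgreSQL 9.6 and later can report the blockers directly via the pg_blocking_pids() function. A sketch of a follow-up query, wrapped in a small helper (the helper name is illustrative) that runs it over an open psycopg2 connection:

```python
# Pairs each waiting backend with the backends blocking it, using
# pg_blocking_pids() (available since PostgreSQL 9.6).
BLOCKING_QUERY = """
SELECT
    waiting.pid    AS waiting_pid,
    waiting.query  AS waiting_query,
    blocking.pid   AS blocking_pid,
    blocking.query AS blocking_query
FROM pg_stat_activity AS waiting
JOIN pg_stat_activity AS blocking
  ON blocking.pid = ANY (pg_blocking_pids(waiting.pid))
ORDER BY waiting.pid;
"""

def fetch_blocking_pairs(conn):
    """Run the blocking-pairs query on an open psycopg2 connection."""
    with conn.cursor() as cur:
        cur.execute(BLOCKING_QUERY)
        return cur.fetchall()
```

An empty result means no backend is currently waiting on another, so any slowness is unlikely to be lock contention.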

Real-Time Monitoring of PostgreSQL Locks (implemented in Python) 

import time

import psycopg2

LOCK_QUERY = """
SELECT
    pg_stat_activity.pid,
    pg_stat_activity.query,
    pg_stat_activity.state,
    pg_locks.mode,
    pg_locks.relation::regclass AS table_name,
    now() - pg_stat_activity.query_start AS duration
FROM pg_stat_activity
JOIN pg_locks ON pg_stat_activity.pid = pg_locks.pid
WHERE pg_locks.granted
  AND pg_stat_activity.pid <> pg_backend_pid()
  AND pg_locks.mode = 'ExclusiveLock'
ORDER BY duration DESC;
"""


def truncate(query, width=80):
    """Trim long query text for display."""
    return query[:width] + ("..." if len(query) > width else "")


def monitor_locks(conn):
    """Fetch exclusive locks held by other backends."""
    with conn.cursor() as cur:
        cur.execute(LOCK_QUERY)
        rows = cur.fetchall()
    data = []
    for pid, query, state, lock_mode, table_name, duration in rows:
        data.append((pid, truncate(query), state, lock_mode,
                     table_name, duration.total_seconds()))
    return data


def print_locks(data):
    """Print the lock information in a tab-separated table."""
    if not data:
        print("No locks found.")
        return
    print("PID\tQuery\tState\tLock Mode\tTable Name\tDuration")
    print("-" * 120)
    for pid, query, state, lock_mode, table_name, duration_seconds in data:
        print(f"{pid}\t{query}\t{state}\t{lock_mode}\t"
              f"{table_name}\t{duration_seconds:.2f}")


if __name__ == "__main__":
    conn = psycopg2.connect(
        host="localhost",
        database="mydatabase",
        user="myusername",
        password="mypassword",
    )
    try:
        while True:
            print_locks(monitor_locks(conn))
            time.sleep(1)
    except KeyboardInterrupt:
        pass
    finally:
        conn.close()

This script connects to a PostgreSQL database and monitors for exclusive locks held by other transactions. It retrieves the PID, query, state, lock mode, table name, and duration of the locks held, and prints the results in a tabular format.

The monitor_locks function executes a SQL query that joins the pg_stat_activity and pg_locks views to retrieve the necessary information about locks. The print_locks function formats and prints the lock information in a tabular format.

The time.sleep(1) statement causes the script to pause for 1 second before retrieving and printing the lock information again. This creates a real-time monitoring effect.

The script can be modified to connect to a specific PostgreSQL instance and database by changing the connection parameters in the psycopg2.connect statement.

About Shiv Iyer
Open Source Database Systems Engineer with a deep understanding of Optimizer Internals, Performance Engineering, Scalability and Data SRE. Shiv currently is the Founder, Investor, Board Member and CEO of multiple Database Systems Infrastructure Operations companies in the Transaction Processing Computing and ColumnStores ecosystem. He is also a frequent speaker in open source software conferences globally.
