PostgreSQL Locks Explained – Understanding PostgreSQL MVCC

PostgreSQL provides three transaction isolation levels, which control the visibility of changes made by concurrent transactions.

  • Read Committed: This isolation level allows only committed data to be read, which means a transaction will not see changes made by another concurrent transaction until they have been committed. This is the default isolation level in PostgreSQL.
  • Repeatable Read: This isolation level provides a guarantee that the same query executed twice within a transaction will return the same results, even if another concurrent transaction modifies the data between the two queries.
  • Serializable: This isolation level provides the strongest isolation guarantee, which ensures that the execution of concurrent transactions is equivalent to some serial order of their execution.

PostgreSQL Locks:

PostgreSQL uses various types of locks to ensure data consistency in concurrent environments. Locks can influence performance by blocking other transactions that need access to the same data.

There are two types of locks in PostgreSQL: row-level locks and table-level locks. Row-level locks are used to lock individual rows, while table-level locks are used to lock entire tables.

Monitoring PostgreSQL Locks:

PostgreSQL provides a view called pg_locks that can be used to monitor locks. The pg_locks view shows all locks currently held by all transactions. By analyzing the locks held by a transaction, it is possible to identify performance bottlenecks caused by locking.

Here’s an example query that can be used to monitor locks:

SELECT 
pg_stat_activity.pid AS pid, 
pg_stat_activity.query AS query, 
pg_locks.mode AS lock_mode, 
pg_locks.relation::regclass AS table, 
pg_stat_activity.state AS state, 
now() – pg_stat_activity.query_start AS query_duration 
FROM 
pg_stat_activity, 
pg_locks 
WHERE 
pg_stat_activity.pid = pg_locks.pid 
AND pg_locks.granted = true 
AND pg_locks.mode != ‘t’ 
AND pg_stat_activity.pid <> pg_backend_pid() 
ORDER BY 
query_duration DESC;

This query retrieves the process ID (PID), query, lock mode, table name, state, and query duration for all transactions currently holding locks, except for the current transaction. The results can be used to identify transactions that are blocking other transactions and causing performance bottlenecks.

Real-Time Monitoring of PostgreSQL Locks (implemented in Python) 

import psycopg2
import time
from datetime import datetime
def monitor_locks(conn):
cur = conn.cursor()
cur.execute(“””
SELECT 
pg_stat_activity.pid, 
pg_stat_activity.query, 
pg_stat_activity.state, 
pg_locks.mode, 
pg_locks.relation::regclass AS table_name,
now() – pg_stat_activity.query_start AS duration
FROM 
pg_stat_activity, 
pg_locks
WHERE 
pg_stat_activity.pid = pg_locks.pid
AND pg_locks.granted
AND pg_stat_activity.pid <> pg_backend_pid()
AND pg_locks.mode = ‘ExclusiveLock’
ORDER BY 
duration DESC;
“””)
rows = cur.fetchall()
cur.close()
data = []
for row in rows:
pid, query, state, lock_mode, table_name, duration = row
duration_seconds = duration.total_seconds()
query = query[:80] + (query[80:] and ‘…’)
data.append((pid, query, state, lock_mode, table_name, duration_seconds))
return data
def print_locks(data):
if len(data) == 0:
print(“No locks found.”)
return
print(“PID\tQuery\t\t\t\t\t\t\t\tState\t\tLock Mode\tTable Name\t\t\tDuration”)
print(“———————————————————————————————————————————————————————-“)
for row in data:
pid, query, state, lock_mode, table_name, duration_seconds = row
query = query[:80] + (query[80:] and ‘…’)
print(f”{pid}\t{query}\t{state}\t{lock_mode}\t{table_name}\t{duration_seconds:.2f}”)
if __name__ == “__main__”:
conn = psycopg2.connect(
host=”localhost”,
database=”mydatabase”,
user=”myusername”,
password=”mypassword”
)
try:
while True:
data = monitor_locks(conn)
print_locks(data)
time.sleep(1)
except KeyboardInterrupt:
pass
conn.close()

This script connects to a PostgreSQL database and monitors for exclusive locks held by other transactions. It retrieves the PID, query, state, lock mode, table name, and duration of the locks held, and prints the results in a tabular format.

The monitor_locks function executes a SQL query that joins the pg_stat_activity and pg_locks views to retrieve the necessary information about locks. The print_locks function formats and prints the lock information in a tabular format.

The time.sleep(1) statement causes the script to pause for 1 second before retrieving and printing the lock information again. This creates a real-time monitoring effect.

The script can be modified to connect to a specific PostgreSQL instance and database by changing the connection parameters in the psycopg2.connect statement.

About Shiv Iyer 422 Articles
Open Source Database Systems Engineer with a deep understanding of Optimizer Internals, Performance Engineering, Scalability and Data SRE. Shiv currently is the Founder, Investor, Board Member and CEO of multiple Database Systems Infrastructure Operations companies in the Transaction Processing Computing and ColumnStores ecosystem. He is also a frequent speaker in open source software conferences globally.