Monitoring PostgreSQL Disk I/O Performance

PostgreSQL performance depends heavily on the I/O subsystem, and latency there translates directly into performance problems such as slow response times and tasks timing out, so it is critical to monitor your disk usage. Disk Transfers/sec is the sum of Disk Reads/sec and Disk Writes/sec, and you can use these counters to determine whether a volume is backed by enough disks. When you use these counters, adjust the values for the type of RAID you have implemented (a worked example follows the list below):

  • RAID 0: I/Os per disk = (reads + writes) / number of disks
  • RAID 1: I/Os per disk = [reads + (2 * writes)] / 2
  • RAID 5: I/Os per disk = [reads + (4 * writes)] / number of disks
  • RAID 10: I/Os per disk = [reads + (2 * writes)] / number of disks
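
As a quick worked example with made-up numbers: a server doing 600 reads/sec and 200 writes/sec against an 8-disk RAID 5 set works out to (600 + 4 * 200) / 8 = 175 I/Os per disk, which you can evaluate straight from psql:

SELECT round((600 + 4 * 200) / 8.0, 1) AS ios_per_disk;  -- hypothetical workload: returns 175.0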

The disk counters you can monitor to measure disk activity fall into the following two groups (a PostgreSQL-side equivalent is sketched after the list):

  • Primary
    • PhysicalDisk: Avg. Disk sec/Write
    • PhysicalDisk: Avg. Disk sec/Read
  • Secondary
    • PhysicalDisk: Avg. Disk Queue Length
    • PhysicalDisk: Disk Bytes/sec
    • PhysicalDisk: Disk Transfers/sec
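
The counters above come from Windows Performance Monitor. PostgreSQL itself can report comparable read and write latency regardless of the operating system; the sketch below assumes track_io_timing = on (it is off by default) and PostgreSQL 9.2 or later, and reads the cumulative block I/O timings from pg_stat_database:

-- blk_read_time and blk_write_time are cumulative and reported in milliseconds
SELECT datname,
       blks_read,
       blks_hit,
       blk_read_time,
       blk_write_time,
       round(blk_read_time::numeric / nullif(blks_read, 0), 3) AS avg_read_ms_per_block
FROM   pg_stat_database
WHERE  datname IS NOT NULL
ORDER  BY blk_read_time DESC;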

Monitoring Disk I/O performance in Linux

Monitoring PostgreSQL Disk I/O performance

-- Run "SELECT pg_stat_reset();" first if you want to reset the counter statistics.
WITH all_tables AS
(
    SELECT  *
    FROM    (
        SELECT  'all'::text AS table_name,
                sum( coalesce(heap_blks_read,0) + coalesce(idx_blks_read,0) + coalesce(toast_blks_read,0) + coalesce(tidx_blks_read,0) ) AS from_disk,
                sum( coalesce(heap_blks_hit,0)  + coalesce(idx_blks_hit,0)  + coalesce(toast_blks_hit,0)  + coalesce(tidx_blks_hit,0)  ) AS from_cache
        FROM    pg_statio_all_tables  -- change to pg_statio_user_tables to check only user tables (excluding PostgreSQL's own system tables)
    ) a
    WHERE   (from_disk + from_cache) > 0  -- discard tables without hits
),
tables AS
(
    SELECT  *
    FROM    (
        SELECT  relname AS table_name,
                coalesce(heap_blks_read,0) + coalesce(idx_blks_read,0) + coalesce(toast_blks_read,0) + coalesce(tidx_blks_read,0) AS from_disk,
                coalesce(heap_blks_hit,0)  + coalesce(idx_blks_hit,0)  + coalesce(toast_blks_hit,0)  + coalesce(tidx_blks_hit,0)  AS from_cache
        FROM    pg_statio_all_tables  -- change to pg_statio_user_tables to check only user tables (excluding PostgreSQL's own system tables)
    ) a
    WHERE   (from_disk + from_cache) > 0  -- discard tables without hits
)
SELECT  table_name AS "table name",
        from_disk  AS "disk hits",
        round((from_disk::numeric  / (from_disk + from_cache)::numeric) * 100.0, 2) AS "% disk hits",
        round((from_cache::numeric / (from_disk + from_cache)::numeric) * 100.0, 2) AS "% cache hits",
        (from_disk + from_cache)   AS "total hits"
FROM    (SELECT * FROM all_tables UNION ALL SELECT * FROM tables) a
ORDER   BY (CASE WHEN table_name = 'all' THEN 0 ELSE 1 END), from_disk DESC;
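
The same disk-versus-cache breakdown can be produced per index. The following is a minimal companion sketch (not part of the original query) built on the standard pg_statio_user_indexes view:

SELECT  schemaname,
        relname       AS table_name,
        indexrelname  AS index_name,
        idx_blks_read AS disk_hits,
        idx_blks_hit  AS cache_hits,
        round(idx_blks_hit::numeric * 100.0 / nullif(idx_blks_read + idx_blks_hit, 0), 2) AS "% cache hits"
FROM    pg_statio_user_indexes
WHERE   (idx_blks_read + idx_blks_hit) > 0
ORDER   BY idx_blks_read DESC;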

Monitoring PostgreSQL Table Size

SELECT *, pg_size_pretty(total_bytes) AS total
        , pg_size_pretty(index_bytes) AS index
        , pg_size_pretty(toast_bytes) AS toast
        , pg_size_pretty(table_bytes) AS table
  FROM (
    SELECT *, total_bytes - index_bytes - coalesce(toast_bytes, 0) AS table_bytes
      FROM (
        SELECT c.oid, nspname AS table_schema, relname AS table_name
             , c.reltuples AS row_estimate
             , pg_total_relation_size(c.oid) AS total_bytes
             , pg_indexes_size(c.oid) AS index_bytes
             , pg_total_relation_size(reltoastrelid) AS toast_bytes
          FROM pg_class c
          LEFT JOIN pg_namespace n ON n.oid = c.relnamespace
         WHERE relkind = 'r'  -- ordinary tables only
      ) a
  ) a;
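
For a quick spot check of a single relation, the built-in sizing functions can be called directly; mytable below is a placeholder for whichever table you care about:

-- pg_relation_size() counts only the main data fork; pg_total_relation_size() adds indexes and TOAST
SELECT pg_size_pretty(pg_relation_size('mytable'))       AS table_only,
       pg_size_pretty(pg_total_relation_size('mytable')) AS table_with_indexes_and_toast;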

Finding the largest databases in your PostgreSQL cluster

SELECT d.datname AS name,
       pg_catalog.pg_get_userbyid(d.datdba) AS owner,
       CASE WHEN pg_catalog.has_database_privilege(d.datname, 'CONNECT')
            THEN pg_catalog.pg_size_pretty(pg_catalog.pg_database_size(d.datname))
            ELSE 'No Access'
       END AS size
  FROM pg_catalog.pg_database d
 ORDER BY
       CASE WHEN pg_catalog.has_database_privilege(d.datname, 'CONNECT')
            THEN pg_catalog.pg_database_size(d.datname)
            ELSE NULL
       END DESC  -- NULLs (inaccessible databases) sort first in descending order
 LIMIT 20;
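
If you only need the size of the database you are currently connected to, a one-liner is enough:

SELECT pg_size_pretty(pg_database_size(current_database())) AS current_db_size;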

Finding the total size of the biggest tables in your PostgreSQL cluster

SELECT nspname || '.' || relname AS "relation",
    pg_size_pretty(pg_total_relation_size(C.oid)) AS "total_size"
  FROM pg_class C
  LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
  WHERE nspname NOT IN ('pg_catalog', 'information_schema')
    AND C.relkind <> 'i'
    AND nspname !~ '^pg_toast'
  ORDER BY pg_total_relation_size(C.oid) DESC
  LIMIT 20;
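
A close variant restricted to indexes shows where the index_bytes reported by the earlier table-size query are going; this is a small sketch rather than part of the original article's query set:

SELECT N.nspname || '.' || C.relname AS index_name,
       pg_size_pretty(pg_relation_size(C.oid)) AS index_size
  FROM pg_class C
  LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
  WHERE C.relkind = 'i'
    AND N.nspname NOT IN ('pg_catalog', 'information_schema')
  ORDER BY pg_relation_size(C.oid) DESC
  LIMIT 20;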

Conclusion

You can use the following three questions to identify whether there is a storage bottleneck and where it most likely is:

  • Is there observed latency? (Avg. Disk sec/Read > 0.020 or Avg. Disk sec/Write > 0.020, i.e. more than 20 ms per I/O)
  • Is the latency observed on all (or most) of the disks (LUNs), or only on a single disk or a few disks (LUNs)?
    • This question helps determine whether the problem leans towards a general communication bottleneck between the server and the storage, or whether it is more likely caused by the limitations of the physical spindles.
    • If most of the disks show latency at the same time, the latency is probably caused by a communication bottleneck such as an HBA, a switch, a SAN port, or the SAN CPU.
    • If many LUNs come from the same storage device and only one or a few of them show latency, the issue is likely confined to those LUNs.
  • Finally, compare the disk throughput (Disk Transfers/sec and Disk Bytes/sec) during the period when latency was observed with the period of maximum observed throughput.
    • If latency always grows in proportion to throughput, the issue may be with the physical spindles, although this does not rule out the communication layer. Engage your storage administrator to verify that the physical spindles can handle the throughput observed with Disk Transfers/sec and Disk Bytes/sec.
    • If latency is much lower during periods of much higher activity, the bottleneck is likely not the physical spindles (JBOD), and a storage administrator should be engaged to help review the storage fabric (HBA, switches, SAN CPU, ports, and so on).


