pg_stat_statements deep dive
How much of your PostgreSQL engineering budget is silently being spent on the consequences of a misunderstood pg_stat_statements configuration?
The most useful observability metrics are the ones nobody puts on a dashboard until after the first time they were missing during an incident. Wait events, replication lag percentiles, autovacuum progress, lock graphs, slow-log digests — these are the metrics that matter at 03:00, and they are usually missing the first time a senior engineer goes looking for them. An alert that does not include a runbook link is not an alert; it is a notification of obligation. Every alert should answer two questions in its body: what is wrong, and what does the on-call engineer do next. Alerts that fail this test get tuned out within weeks, which is worse than not having the alert at all.
What the textbooks and blog posts will tell you
Wait events are the right primitive for diagnosing a busy PostgreSQL system. pg_stat_activity exposes wait_event_type and wait_event for every active backend, sampled cheaply enough that you can poll once per second without measurable overhead. Aggregating that sample over time gives you a flame graph of where the database actually spends its life. MySQL’s Performance Schema offers the equivalent measurements under a different vocabulary: events_statements_summary_by_digest for query patterns, events_waits_summary_global_by_event_name for wait events, and file_summary_by_event_name for IO patterns. Enable the right consumers, ship them to a TSDB, and you have a feedback loop on every query that ran.
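On the MySQL side, the digest table alone gets you most of the way to a pg_stat_statements-style view. A sketch of a top-queries query against it (timer columns are in picoseconds, so the unit conversions below are part of the example):

```sql
-- MySQL: top query patterns by total latency, from the statement digest table.
-- SUM_TIMER_WAIT / AVG_TIMER_WAIT are reported in picoseconds.
SELECT SCHEMA_NAME,
       COUNT_STAR                        AS calls,
       ROUND(SUM_TIMER_WAIT / 1e12, 1)  AS total_s,
       ROUND(AVG_TIMER_WAIT / 1e9, 2)   AS avg_ms,
       LEFT(DIGEST_TEXT, 100)           AS query_preview
FROM performance_schema.events_statements_summary_by_digest
ORDER BY SUM_TIMER_WAIT DESC
LIMIT 20;
```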
The conventional advice usually lands somewhere in this set of recommendations, all of which sound correct in isolation:
- Dashboards full of average-latency metrics. The average is the metric that hides the problem; p95, p99 and p99.9 are what your customers actually feel.
- Alerts on absolute thresholds (“CPU > 80%”) without rate-of-change context. A database at 85% CPU all afternoon is healthy; the same database climbing 10% per minute is mid-incident.
- Sampling intervals longer than the incident timeline. A 60-second poll on a 90-second outage shows two data points, neither of which is the incident.
- Storing query text alongside metrics in unsecured logs. PII, secrets and customer identifiers all leak this way; redaction has to happen before the metric leaves the database host.
What we have learned to actually believe
pg_stat_statements is the single most valuable extension PostgreSQL ships. It aggregates query execution stats by normalised SQL text, giving you mean, total and standard-deviation timings without any per-query overhead worth measuring. The standard reset cadence is daily; we prefer hourly snapshots stored externally for trend analysis across deployments. Slow-query logging is the cheapest form of observability and the one most teams misconfigure. The threshold should be set low enough to catch the queries you actually want to fix (we typically use 200 ms for OLTP, 5 seconds for OLAP) and the log itself should be shipped, parsed and aggregated, not tailed by a human. pg_stat_monitor, pt-query-digest and Datadog Database Monitoring all do this well.
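The thresholds and snapshot cadence above can be put in place without a restart. A minimal sketch, assuming superuser access and using a hypothetical stats_snapshots table name for the external history (the counters in pg_stat_statements are cumulative, so periodic copies are enough to compute deltas later):

```sql
-- Apply the slow-query threshold from the text (200 ms for OLTP);
-- takes effect on configuration reload, no restart needed.
ALTER SYSTEM SET log_min_duration_statement = '200ms';
SELECT pg_reload_conf();

-- Hourly snapshot sketch: create an empty history table with the same
-- columns plus a timestamp (stats_snapshots is a hypothetical name),
-- then copy the current counters into it on each run.
CREATE TABLE IF NOT EXISTS stats_snapshots AS
  SELECT now() AS captured_at, * FROM pg_stat_statements WHERE false;
INSERT INTO stats_snapshots
  SELECT now(), * FROM pg_stat_statements;
```

In practice the snapshot job would ship these rows off the database host rather than accumulate them locally, per the redaction caveat above.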
We took a call from a customer whose Grafana boards showed all green during what was, by the time we arrived, a forty-minute partial outage. The wait-event panel was missing entirely. After we added it the next day, the same dashboard would have shown the issue (heavy LWLock:WALWriteLock) within thirty seconds. The fix was three lines of YAML and a one-line PromQL query.
```yaml
# Prometheus + postgres_exporter custom metric for replication lag
# (excerpt of queries.yaml)
pg_replication:
  query: |
    SELECT CASE WHEN pg_is_in_recovery()
                THEN extract(epoch FROM now() - pg_last_xact_replay_timestamp())
                ELSE 0 END AS lag_seconds
  metrics:
    - lag_seconds:
        usage: "GAUGE"
        description: "Replica replay lag in seconds"
```
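The "one-line PromQL query" mentioned above can then be wired into an alert rule. A sketch, assuming postgres_exporter exposes the metric as pg_replication_lag_seconds and that 30 seconds is the tolerable lag for this workload (the threshold, duration, and runbook URL are all illustrative):

```yaml
# Prometheus alert rule sketch (names, threshold and URL are illustrative)
groups:
  - name: postgres-replication
    rules:
      - alert: ReplicaLagHigh
        expr: pg_replication_lag_seconds > 30
        for: 2m
        labels:
          severity: page
        annotations:
          summary: "Replica replay lag above 30s for 2 minutes"
          runbook_url: "https://wiki.example.com/runbooks/replica-lag"
```

Note the runbook_url annotation: it is what makes this an alert rather than, as the text above puts it, a notification of obligation.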
Why the distinction matters in production
Wait events and pg_stat_statements answer different questions. The wait-event sample tells you what active sessions are blocked on right now; pg_stat_statements tells you where cumulative execution time has gone since the last reset. During an incident you need both views, and the two queries below provide them.
```sql
-- PostgreSQL wait-event flame source
SELECT wait_event_type, wait_event, COUNT(*) AS sessions
FROM pg_stat_activity
WHERE state = 'active' AND pid <> pg_backend_pid()
GROUP BY 1, 2
ORDER BY sessions DESC;
```
```sql
-- Top queries by total time over a window
SELECT calls,
       round(total_exec_time::numeric, 0)  AS total_ms,
       round(mean_exec_time::numeric, 1)   AS mean_ms,
       round(stddev_exec_time::numeric, 1) AS stddev_ms,
       LEFT(query, 100)                    AS query_preview
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 20;
```
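Because pg_stat_statements counters are cumulative, the totals above mix hours of history. If hourly copies of the view are kept in a history table, as the text recommends, a windowed version falls out of a delta between the two most recent snapshots. A sketch, assuming a hypothetical stats_snapshots table holding timestamped copies of pg_stat_statements, with no reset between snapshots:

```sql
-- Execution time gained per statement between the two most recent snapshots.
WITH latest AS (
  SELECT * FROM stats_snapshots
  WHERE captured_at = (SELECT max(captured_at) FROM stats_snapshots)
),
previous AS (
  SELECT * FROM stats_snapshots
  WHERE captured_at = (
    SELECT max(captured_at) FROM stats_snapshots
    WHERE captured_at < (SELECT max(captured_at) FROM stats_snapshots))
)
SELECT l.queryid,
       l.calls - p.calls                                         AS calls_delta,
       round((l.total_exec_time - p.total_exec_time)::numeric, 0) AS ms_delta,
       LEFT(l.query, 100)                                         AS query_preview
FROM latest l
JOIN previous p USING (queryid)
ORDER BY ms_delta DESC
LIMIT 20;
```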
What we recommend you do instead
Where possible, treat pg_stat_statements as a code review concern: a peer should challenge configuration changes the same way they would challenge an application code change, with explicit acceptance criteria and a documented rollback plan. This single cultural shift removes more outages than any individual parameter tweak.
Finally, remember that documentation is a force multiplier. Every diagnostic command, every tuning decision, every runbook step that lives in a shared system rather than in someone’s head is a step closer to a PostgreSQL estate that does not depend on a single hero engineer being awake.
It is worth emphasising that pg_stat_statements in PostgreSQL is not a static topic. The engine, the cloud platforms it runs on, the storage technologies it uses and the workloads pushed through it all evolve, which means any configuration you ship today should be considered a snapshot rather than a permanent answer.
When MinervaDB takes over a PostgreSQL estate as part of an enterprise support engagement, the first thirty days almost always include a structured review of pg_stat_statements, because the gains here are usually larger and faster than any other intervention available in the first month.
MinervaDB engineers maintain a library of internal runbooks for PostgreSQL that are updated whenever a customer engagement reveals a new pattern; if you would like a copy of the relevant runbook for pg_stat_statements, contact our team and we will share the sanitised version that we use during incident response.
- Wait events first, CPU second. The wait-event picture tells you what the database is waiting on; CPU just confirms it is busy.
- Alert on percentiles and rates of change, not absolute thresholds.
- Every alert ships with a runbook link and a clear next action.
- Sample fast, retain coarse. One-second granularity for the last hour, one-minute for the last week, one-hour for the last year.
- Treat broken exporters as outages, not informational events.
What separates a junior PostgreSQL operator from a senior one is not knowing more facts about pg_stat_statements; it is knowing which facts matter for the workload in front of them.
Frequently asked questions
How quickly can MinervaDB engineers respond to a production incident on this topic?
MinervaDB runs a 24×7 support practice with documented SLAs that vary by contract; for SEV-1 incidents on supported clusters the first engineer response is measured in minutes, not hours.
What is your typical engagement model for a one-off review?
A typical engagement starts with a short discovery call, a focused review (architecture, performance, security, cost, or topic-specific), and a written assessment with prioritised recommendations. We can then either hand it back to your team to execute, or stay engaged to implement.
Do you publish runbooks and documentation we can keep after the engagement?
Yes. Documentation and runbooks are deliverables, not afterthoughts. Everything we produce is yours to keep, with no proprietary tooling lock-in.
Do you support both self-managed and cloud-managed deployments?
Yes. We work across PostgreSQL, MySQL/MariaDB, MongoDB, SQL Server, ClickHouse, Cassandra, Redis/Valkey, Milvus, Trino and SAP HANA, on bare-metal, virtualised infrastructure, Kubernetes, and managed cloud services (Aurora, RDS, Azure SQL, Cloud SQL).
References and resources
Authoritative external references
- PostgreSQL high availability
- PostgreSQL official documentation
- Patroni HA template documentation
- PostgreSQL EXPLAIN reference
- PostgreSQL MVCC concurrency control
- PostgreSQL autovacuum configuration
Related MinervaDB articles and services
When to bring MinervaDB into the conversation
MinervaDB engineers spend their days inside production PostgreSQL environments — tuning, troubleshooting, migrating, and on-call. The articles on this site reflect what we have actually seen, in real customer engagements, not what reads well in a slide deck.
How we typically help:
- Round-the-clock production support with documented SLAs and escalation paths.
- Performance engineering and capacity planning for serious OLTP and analytical workloads.
- High-availability and disaster-recovery architecture, across regions and clouds.
- Cost optimisation for cloud database bills and self-managed platforms.
- Migration and upgrade engineering with rehearsed cutover and rollback plans.
- Security, audit, and compliance readiness for regulated industries.
Reach us: contact@minervadb.com or minervadb.com/contact. Mention pg_stat_statements when you write, and the consulting engineer will arrive with the right context.
MinervaDB — The WebScale Database Infrastructure Operations Experts.