
In PostgreSQL, you can gather statistics only when they become stale by using the pg_stat_statements extension along with the track_activity_query_size configuration parameter. Follow these steps to achieve this:
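1. Enable the pg_stat_statements Extension:
- Ensure that the pg_stat_statements extension is available in your PostgreSQL database. The module must be listed in shared_preload_libraries in postgresql.conf, and changing that setting requires a server restart.
- If it's not already enabled, you can do so by executing the following SQL command as a superuser: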
CREATE EXTENSION pg_stat_statements;
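To confirm that step 1 took effect, a quick check along these lines can help. It is only a sketch and assumes you are connected to the database where the extension was created.

```sql
-- Should include pg_stat_statements; if it does not, the view cannot collect data.
SHOW shared_preload_libraries;

-- Returns one row once CREATE EXTENSION has been run in this database.
SELECT extname, extversion
FROM pg_extension
WHERE extname = 'pg_stat_statements';
```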
2. Configure the track_activity_query_size Parameter:
- Next, set the track_activity_query_size parameter in your PostgreSQL configuration file (postgresql.conf).
- This parameter determines how many bytes are reserved for the text of each currently executing query, as reported in pg_stat_activity. Since PostgreSQL 9.4, pg_stat_statements keeps full statement text in a separate file, so this limit mainly affects pg_stat_activity and older releases.
- Increase the value if your queries are longer than the default of 1024 bytes. For example, set it to 4096 to allow tracking of queries up to 4KB in size.
- After that, save the changes to the configuration file and restart the PostgreSQL server for the new configuration to take effect.
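As an alternative to editing postgresql.conf by hand, the same setting can be changed from SQL. This is a minimal sketch; the value 4096 is an arbitrary illustration, and the change still takes effect only after a server restart.

```sql
-- Writes the setting to postgresql.auto.conf; requires superuser privileges.
-- 4096 is an example value in bytes, not a recommendation.
ALTER SYSTEM SET track_activity_query_size = 4096;

-- After restarting the server, verify the value that is actually in effect.
SHOW track_activity_query_size;
```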
3. Query the pg_stat_statements View:
- Execute queries against the pg_stat_statements view to retrieve statistical information about executed queries.
- The view contains columns such as queryid, query, calls, and total_exec_time (named total_time before PostgreSQL 13).
- The view has no last_execution_time column, so it does not directly report when a query last ran. To estimate this, record the calls counter at regular intervals and check whether it has changed between snapshots.
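A query of the following shape is one way to look at the view. It is a sketch that sticks to columns present across versions (queryid, query, calls); the alias query_start and the limit of 10 are illustrative choices.

```sql
-- Ten most frequently executed statements, with the text shortened for display.
SELECT queryid,
       calls,
       left(query, 80) AS query_start
FROM pg_stat_statements
ORDER BY calls DESC
LIMIT 10;
```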
4. Determine Staleness Criteria:
- Define your criteria for determining when statistics are considered stale.
- For example, you can set a threshold such as “if a query has not been executed for a certain duration (e.g., 24 hours), consider its statistics stale.”
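The 24-hour rule above could be sketched as follows. pg_stat_statements does not itself record when a statement last ran, so this example assumes a hypothetical snapshot table, statement_snapshots, that a scheduled job fills with the calls counter. The table name, its columns, and the 24-hour interval are all illustrative assumptions.

```sql
-- Hypothetical snapshot table; name and columns are illustrative.
CREATE TABLE IF NOT EXISTS statement_snapshots (
    captured_at timestamptz NOT NULL DEFAULT now(),
    userid      oid         NOT NULL,
    dbid        oid         NOT NULL,
    queryid     bigint      NOT NULL,
    calls       bigint      NOT NULL
);

-- Run periodically (for example hourly) to record the current counters.
INSERT INTO statement_snapshots (userid, dbid, queryid, calls)
SELECT userid, dbid, queryid, calls
FROM pg_stat_statements
WHERE queryid IS NOT NULL;

-- Statements whose call count equals the newest snapshot that is at least
-- 24 hours old, meaning they have not run since that snapshot was taken.
SELECT s.queryid, s.query, s.calls
FROM pg_stat_statements AS s
JOIN LATERAL (
    SELECT p.calls
    FROM statement_snapshots AS p
    WHERE p.userid  = s.userid
      AND p.dbid    = s.dbid
      AND p.queryid = s.queryid
      AND p.captured_at <= now() - interval '24 hours'
    ORDER BY p.captured_at DESC
    LIMIT 1
) AS old ON true
WHERE s.calls = old.calls;
```

Counters restart when pg_stat_statements_reset() is called or when an entry is evicted from the view, so a comparison like this is an estimate rather than an exact record of the last execution.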
5. Schedule Statistics Collection:
- Implement a scheduled task or script that regularly checks the pg_stat_statements view. Use your defined criteria to identify queries with stale statistics.
- When the script detects stale statistics, execute the ANALYZE command on the tables that the query reads. ANALYZE operates on tables and columns, not on individual queries.
- You can use dynamic SQL along with the EXECUTE command in PostgreSQL to run the ANALYZE command for each identified table.
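The dynamic SQL step might look like the sketch below. Because ANALYZE takes table names and pg_stat_statements does not say which tables a statement touches, the example assumes a hypothetical table, analyze_targets, that your checking script fills with the tables behind the stale queries. That table and its columns are assumptions made for illustration.

```sql
-- Hypothetical work list, populated by the staleness check.
CREATE TABLE IF NOT EXISTS analyze_targets (
    schema_name text NOT NULL,
    table_name  text NOT NULL,
    PRIMARY KEY (schema_name, table_name)
);

DO $$
DECLARE
    target record;
BEGIN
    FOR target IN
        SELECT schema_name, table_name FROM analyze_targets
    LOOP
        -- %I quotes each identifier safely before the statement is executed.
        EXECUTE format('ANALYZE %I.%I', target.schema_name, target.table_name);
        RAISE NOTICE 'Analyzed %.%', target.schema_name, target.table_name;
    END LOOP;
END
$$;
```

A block like this can be saved to a file and run on a schedule, for instance from cron through psql.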
By using the pg_stat_statements extension and periodically monitoring the pg_stat_statements view for staleness, you can selectively gather statistics for the tables behind queries that have not been executed for a specified duration. This approach allows you to maintain up-to-date statistics while minimizing the overhead of analyzing every table.