Installation and configuration of pgBadger

pgBadger is a PostgreSQL log analyzer that parses and summarizes PostgreSQL log files to help identify performance issues and troubleshoot problems. Here are the steps to install and configure pgBadger on CentOS:

  1. Install pgBadger: Use the following commands to install pgBadger on CentOS (the package is provided by the EPEL repository, which the first command enables):

sudo yum install -y epel-release
sudo yum install -y pgbadger
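
To confirm that the installation succeeded, you can print the installed version:

pgbadger --version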

  2. Configure PostgreSQL to log to a file: To use pgBadger, you need to configure PostgreSQL to log its activity to a file. You can do this by modifying the postgresql.conf file as follows:

# Modify the following parameters in postgresql.conf
logging_collector = on
log_directory = '/var/lib/pgsql/13/data/pg_log'
log_filename = 'postgresql-%a.log'
log_rotation_age = 1d
log_rotation_size = 0
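
Depending on what you want to analyze, you may also want to enable some additional logging parameters. The settings below follow the recommendations in the pgBadger documentation; treat them as a starting point, since verbose logging (especially log_min_duration_statement = 0) adds I/O overhead:

# Optional settings commonly enabled for richer pgBadger reports
log_min_duration_statement = 250   # log statements slower than 250 ms; 0 logs every statement
log_line_prefix = '%t [%p]: user=%u,db=%d,app=%a,client=%h '
log_checkpoints = on
log_connections = on
log_disconnections = on
log_lock_waits = on
log_temp_files = 0
log_autovacuum_min_duration = 0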

  3. Generate PostgreSQL log files: After configuring PostgreSQL to log to a file, restart the PostgreSQL service to apply the changes, for example with the command shown below. PostgreSQL will then write log files to the specified directory.
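
# The exact service name depends on how PostgreSQL was installed; for the
# PGDG PostgreSQL 13 packages on CentOS it is typically postgresql-13
sudo systemctl restart postgresql-13
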
  4. Analyze PostgreSQL log files using pgBadger: Use the following command to analyze the PostgreSQL log files with pgBadger:

pgbadger /var/lib/pgsql/13/data/pg_log/postgresql-*.log -o report.html

This command analyzes all PostgreSQL log files in the specified directory and generates an HTML report named “report.html” in the current directory.
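
If your log files are large, parsing can be sped up with pgBadger's parallel mode. For example, the following command uses four parser processes (adjust -j to the number of CPU cores you can spare):

pgbadger -j 4 /var/lib/pgsql/13/data/pg_log/postgresql-*.log -o report.html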

Here is an example of the kind of summary pgBadger produces (shown here as plain text; pgBadger can also write a text report directly with the -x text option instead of HTML):

=============================================
General Statistics
=============================================
Total parsed log entries: 14417
Log line format: 1
Log begin time: 2022-04-01 00:00:00
Log end time: 2022-04-02 00:00:00
Total time covered by log: 1 day(s) 00:00:00
=============================================
Top 10 Slowest Queries
=============================================
1. 2.4 % | 98.4 ms | 1.2 % | 1095 | SELECT * FROM users WHERE id = $1
2. 1.8 % | 76.1 ms | 1.6 % | 2335 | SELECT * FROM orders WHERE user_id = $1
3. 1.2 % | 50.6 ms | 1.1 % | 2345 | UPDATE users SET last_login = $1 WHERE id = $2
4. 1.1 % | 46.2 ms | 2.0 % | 1050 | INSERT INTO orders (user_id, product_id, price) VALUES ($1, $2, $3)
5. 0.9 % | 38.2 ms | 0.8 % | 4223 | SELECT COUNT(*) FROM users
6. 0.8 % | 35.1 ms | 1.3 % | 3050 | SELECT * FROM products WHERE id = $1
7. 0.6 % | 25.4 ms | 0.7 % | 1950 | DELETE FROM users WHERE id = $1
8. 0.5 % | 21.2 ms | 0.3 % | 2132 | SELECT * FROM orders WHERE status = 'pending'
9. 0.4 % | 18.1 ms | 0.5 % | 1536 | SELECT * FROM users WHERE email = $1
10. 0.3 % | 14.5 ms | 0.1 % | 357 | SELECT COUNT(*) FROM orders WHERE status = 'shipped'

In the above example output, the “General Statistics” section provides a summary of the log files analyzed, including the total number of log entries, the log line format, and the start and end times of the log files.

The “Top 10 Slowest Queries” section lists the top 10 queries that took the longest time to execute. For each query, pgBadger provides the percentage of the total query time, the total query time in milliseconds, the percentage of the total number of queries, and the total number of times the query was executed.

Using this information, you can identify queries that are taking a long time to execute and optimize them to improve the overall database performance.
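
For example, once pgBadger has surfaced a slow query, a common next step is to inspect its execution plan with EXPLAIN ANALYZE from psql. The database name and literal value below are placeholders; substitute the query reported by pgBadger:

psql -d mydb -c "EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM users WHERE id = 42;"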

In addition to the above example, pgBadger can also generate other types of reports, such as graphs and charts, that provide a visual representation of the database activity. These reports can be useful for identifying trends and patterns in the database activity and for monitoring the performance of the database over time.

Overall, pgBadger is a powerful tool for analyzing PostgreSQL log files and identifying performance issues. By regularly analyzing the log files using pgBadger and optimizing the queries and database configuration based on the results, you can ensure that your PostgreSQL database is running efficiently and performing at its best.
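
One common way to automate this is to run pgBadger from cron in its incremental mode (-I), which maintains cumulative daily and weekly reports in an output directory. A minimal sketch, assuming the reports should be published under /var/www/pgbadger (adjust the paths and schedule to your environment):

# Crontab entry: build incremental pgBadger reports every day at 04:00
0 4 * * * /usr/bin/pgbadger -I -q -O /var/www/pgbadger /var/lib/pgsql/13/data/pg_log/postgresql-*.log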
