PostgreSQL HA Setup with Pgpool-II :Ultimate Configuration Guide

Table of Contents

PostgreSQL 16 High Availability Setup with Pgpool-II: Complete Implementation Guide

High-availability database systems are essential for modern applications that demand minimal downtime and seamless failover. PostgreSQL 16, when combined with Pgpool-II, offers a robust solution. It enables connection pooling, load balancing, and automatic failover.

Understanding PostgreSQL HA Architecture

PostgreSQL 16 introduces enhanced replication and better performance, making it ideal for HA deployments. By using Pgpool-II, you can build a fault-tolerant cluster. This setup automatically handles primary failures while ensuring data consistency.

The typical architecture consists of:

Primary PostgreSQL server (read/write operations)
One or more standby servers (read-only replicas)
Pgpool-II middleware layer for connection management
Shared storage or streaming replication for data synchronisation

Prerequisites and System Requirements

Before implementing the high availability setup, ensure your environment meets these requirements:

Ubuntu 22.04 LTS or CentOS 8+ (minimum 3 servers)
PostgreSQL 16 installed on all nodes
Pgpool-II 4.4+ installed
Network connectivity between all nodes
Sufficient disk space for database storage and WAL files
Root or sudo access on all servers

Step 1: PostgreSQL 16 Primary Server Configuration

First, configure the primary PostgreSQL server with appropriate replication settings:

# Install PostgreSQL 16
sudo apt update
sudo apt install postgresql-16 postgresql-contrib-16

# Configure postgresql.conf
sudo nano /etc/postgresql/16/main/postgresql.conf

# Primary server configuration
listen_addresses = '*'
port = 5432
max_connections = 200
shared_buffers = 256MB
wal_level = replica
max_wal_senders = 10
max_replication_slots = 10
hot_standby = on
archive_mode = on
archive_command = 'cp %p /var/lib/postgresql/16/archive/%f'

Configure pg_hba.conf for replication access:

sudo nano /etc/postgresql/16/main/pg_hba.conf

# Replication connections
host replication replicator 192.168.1.0/24 md5
host all pgpool 192.168.1.0/24 md5
host all all 192.168.1.0/24 md5

Create replication user and restart PostgreSQL:

-- Connect to PostgreSQL as superuser
CREATE USER replicator WITH REPLICATION ENCRYPTED PASSWORD 'replica_password';
CREATE USER pgpool WITH ENCRYPTED PASSWORD 'pgpool_password';
GRANT pg_monitor TO pgpool;

sudo systemctl restart postgresql
sudo systemctl enable postgresql

Step 2: Standby Server Configuration

Configure the standby servers for streaming replication:

# Stop PostgreSQL on standby server
sudo systemctl stop postgresql

# Remove existing data directory
sudo rm -rf /var/lib/postgresql/16/main/*

# Create base backup from primary
sudo -u postgres pg_basebackup -h 192.168.1.10 -D /var/lib/postgresql/16/main -U replicator -W -v -P

Create standby.signal file and configure recovery:

# Create standby signal
sudo -u postgres touch /var/lib/postgresql/16/main/standby.signal

# Configure postgresql.conf for standby
sudo nano /etc/postgresql/16/main/postgresql.conf

# Standby server configuration
primary_conninfo = 'host=192.168.1.10 port=5432 user=replicator password=replica_password'
hot_standby = on
max_connections = 200

Start the standby server:

sudo systemctl start postgresql
sudo systemctl enable postgresql

Step 3: Pgpool-II Installation and Configuration

Install Pgpool-II on a dedicated server or the primary node:

# Install Pgpool-II
sudo apt install pgpool2

# Create pgpool configuration directory
sudo mkdir -p /etc/pgpool2
sudo chown postgres:postgres /etc/pgpool2

Configure the main Pgpool-II settings:

sudo nano /etc/pgpool2/pgpool.conf

# Connection settings
listen_addresses = '*'
port = 9999
socket_dir = '/var/run/postgresql'

# Backend connections
backend_hostname0 = '192.168.1.10'
backend_port0 = 5432
backend_weight0 = 1
backend_data_directory0 = '/var/lib/postgresql/16/main'
backend_flag0 = 'ALLOW_TO_FAILOVER'

backend_hostname1 = '192.168.1.11'
backend_port1 = 5432
backend_weight1 = 1
backend_data_directory1 = '/var/lib/postgresql/16/main'
backend_flag1 = 'ALLOW_TO_FAILOVER'

# Authentication
enable_pool_hba = on
pool_passwd = 'pool_passwd'

# Load balancing
load_balance_mode = on
master_slave_mode = on
master_slave_sub_mode = 'stream'

# Health checking
health_check_period = 10
health_check_timeout = 20
health_check_user = 'pgpool'
health_check_password = 'pgpool_password'
health_check_database = 'postgres'

# Failover settings
failover_command = '/etc/pgpool2/failover.sh %d %h %p %D %m %H %M %P %r %R'
follow_master_command = '/etc/pgpool2/follow_master.sh %d %h %p %D %m %H %M %P %r %R'

# Watchdog settings
use_watchdog = on
wd_hostname = '192.168.1.12'
wd_port = 9000

# Other watchdog nodes
other_pgpool_hostname0 = '192.168.1.13'
other_pgpool_port0 = 9999
other_wd_port0 = 9000

Step 4: Implementing Failover Scripts

Create automated failover scripts for seamless primary server switching:

sudo nano /etc/pgpool2/failover.sh

#!/bin/bash
# Failover script for Pgpool-II

FALLING_NODE=$1
OLD_PRIMARY=$2
OLD_PRIMARY_PORT=$3
NEW_PRIMARY=$4

# Log failover event
echo "$(date): Failover triggered - Node $FALLING_NODE failed" >> /var/log/pgpool_failover.log

# Promote standby to primary
if [ $FALLING_NODE -eq 0 ]; then
    ssh postgres@$NEW_PRIMARY "pg_ctl promote -D /var/lib/postgresql/16/main"
    echo "$(date): Promoted $NEW_PRIMARY to primary" >> /var/log/pgpool_failover.log
fi

exit 0

sudo nano /etc/pgpool2/follow_master.sh

#!/bin/bash
# Follow master script for Pgpool-II

FALLING_NODE=$1
NEW_PRIMARY=$2
NEW_PRIMARY_PORT=$3
NEW_PRIMARY_DATA_DIR=$4

# Log follow master event
echo "$(date): Follow master triggered for node $FALLING_NODE" >> /var/log/pgpool_follow_master.log

# Reconfigure standby to follow new primary
if [ $FALLING_NODE -ne 0 ]; then
    ssh postgres@$NEW_PRIMARY "pg_ctl restart -D $NEW_PRIMARY_DATA_DIR"
fi

exit 0

Make scripts executable:

sudo chmod +x /etc/pgpool2/failover.sh
sudo chmod +x /etc/pgpool2/follow_master.sh
sudo chown postgres:postgres /etc/pgpool2/*.sh

Step 5: Configuring Authentication and Security

Set up password authentication for Pgpool-II:

# Generate encrypted passwords
sudo -u postgres pg_md5 -m -u pgpool pgpool_password

# Create pool_passwd file
sudo nano /etc/pgpool2/pool_passwd

pgpool:md5d6a35858d61d85e4a82ab1fb044aba9d5
replicator:md5f7c3bc1d808e04732adf679965ccc34ca

Configure pool_hba.conf for client authentication:

sudo nano /etc/pgpool2/pool_hba.conf

# Pool HBA configuration
local   all         all                               trust
host    all         all         127.0.0.1/32          md5
host    all         all         192.168.1.0/24        md5
host    all         all         ::1/128               md5

Step 6: Testing High Availability Setup

Start Pgpool-II and test the configuration:

# Start Pgpool-II
sudo systemctl start pgpool2
sudo systemctl enable pgpool2

# Check Pgpool-II status
sudo -u postgres psql -h localhost -p 9999 -U pgpool -d postgres -c "SHOW POOL_NODES;"

Test failover functionality:

# Simulate primary failure
sudo systemctl stop postgresql

# Check automatic failover
sudo -u postgres psql -h localhost -p 9999 -U pgpool -d postgres -c "SHOW POOL_NODES;"

# Verify new primary is accepting writes
sudo -u postgres psql -h localhost -p 9999 -U pgpool -d postgres -c "CREATE TABLE test_failover(id serial, data text);"

Step 7: Monitoring and Maintenance

Implement comprehensive monitoring for your high availability setup:

# Create monitoring script
sudo nano /usr/local/bin/pgpool_monitor.sh

#!/bin/bash
# PostgreSQL and Pgpool-II monitoring script

LOG_FILE="/var/log/pgpool_monitor.log"
EMAIL="admin@yourdomain.com"

# Check Pgpool-II status
if ! systemctl is-active --quiet pgpool2; then
    echo "$(date): Pgpool-II is not running" >> $LOG_FILE
    echo "Pgpool-II service is down" | mail -s "Database Alert" $EMAIL
fi

# Check PostgreSQL nodes
for node in 192.168.1.10 192.168.1.11; do
    if ! pg_isready -h $node -p 5432 -q; then
        echo "$(date): PostgreSQL node $node is not responding" >> $LOG_FILE
        echo "PostgreSQL node $node is down" | mail -s "Database Alert" $EMAIL
    fi
done

# Check replication lag
LAG=$(sudo -u postgres psql -h localhost -p 9999 -U pgpool -d postgres -t -c "SELECT EXTRACT(EPOCH FROM (now() - pg_last_xact_replay_timestamp()));" 2>/dev/null)

if [ ! -z "$LAG" ] && [ $(echo "$LAG > 300" | bc) -eq 1 ]; then
    echo "$(date): Replication lag is $LAG seconds" >> $LOG_FILE
    echo "High replication lag detected: $LAG seconds" | mail -s "Database Alert" $EMAIL
fi

Set up automated monitoring:

sudo chmod +x /usr/local/bin/pgpool_monitor.sh
sudo crontab -e

# Monitor PostgreSQL HA setup every 5 minutes
*/5 * * * * /usr/local/bin/pgpool_monitor.sh

Performance Optimization and Best Practices

Optimize your PostgreSQL 16 and Pgpool-II configuration for maximum performance:

-- PostgreSQL performance tuning
ALTER SYSTEM SET shared_buffers = '25% of RAM';
ALTER SYSTEM SET effective_cache_size = '75% of RAM';
ALTER SYSTEM SET maintenance_work_mem = '256MB';
ALTER SYSTEM SET checkpoint_completion_target = 0.9;
ALTER SYSTEM SET wal_buffers = '16MB';
ALTER SYSTEM SET default_statistics_target = 100;
SELECT pg_reload_conf();

Configure Pgpool-II connection pooling:

# Connection pooling optimization
num_init_children = 32
max_pool = 4
child_life_time = 300
child_max_connections = 0
connection_life_time = 0
client_idle_limit = 0

Troubleshooting Common Issues

Address frequent high availability setup problems:

Split-brain scenarios:

# Check for split-brain condition
sudo -u postgres psql -h 192.168.1.10 -p 5432 -c "SELECT pg_is_in_recovery();"
sudo -u postgres psql -h 192.168.1.11 -p 5432 -c "SELECT pg_is_in_recovery();"

Replication lag issues:

-- Monitor replication status
SELECT client_addr, state, sent_lsn, write_lsn, flush_lsn, replay_lsn,
       write_lag, flush_lag, replay_lag
FROM pg_stat_replication;

Connection pool exhaustion:

# Monitor active connections
sudo -u postgres psql -h localhost -p 9999 -U pgpool -d postgres -c "SHOW POOL_PROCESSES;"

This PostgreSQL 16 high availability setup with Pgpool-II delivers robust failover, efficient load balancing, and effective connection pooling for production environments. By continuously monitoring, testing, and maintaining the system, you ensure optimal performance and long-term reliability.

Moreover, the implementation thoroughly addresses every critical aspect—from initial configuration to advanced monitoring. As a result, you can confidently deploy a production-ready PostgreSQL HA cluster that supports enterprise workloads with minimal downtime and automated recovery.

FAQ’s

What is Pgpool-II and how does it improve PostgreSQL high availability?
Pgpool-II adds connection pooling, load balancing, and automatic failover to PostgreSQL, making HA clusters more resilient and efficient.
How does streaming replication work in PostgreSQL 16?
Streaming replication continuously sends WAL files from the primary to standby servers, allowing near real-time data synchronization.
How can I test failover in a PostgreSQL + Pgpool-II setup?
You can simulate a primary server failure and observe Pgpool-II promote a standby and update the cluster state automatically.
What are best practices for securing a Pgpool-II high availability deployment?
Use MD5 password authentication, restrict access via pool_hba.conf, and regularly audit node connectivity and roles.
Which Linux distributions are best for PostgreSQL high availability clusters?
Ubuntu 22.04 LTS and CentOS 8+ are commonly recommended due to their stability and wide support for PostgreSQL and Pgpool-II.

To Deepen Your Knowledge on PostgreSQL High Availability

PostgreSQL Performance Tuning: Optimize Speed and Efficiency
Explore techniques to improve PostgreSQL performance through effective caching and query optimization.
Monitoring the Load on a PostgreSQL Server
Learn how to monitor PostgreSQL server load to identify bottlenecks and optimize resource usage.

References

PostgreSQL High Availability, Load Balancing, and Replication

PostgreSQL Monitoring Database Activity

Implementing High Availability in PostgreSQL: A Step-By-Step Guide to Setting Up Streaming Replication

How to implement connection pooling in PostgreSQL?

The WebScale Database Infrastructure Architecture, Engineering and Operations Company

Full-Stack Database Engineering & Cloud DBaaS Solutions for PostgreSQL, MySQL, MongoDB & More | Performance, Scalability, High Availability, Security & Analytics Experts

PostgreSQL 16 High Availability with Pgpool-II: Ultimate Step-by-Step Configuration Guide

PostgreSQL 16 High Availability Setup with Pgpool-II: Complete Implementation Guide

Understanding PostgreSQL HA Architecture

Prerequisites and System Requirements

Step 1: PostgreSQL 16 Primary Server Configuration

Step 2: Standby Server Configuration

Step 3: Pgpool-II Installation and Configuration

Step 4: Implementing Failover Scripts

Step 5: Configuring Authentication and Security

Step 6: Testing High Availability Setup

Step 7: Monitoring and Maintenance

Performance Optimization and Best Practices

Troubleshooting Common Issues

FAQ’s

To Deepen Your Knowledge on PostgreSQL High Availability

References

PostgreSQL 16 High Availability Setup with Pgpool-II: Complete Implementation Guide

Understanding PostgreSQL HA Architecture

Prerequisites and System Requirements

Step 1: PostgreSQL 16 Primary Server Configuration

Step 2: Standby Server Configuration

Step 3: Pgpool-II Installation and Configuration

Step 4: Implementing Failover Scripts

Step 5: Configuring Authentication and Security

Step 6: Testing High Availability Setup

Step 7: Monitoring and Maintenance

Performance Optimization and Best Practices

Troubleshooting Common Issues

FAQ’s

To Deepen Your Knowledge on PostgreSQL High Availability

References

Related Articles

Optimize PostgreSQL Performance: Strategic Use of SKIP_DATABASE_STATS and ONLY_DATABASE_STATS

SQL Window Functions for Time-Series Analysis in PostgreSQL 17

“PostgreSQL rocks, except when it blocks”: Understanding Locks in PostgreSQL