PostgreSQL 16 High Availability Setup with Pgpool-II

PostgreSQL 16 High Availability Setup with Pgpool-II: Complete Implementation Guide



High availability database systems are critical for modern applications that require minimal downtime and seamless failover capabilities. PostgreSQL 16, combined with Pgpool-II, provides a robust solution for achieving database high availability through connection pooling, load balancing, and automatic failover mechanisms.

Understanding PostgreSQL 16 High Availability Architecture

PostgreSQL 16 introduces enhanced replication features and improved performance metrics that make it ideal for high availability deployments. When combined with Pgpool-II, you can create a fault-tolerant database cluster that automatically handles primary server failures while maintaining data consistency.

The typical architecture consists of:

  • Primary PostgreSQL server (read/write operations)
  • One or more standby servers (read-only replicas)
  • Pgpool-II middleware layer for connection management
  • Shared storage or streaming replication for data synchronization

Prerequisites and System Requirements

Before implementing the high availability setup, ensure your environment meets these requirements:

  • Ubuntu 22.04 LTS or CentOS 8+ (minimum 3 servers)
  • PostgreSQL 16 installed on all nodes
  • Pgpool-II 4.4+ installed
  • Network connectivity between all nodes
  • Sufficient disk space for database storage and WAL files
  • Root or sudo access on all servers

Step 1: PostgreSQL 16 Primary Server Configuration

First, configure the primary PostgreSQL server with appropriate replication settings:

# Install PostgreSQL 16
sudo apt update
sudo apt install postgresql-16 postgresql-contrib-16

# Configure postgresql.conf
sudo nano /etc/postgresql/16/main/postgresql.conf
# Primary server configuration
listen_addresses = '*'
port = 5432
max_connections = 200
shared_buffers = 256MB
wal_level = replica
max_wal_senders = 10
max_replication_slots = 10
hot_standby = on
archive_mode = on
archive_command = 'cp %p /var/lib/postgresql/16/archive/%f'

Configure pg_hba.conf for replication access:

sudo nano /etc/postgresql/16/main/pg_hba.conf
# Replication connections
host replication replicator 192.168.1.0/24 md5
host all pgpool 192.168.1.0/24 md5
host all all 192.168.1.0/24 md5

Create replication user and restart PostgreSQL:

-- Connect to PostgreSQL as superuser
CREATE USER replicator WITH REPLICATION ENCRYPTED PASSWORD 'replica_password';
CREATE USER pgpool WITH ENCRYPTED PASSWORD 'pgpool_password';
GRANT pg_monitor TO pgpool;
sudo systemctl restart postgresql
sudo systemctl enable postgresql

Step 2: Standby Server Configuration

Configure the standby servers for streaming replication:

# Stop PostgreSQL on standby server
sudo systemctl stop postgresql

# Remove existing data directory
sudo rm -rf /var/lib/postgresql/16/main/*

# Create base backup from primary
sudo -u postgres pg_basebackup -h 192.168.1.10 -D /var/lib/postgresql/16/main -U replicator -W -v -P

Create standby.signal file and configure recovery:

# Create standby signal
sudo -u postgres touch /var/lib/postgresql/16/main/standby.signal

# Configure postgresql.conf for standby
sudo nano /etc/postgresql/16/main/postgresql.conf
# Standby server configuration
primary_conninfo = 'host=192.168.1.10 port=5432 user=replicator password=replica_password'
hot_standby = on
max_connections = 200

Start the standby server:

sudo systemctl start postgresql
sudo systemctl enable postgresql

Step 3: Pgpool-II Installation and Configuration

Install Pgpool-II on a dedicated server or the primary node:

# Install Pgpool-II
sudo apt install pgpool2

# Create pgpool configuration directory
sudo mkdir -p /etc/pgpool2
sudo chown postgres:postgres /etc/pgpool2

Configure the main Pgpool-II settings:

sudo nano /etc/pgpool2/pgpool.conf
# Connection settings
listen_addresses = '*'
port = 9999
socket_dir = '/var/run/postgresql'

# Backend connections
backend_hostname0 = '192.168.1.10'
backend_port0 = 5432
backend_weight0 = 1
backend_data_directory0 = '/var/lib/postgresql/16/main'
backend_flag0 = 'ALLOW_TO_FAILOVER'

backend_hostname1 = '192.168.1.11'
backend_port1 = 5432
backend_weight1 = 1
backend_data_directory1 = '/var/lib/postgresql/16/main'
backend_flag1 = 'ALLOW_TO_FAILOVER'

# Authentication
enable_pool_hba = on
pool_passwd = 'pool_passwd'

# Load balancing
load_balance_mode = on
master_slave_mode = on
master_slave_sub_mode = 'stream'

# Health checking
health_check_period = 10
health_check_timeout = 20
health_check_user = 'pgpool'
health_check_password = 'pgpool_password'
health_check_database = 'postgres'

# Failover settings
failover_command = '/etc/pgpool2/failover.sh %d %h %p %D %m %H %M %P %r %R'
follow_master_command = '/etc/pgpool2/follow_master.sh %d %h %p %D %m %H %M %P %r %R'

# Watchdog settings
use_watchdog = on
wd_hostname = '192.168.1.12'
wd_port = 9000

# Other watchdog nodes
other_pgpool_hostname0 = '192.168.1.13'
other_pgpool_port0 = 9999
other_wd_port0 = 9000

Step 4: Implementing Failover Scripts

Create automated failover scripts for seamless primary server switching:

sudo nano /etc/pgpool2/failover.sh
#!/bin/bash
# Failover script for Pgpool-II

FALLING_NODE=$1
OLD_PRIMARY=$2
OLD_PRIMARY_PORT=$3
NEW_PRIMARY=$4

# Log failover event
echo "$(date): Failover triggered - Node $FALLING_NODE failed" >> /var/log/pgpool_failover.log

# Promote standby to primary
if [ $FALLING_NODE -eq 0 ]; then
    ssh postgres@$NEW_PRIMARY "pg_ctl promote -D /var/lib/postgresql/16/main"
    echo "$(date): Promoted $NEW_PRIMARY to primary" >> /var/log/pgpool_failover.log
fi

exit 0
sudo nano /etc/pgpool2/follow_master.sh
#!/bin/bash
# Follow master script for Pgpool-II

FALLING_NODE=$1
NEW_PRIMARY=$2
NEW_PRIMARY_PORT=$3
NEW_PRIMARY_DATA_DIR=$4

# Log follow master event
echo "$(date): Follow master triggered for node $FALLING_NODE" >> /var/log/pgpool_follow_master.log

# Reconfigure standby to follow new primary
if [ $FALLING_NODE -ne 0 ]; then
    ssh postgres@$NEW_PRIMARY "pg_ctl restart -D $NEW_PRIMARY_DATA_DIR"
fi

exit 0

Make scripts executable:

sudo chmod +x /etc/pgpool2/failover.sh
sudo chmod +x /etc/pgpool2/follow_master.sh
sudo chown postgres:postgres /etc/pgpool2/*.sh

Step 5: Configuring Authentication and Security

Set up password authentication for Pgpool-II:

# Generate encrypted passwords
sudo -u postgres pg_md5 -m -u pgpool pgpool_password

# Create pool_passwd file
sudo nano /etc/pgpool2/pool_passwd
pgpool:md5d6a35858d61d85e4a82ab1fb044aba9d5
replicator:md5f7c3bc1d808e04732adf679965ccc34ca

Configure pool_hba.conf for client authentication:

sudo nano /etc/pgpool2/pool_hba.conf
# Pool HBA configuration
local   all         all                               trust
host    all         all         127.0.0.1/32          md5
host    all         all         192.168.1.0/24        md5
host    all         all         ::1/128               md5

Step 6: Testing High Availability Setup

Start Pgpool-II and test the configuration:

# Start Pgpool-II
sudo systemctl start pgpool2
sudo systemctl enable pgpool2

# Check Pgpool-II status
sudo -u postgres psql -h localhost -p 9999 -U pgpool -d postgres -c "SHOW POOL_NODES;"

Test failover functionality:

# Simulate primary failure
sudo systemctl stop postgresql

# Check automatic failover
sudo -u postgres psql -h localhost -p 9999 -U pgpool -d postgres -c "SHOW POOL_NODES;"

# Verify new primary is accepting writes
sudo -u postgres psql -h localhost -p 9999 -U pgpool -d postgres -c "CREATE TABLE test_failover(id serial, data text);"

Step 7: Monitoring and Maintenance

Implement comprehensive monitoring for your high availability setup:

# Create monitoring script
sudo nano /usr/local/bin/pgpool_monitor.sh
#!/bin/bash
# PostgreSQL and Pgpool-II monitoring script

LOG_FILE="/var/log/pgpool_monitor.log"
EMAIL="admin@yourdomain.com"

# Check Pgpool-II status
if ! systemctl is-active --quiet pgpool2; then
    echo "$(date): Pgpool-II is not running" >> $LOG_FILE
    echo "Pgpool-II service is down" | mail -s "Database Alert" $EMAIL
fi

# Check PostgreSQL nodes
for node in 192.168.1.10 192.168.1.11; do
    if ! pg_isready -h $node -p 5432 -q; then
        echo "$(date): PostgreSQL node $node is not responding" >> $LOG_FILE
        echo "PostgreSQL node $node is down" | mail -s "Database Alert" $EMAIL
    fi
done

# Check replication lag
LAG=$(sudo -u postgres psql -h localhost -p 9999 -U pgpool -d postgres -t -c "SELECT EXTRACT(EPOCH FROM (now() - pg_last_xact_replay_timestamp()));" 2>/dev/null)

if [ ! -z "$LAG" ] && [ $(echo "$LAG > 300" | bc) -eq 1 ]; then
    echo "$(date): Replication lag is $LAG seconds" >> $LOG_FILE
    echo "High replication lag detected: $LAG seconds" | mail -s "Database Alert" $EMAIL
fi

Set up automated monitoring:

sudo chmod +x /usr/local/bin/pgpool_monitor.sh
sudo crontab -e
# Monitor PostgreSQL HA setup every 5 minutes
*/5 * * * * /usr/local/bin/pgpool_monitor.sh

Performance Optimization and Best Practices

Optimize your PostgreSQL 16 and Pgpool-II configuration for maximum performance:

-- PostgreSQL performance tuning
ALTER SYSTEM SET shared_buffers = '25% of RAM';
ALTER SYSTEM SET effective_cache_size = '75% of RAM';
ALTER SYSTEM SET maintenance_work_mem = '256MB';
ALTER SYSTEM SET checkpoint_completion_target = 0.9;
ALTER SYSTEM SET wal_buffers = '16MB';
ALTER SYSTEM SET default_statistics_target = 100;
SELECT pg_reload_conf();

Configure Pgpool-II connection pooling:

# Connection pooling optimization
num_init_children = 32
max_pool = 4
child_life_time = 300
child_max_connections = 0
connection_life_time = 0
client_idle_limit = 0

Troubleshooting Common Issues

Address frequent high availability setup problems:

Split-brain scenarios:

# Check for split-brain condition
sudo -u postgres psql -h 192.168.1.10 -p 5432 -c "SELECT pg_is_in_recovery();"
sudo -u postgres psql -h 192.168.1.11 -p 5432 -c "SELECT pg_is_in_recovery();"

Replication lag issues:

-- Monitor replication status
SELECT client_addr, state, sent_lsn, write_lsn, flush_lsn, replay_lsn,
       write_lag, flush_lag, replay_lag
FROM pg_stat_replication;

Connection pool exhaustion:

# Monitor active connections
sudo -u postgres psql -h localhost -p 9999 -U pgpool -d postgres -c "SHOW POOL_PROCESSES;"

This comprehensive PostgreSQL 16 high availability setup with Pgpool-II provides robust failover capabilities, load balancing, and connection pooling for production environments. Regular monitoring, testing, and maintenance ensure optimal performance and reliability of your database infrastructure.

The implementation covers all essential aspects from initial configuration to advanced monitoring, enabling you to deploy a production-ready high availability PostgreSQL cluster that can handle enterprise workloads with minimal downtime and automatic recovery capabilities.

 

Creating High Availability with PostgreSQL 14 Stream Replication: A Step-by-Step Guide

 

Implementing High Availability in PostgreSQL: A Step-By-Step Guide to Setting Up Streaming Replication

 

How to implement connection pooling in PostgreSQL?

About MinervaDB Corporation 72 Articles
A boutique private-label enterprise-class MySQL, MariaDB, MyRocks, PostgreSQL and ClickHouse consulting, 24*7 consultative support and remote DBA services company with core expertise in performance, scalability and high availability. Our consultants have several years of experience in architecting and building web-scale database infrastructure operations for internet properties from diversified verticals like CDN, Mobile Advertising Networks, E-Commerce, Social Media Applications, SaaS, Gaming and Digital Payment Solutions. Our globally distributed team working on multiple timezones guarantee 24*7 Consulting, Support and Remote DBA Services delivery for MySQL, MariaDB, MyRocks, PostgreSQL and ClickHouse.

Be the first to comment

Leave a Reply