Step-by-step PostgreSQL 12 Streaming Replication on Ubuntu

Step-by-step Installation and Configuration of Streaming Replication in PostgreSQL 12


In this post we have explained how to implement step-by-step PostgreSQL 12 Streaming Replication on Ubuntu 20.04 (Codename: focal). PostgreSQL support several types of replication solutions for performance, scalability and high availability. One of the most commonly used and highly recommended PostgreSQL replication solution is Write-Ahead Log (WAL) Shipping. In this solution, the PostgreSQL standby node is built using file-based log shipping or streaming replication, even combination of both whenever appropriate. In a streaming replication, a standby (PostgreSQL replication slave) is tuned to connect with PostgreSQL Primary / Master Node, to stream WAL records to the standby as and when they are generated, without waiting for the WAL file to be filled. PostgreSQL streaming replication by default is asynchronous in which data is written to the standby server(s) after a transaction has been committed on the primary server. This means there are some associated risks on PostgreSQL data reliability / durability because primary / master node is not bothered about the state of data successfully replicated to standby node, So master server crashes, any uncommitted transactions may not be replicated and this can cause data loss. In this post we have also explained about PostgreSQL 12 “replication slots” for standby node to avoid the master server from recycling old WAL segments before the standby has received them. P.S: Replication slots retain only the number of segments known to be needed.

PostgreSQL Primary / Master Node and Standby Node (replication slave) used in the streaming replication

  • PostgreSQL Primary / Master Node – 167.172.114.63
  • PostgreSQL Standby Node (replication slave) – 167.172.123.158

Linux distribution in PostgreSQL Master Instance

root@PostgreSQL-Master:~# lsb_release -a 
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 20.04.1 LTS
Release:        20.04
Codename:       focal
root@PostgreSQL-Master:~#

Linux distribution in PostgreSQL Standby Instance

root@PostgreSQL-Standby:~# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 20.04.1 LTS
Release:        20.04
Codename:       focal
root@PostgreSQL-Standby:~#

PostgreSQL 12 in the Master

root@PostgreSQL-Master:~# su postgres
postgres@PostgreSQL-Master:/root$ 
postgres@PostgreSQL-Master:/root$ 
postgres@PostgreSQL-Master:/root$ psql
psql (12.4 (Ubuntu 12.4-1.pgdg20.04+1))
Type "help" for help.
 
postgres=# show server_version;
          server_version          
----------------------------------
 12.4 (Ubuntu 12.4-1.pgdg20.04+1)
(1 row)
 
postgres=#

PostgreSQL 12 in the Standby

root@PostgreSQL-Standby:~# su postgres
postgres@PostgreSQL-Standby:/root$ 
postgres@PostgreSQL-Standby:/root$ psql
psql (12.4 (Ubuntu 12.4-1.pgdg20.04+1))
Type "help" for help.

postgres=# show server_version;
          server_version          
----------------------------------
 12.4 (Ubuntu 12.4-1.pgdg20.04+1)
(1 row)

postgres=#

Configure the PostgreSQL Primary / Master Node

On the PostgreSQL Primary / Master Node configure the IP address(es) on which the master server will listen to for connections from clients.

postgres=# ALTER SYSTEM SET listen_addresses TO '*';
ALTER SYSTEM

Create a Replication user in PostgreSQL Master

Create a user called replicator to facilitate connections between PostgreSQL primary and standby node using the createuser utility program:

postgres@PostgreSQL-Master:~$ createuser --replication -P -e replicator
Enter password for new role: 
Enter it again: 
SELECT pg_catalog.set_config('search_path', '', false);
CREATE ROLE replicator PASSWORD 'md57a53732cdbab04edef940553c841dc81' NOSUPERUSER NOCREATEDB NOCREATEROLE INHERIT LOGIN REPLICATION;
postgres@PostgreSQL-Master:~$

Edit pg_hba.conf file

After creating the replication user let’s configure the user to access our standby node by editing and configuring rules in pg_hba.conf file:

postgres@PostgreSQL-Master:/root$ cd /etc/postgresql/12/main/
postgres@PostgreSQL-Master:/etc/postgresql/12/main$ 
postgres@PostgreSQL-Master:/etc/postgresql/12/main$ vi pg_hba.conf

# Allow replication connections from localhost, by a user with the
# replication privilege.
local   replication     all                                     peer
host    replication     all             127.0.0.1/32            md5
host    replication     all             ::1/128                 md5
host    replication     replicator      167.172.123.158/0       md5

Restart or reload PostgreSQL 12 in the Master / Primary Node for the changes to take effect by running:

root@PostgreSQL-Master:~# 
root@PostgreSQL-Master:~# systemctl reload postgresql.service
root@PostgreSQL-Master:~#

Backup standby

Backup PostgreSQL in the standby database node before deleting the data:

root@PostgreSQL-Standby:~# sudo systemctl stop postgresql.service
root@PostgreSQL-Standby:~# 
root@PostgreSQL-Standby:~# sudo systemctl status postgresql.service
● postgresql.service - PostgreSQL RDBMS
     Loaded: loaded (/lib/systemd/system/postgresql.service; enabled; vendor preset: enabled)
     Active: inactive (dead) since Sun 2020-10-11 18:36:57 UTC; 14s ago
   Main PID: 4071 (code=exited, status=0/SUCCESS)
 
Oct 11 11:13:32 PostgreSQL-Standby systemd[1]: Starting PostgreSQL RDBMS...
Oct 11 11:13:32 PostgreSQL-Standby systemd[1]: Finished PostgreSQL RDBMS.
Oct 11 18:36:57 PostgreSQL-Standby systemd[1]: postgresql.service: Succeeded.
Oct 11 18:36:57 PostgreSQL-Standby systemd[1]: Stopped PostgreSQL RDBMS.
root@PostgreSQL-Standby:~# 
root@PostgreSQL-Standby:~# 
root@PostgreSQL-Standby:~# cp -R /var/lib/postgresql/12/main /var/lib/postgresql/12/stand-by-bkp
root@PostgreSQL-Standby:~# rm -rf /var/lib/postgresql/12/main/*
root@PostgreSQL-Standby:~#

Backup from primary to standby

Backup and restore the primary database on to standby node using pg_basebackup utility tool:

postgres@PostgreSQL-Standby:~$ pg_basebackup -h 167.172.114.63 -D /var/lib/postgresql/12/main -U replicator -P -v -R -X stream -C -S node2
Password: 
pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 0/2000028 on timeline 1
pg_basebackup: starting background WAL receiver
pg_basebackup: created replication slot "node2"
24554/24554 kB (100%), 1/1 tablespace                                         
pg_basebackup: write-ahead log end point: 0/2000100
pg_basebackup: waiting for background process to finish streaming ...
pg_basebackup: syncing data to disk ...
pg_basebackup: base backup completed
postgres@PostgreSQL-Standby:~$

-h  – Represents the node that will be backed up. In our case this will be our primary node.

-D – The data directory where the backup will be transferred to.

-U – Connection User

-P – Turns on progress reporting.

-v – Enables verbose mode

-R – Creates a standby.signal file and append connection settings to postgresql.auto.conf

-X – includes the required write-ahead log files in the backup

-C – enables the creation of a replication slot named by the -S option before starting the backup.

-S – specifies the replication slot name.

Monitor Replication Slot from Primary / Master Node

Confirm replication slot running from primary / master node, You can check the slot_name node2 in result of the query:

Expanded display is on.
postgres=# SELECT * FROM pg_replication_slots;
-[ RECORD 1 ]-------+----------
slot_name           | node2
plugin              | 
slot_type           | physical
datoid              | 
database            | 
temporary           | f
active              | t
active_pid          | 25889
xmin                | 
catalog_xmin        | 
restart_lsn         | 0/3000060
confirmed_flush_lsn | 
 
postgres=# create database mdb1;
CREATE DATABASE
postgres=# SELECT * FROM pg_replication_slots;
-[ RECORD 1 ]-------+----------
slot_name           | node2
plugin              | 
slot_type           | physical
datoid              | 
database            | 
temporary           | f
active              | t
active_pid          | 25889
xmin                | 
catalog_xmin        | 
restart_lsn         | 0/3000990
confirmed_flush_lsn | 
 
postgres=#

Start PostgreSQL instance in the standby node:

root@PostgreSQL-Standby:~# sudo systemctl start postgresql

Validate Replication happening between Primary Node / Master and Standby / Slave:

postgres=# SELECT * FROM pg_stat_wal_receiver;
-[ RECORD 1 ]---------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
pid                   | 24662
status                | streaming
receive_start_lsn     | 0/3000000
receive_start_tli     | 1
received_lsn          | 0/3000060
received_tli          | 1
last_msg_send_time    | 2020-10-12 18:55:59.900604+00
last_msg_receipt_time | 2020-10-12 18:55:59.901066+00
latest_end_lsn        | 0/3000060
latest_end_time       | 2020-10-12 18:55:29.888351+00
slot_name             | node2
sender_host           | 167.172.114.63
sender_port           | 5432
conninfo              | user=replicator password=******** channel_binding=prefer dbname=replication host=167.172.114.63 port=5432 fallback_application_name=12/main sslmode=prefer sslcompression=0 ssl_min_protocol_version=TLSv1.2 gssencmode=prefer krbsrvname=postgres target_session_attrs=any
 
postgres=# SELECT * FROM pg_stat_wal_receiver;
-[ RECORD 1 ]---------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
pid                   | 24662
status                | streaming
receive_start_lsn     | 0/3000000
receive_start_tli     | 1
received_lsn          | 0/3000990
received_tli          | 1
last_msg_send_time    | 2020-10-12 18:56:41.768859+00
last_msg_receipt_time | 2020-10-12 18:56:41.76974+00
latest_end_lsn        | 0/3000990
latest_end_time       | 2020-10-12 18:56:41.768859+00
slot_name             | node2
sender_host           | 167.172.114.63
sender_port           | 5432
conninfo              | user=replicator password=******** channel_binding=prefer dbname=replication host=167.172.114.63 port=5432 fallback_application_name=12/main sslmode=prefer sslcompression=0 ssl_min_protocol_version=TLSv1.2 gssencmode=prefer krbsrvname=postgres target_session_attrs=any

To monitor how much is lag behind the slots:

postgres=# SELECT redo_lsn, slot_name,restart_lsn, 
postgres-# round((redo_lsn-restart_lsn) / 1024 / 1024 / 1024, 2) AS GB_behind
postgres-# FROM pg_control_checkpoint(), pg_replication_slots;
-[ RECORD 1 ]----------
redo_lsn    | 0/30009C8
slot_name   | node2
restart_lsn | 0/3000AB0
gb_behind   | 0.00
 
postgres=#

Conclusion 

PostgreSQL powers today several mission critical database infrastructure operations so building highly available, reliable and fault-tolerant PostgreSQL Operations is one the core responsibility of DBAs / Database Reliability Engineers.

Business FunctionContact
☎ CONTACT GLOBAL SALES (24*7)📞 (844) 588-7287 (USA)
📞 (415) 212-6625 (USA)
📞 (778) 770-5251 (Canada)
☎ TOLL FREE PHONE (24*7)📞 (844) 588-7287
🚩 MINERVADB FAX+1 (209) 314-2364
📨 MinervaDB Email - General / Sales / Consultingcontact@minervadb.com
📨 MinervaDB Email - Supportsupport@minervadb.com
📨 MinervaDB Email -Remote DBAremotedba@minervadb.com
📨 Shiv Iyer Email - Founder and Principalshiv@minervadb.com
🏠 CORPORATE ADDRESS: CALIFORNIAMinervaDB Inc.
440 N BARRANCA AVE #9718 COVINA,
CA 91723
🏠 CORPORATE ADDRESS: DELAWAREMinervaDB Inc.,
PO Box 2093 PHILADELPHIA PIKE #3339
CLAYMONT, DE 19703
🏠 CORPORATE ADDRESS: HOUSTONMinervaDB Inc., 1321 Upland Dr. PMB 19322, Houston,
TX, 77043, US
About Shiv Iyer 500 Articles
Open Source Database Systems Engineer with a deep understanding of Optimizer Internals, Performance Engineering, Scalability and Data SRE. Shiv currently is the Founder, Investor, Board Member and CEO of multiple Database Systems Infrastructure Operations companies in the Transaction Processing Computing and ColumnStores ecosystem. He is also a frequent speaker in open source software conferences globally.