PostgreSQL is another most important open source database besides MySQL in RDBMS world. In this blog we have tried to cover a 3 node PostgreSQL replication setup in Centos 7. Majority of the steps would just remain the same for Ubuntu as well except for the yum based steps that has to be converted appropriately to apt-get based statements and rest of the database level configuration and steps would remain the same.

Our goal from this documentation is to enable you to setup a 3 node PostgreSQL cluster using streaming replication.

Create PostgreSQL Repository

Create the repository for the version of postgres you would like to install the majority of the procedure would remain the same with few additional steps in later versions however the fundamentals for the setup remains the same.

cat > /etc/yum.repos.d/pgdg11.repo << 'EOF'
name=PostgreSQL 11 $releasever - $basearch
cat > /etc/yum.repos.d/pgdg12.repo << 'EOF'
name=PostgreSQL 12 $releasever - $basearch
cat > /etc/yum.repos.d/pgdg13.repo << 'EOF'
name=PostgreSQL 13 $releasever - $basearch

Install Postgres

Below command would install the latest stable version in the repo you would configure if you added all the three repos you would have to give the version of postgres you would like to install

yum -y  install -y
yum -y install -y postgresql**-server

Configure Data Directory

Create the data directory, and change the ownership to postgres user

mkdir -p /var/lib/pgdata/data
chown -R postgres:postgres /var/lib/pgdata/data
chown -R postgres:postgres /var/lib/pgdata/logs 

Add the data directory information to the postgres service file

vim /etc/systemd/system/postgresql-11.service

.include /usr/lib/systemd/system/postgresql-11.service
# Location of database directory

The modified settings wouldn’t be effective unless we reload the service file after making the data directory changes

systemctl daemon-reload

Database Initializing

Below steps initializes the databases that is when system tables/files and meta data would get created.

/usr/pgsql-11/bin/postgresql-11-setup initdb
systemctl enable postgresql-11.service
systemctl start postgresql-11

Postgres.conf Configuration

There are basically 2 important configuration files in PostgreSQL (a) postgres.conf (b) pg_hba.conf

postgres.conf – contains the memory , replication , log based and database server level configuration settings

pg_hba.conf – is config which is used for providing Host Based Access in postgres


listen_addresses = '*'
max_connections = 500
shared_buffers = 4GB #### should be less than 25% of total memory
effective_cache_size=3GB #### should be 50% of total memory would be a normal settings 

wal_keep_segments = 50
log_rotation_age = 30d

log_directory = '/var/lib/pgdata/logs'
archive_mode = on
archive_command = 'test ! -f /var/lib/pgdata/data/pg_wal/archive_status/%f && cp %p /var/lib/pgdata/data/pg_wal/archive_status/%f'

NOTE : Restart Postgres or reload the postgres service to reflect the config changes done in Postgres.conf. Some changes are dynamic and doesn’t need postgres and a simple postgres service or demon reload is sufficient to adopt the latest settings however certain settings are static and would need PostgreSQL service restart.

Replication Configuration

This is the most important part of the setup and if not done correctly all the nodes will be standalone node and won’t be in a 3 node architecture. Lets split the configuration into two parts

  • Master side configuration
  • Slave side confirguration

Master side configuration

Create a database user that would be used for replication for sending and receiving the WAL logs

create role replication_user with replication password '*giveastrongpassword*' login;
create user replication_user with replication password '*giveastrongpassword*';

update the pg_hba.conf file to allow the slave servers to be able to connect to master node. if not done correctly slave wouldn’t be able to connect to primary and be able to start replication.

pg_hba.conf settings

host replication replication_user md5
host replication replication_user md5
host replication replication_user md5

## Reload PostgreSQL Daemon after the config changes. 

service postgresql-11 reload 

Slave side configuration

One both Standby 1 and 2 perform the following operation to clean the existing standalone data and sync it from master from scratch. We will use pg_basebackup utility for the backup process to sync the data.

systemctl stop postgresql-11
cd /var/lib/pgdata/data/
rm -rf *
pg_basebackup --wal-method=stream -D /var/lib/pgdata/data -U replication_user -h
chown -R postgres:postgres /var/lib/pgdata/data/

Create Recovery config

create a recovery file as below,

vim recovery.conf

standby_mode = 'on'
primary_conninfo = 'host= port=5432 user=replication_user password=xxxxxx'
trigger_file = '/tmp/postgresql.trigger'

change the ownership of recovery config file and restart postgres service

chown -R postgres:postgres recovery.conf
systemctl start postgresql-11.service
ps -efa | grep postgres

Validate the Replication Setup

create a test database/table on Master ( ) and you should be able to see the database in slave nodes if the replication is working fine.

postgres=# create database test;

With this we have completed setting up Postgres Replication similar to MySQL replication which we shared in other blog post . Choosing the right technology for the right dataset has become a very crucial part than ever before. There is an awesome blog that clearly explains the critical comparison between MySQL and PostgreSQL Reason being the data has been growing in such a massive pace by the time people realize they should have picked up one over the other the data gets into GBs or TBs and from that point of time keeping the business up and running becomes at important them migrating to the appropriate data structure. Please reach out to us @ if you need help with improving your database systems. We support a wide range of database and data streaming softwares visit us to learn more about us.