PostgreSQL is, alongside MySQL, one of the most important open source databases in the RDBMS world. In this blog we cover a 3 node PostgreSQL replication setup on CentOS 7. The majority of the steps are the same on Ubuntu as well; only the yum based steps have to be translated to their apt-get equivalents, while the database level configuration and steps remain identical.
Our goal in this documentation is to enable you to set up a 3 node PostgreSQL cluster using streaming replication.

Create PostgreSQL Repository
Create the repository for the version of Postgres you would like to install. The majority of the procedure remains the same across versions, with a few additional steps in later releases, but the fundamentals of the setup do not change.
cat > /etc/yum.repos.d/pgdg11.repo << 'EOF'
[pgdg11]
name=PostgreSQL 11 $releasever - $basearch
baseurl=https://download.postgresql.org/pub/repos/yum/11/redhat/rhel-$releasever-$basearch
enabled=1
gpgcheck=0
EOF
cat > /etc/yum.repos.d/pgdg12.repo << 'EOF'
[pgdg12]
name=PostgreSQL 12 $releasever - $basearch
baseurl=https://download.postgresql.org/pub/repos/yum/12/redhat/rhel-$releasever-$basearch
enabled=1
gpgcheck=0
EOF
cat > /etc/yum.repos.d/pgdg13.repo << 'EOF'
[pgdg13]
name=PostgreSQL 13 $releasever - $basearch
baseurl=https://download.postgresql.org/pub/repos/yum/13/redhat/rhel-$releasever-$basearch
enabled=1
gpgcheck=0
EOF
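Once the repo files are in place, it is worth confirming that yum can actually see them before installing anything. A quick sanity check (assuming all three repo files above were created):

```shell
# List the pgdg repos yum now knows about
yum repolist enabled | grep -i pgdg

# Show which PostgreSQL server packages the pgdg repos offer
yum --disablerepo='*' --enablerepo='pgdg11,pgdg12,pgdg13' list available 'postgresql*-server'
```

If a repo does not appear here, re-check the file name and baseurl before continuing.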
Install Postgres
The command below installs the latest stable version from the repository you configured. If you added all three repos, specify the version of Postgres you would like to install (here we use PostgreSQL 11, which the rest of this post assumes).
yum -y install https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm
yum -y install postgresql11-server
Configure Data Directory
Create the data and log directories, and change their ownership to the postgres user
mkdir -p /var/lib/pgdata/data /var/lib/pgdata/logs
chown -R postgres:postgres /var/lib/pgdata/data
chown -R postgres:postgres /var/lib/pgdata/logs
Add the data directory information to the postgres service file
vim /etc/systemd/system/postgresql-11.service
.include /usr/lib/systemd/system/postgresql-11.service
# Location of database directory
[Service]
Environment=PGDATA=/var/lib/pgdata/data
The modified settings won’t take effect until we reload the systemd unit files after making the data directory changes
systemctl daemon-reload
Database Initializing
The steps below initialize the database cluster; this is when the system tables, files and metadata get created.
/usr/pgsql-11/bin/postgresql-11-setup initdb
systemctl enable postgresql-11.service
systemctl start postgresql-11
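If initdb and the service start succeeded, the cluster should answer queries locally. A quick check (paths assume the PostgreSQL 11 packages installed above):

```shell
# Confirm the service is running
systemctl status postgresql-11 --no-pager

# Confirm the server answers queries as the postgres OS user
sudo -u postgres /usr/pgsql-11/bin/psql -c 'select version();'
```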
Postgres.conf Configuration
There are basically two important configuration files in PostgreSQL: (a) postgresql.conf (b) pg_hba.conf
postgresql.conf – contains the memory, replication, logging and database server level configuration settings
pg_hba.conf – the config file used for providing Host Based Access (client authentication) in Postgres
postgresql.conf
listen_addresses = '*'
max_connections = 500
shared_buffers = 4GB #### should be less than 25% of total memory
effective_cache_size = 3GB #### around 50% of total memory is a normal setting
wal_keep_segments = 50
log_rotation_age = 30d
log_directory = '/var/lib/pgdata/logs'
archive_mode = on
archive_command = 'test ! -f /var/lib/pgdata/archive/%f && cp %p /var/lib/pgdata/archive/%f'
Create the archive directory (/var/lib/pgdata/archive) owned by the postgres user before enabling archiving; WAL archives should go to a dedicated directory, not inside pg_wal, where PostgreSQL keeps its own bookkeeping files.
NOTE : Restart or reload the Postgres service for the config changes done in postgresql.conf to take effect. Some settings are dynamic and a simple service reload is sufficient to adopt the latest values, however other settings are static and require a full PostgreSQL service restart.
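The 25%/50% sizing rules above can be computed on the host itself rather than guessed. A minimal sketch (Linux only) that reads total memory from /proc/meminfo and prints suggested values; round the results to taste:

```shell
# Read total RAM in kB from /proc/meminfo
total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)

# shared_buffers: roughly 25% of RAM; effective_cache_size: roughly 50%
shared_mb=$(( total_kb / 4 / 1024 ))
cache_mb=$(( total_kb / 2 / 1024 ))

echo "shared_buffers = ${shared_mb}MB"
echo "effective_cache_size = ${cache_mb}MB"
```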
Replication Configuration
This is the most important part of the setup; if it is not done correctly, all the nodes will remain standalone rather than forming a 3 node architecture. Let’s split the configuration into two parts
- Master side configuration
- Slave side configuration
Master side configuration
Create a database user that will be used for replication, i.e. for sending and receiving the WAL logs
create role replication_user with replication password '*giveastrongpassword*' login;
or
create user replication_user with replication password '*giveastrongpassword*';
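Either statement creates the same role. You can confirm the replication attribute was actually granted with a catalog query, run as a superuser on the master:

```shell
# rolreplication and rolcanlogin should both be true
sudo -u postgres psql -c "select rolname, rolreplication, rolcanlogin from pg_roles where rolname = 'replication_user';"
```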
Update the pg_hba.conf file to allow the slave servers to connect to the master node. If this is not done correctly, the slaves won’t be able to connect to the primary and start replication.
pg_hba.conf settings
host replication replication_user 192.72.0.3/32 md5
host replication replication_user 192.72.0.4/32 md5
host replication replication_user 192.72.0.5/32 md5
## Reload PostgreSQL Daemon after the config changes.
service postgresql-11 reload
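On PostgreSQL 10 and later you can verify that the reload actually picked up the new pg_hba.conf entries, and spot any syntax errors, via the pg_hba_file_rules view:

```shell
# Any parse problems show up in the "error" column
sudo -u postgres psql -c "select line_number, type, database, user_name, address, auth_method, error from pg_hba_file_rules;"
```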
Slave side configuration
On both Standby 1 and Standby 2, perform the following operations to clean the existing standalone data and sync it from the master from scratch. We will use the pg_basebackup utility to sync the data.
systemctl stop postgresql-11
cd /var/lib/pgdata/data/
rm -rf *
pg_basebackup --wal-method=stream -D /var/lib/pgdata/data -U replication_user -h 192.72.0.3
chown -R postgres:postgres /var/lib/pgdata/data/
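pg_basebackup leaves a complete copy of the master’s data directory. Before creating the recovery config, a quick sanity check is that the cluster files landed and the ownership change took effect:

```shell
# PG_VERSION should exist and match the major version on the master (11 here)
cat /var/lib/pgdata/data/PG_VERSION

# The directory must be owned by postgres or the service will refuse to start
ls -ld /var/lib/pgdata/data
```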
Create Recovery config
create a recovery.conf file inside the data directory as below,
vim /var/lib/pgdata/data/recovery.conf
===========
standby_mode = 'on'
primary_conninfo = 'host=192.72.0.3 port=5432 user=replication_user password=xxxxxx'
trigger_file = '/tmp/postgresql.trigger'
===========
change the ownership of the recovery config file and start the postgres service
chown -R postgres:postgres recovery.conf
systemctl start postgresql-11.service
ps -efa | grep postgres
Validate the Replication Setup
Create a test database/table on the master ( 192.72.0.3 ); if replication is working fine, you should see the database on the slave nodes.
postgres=# create database test;
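Beyond spot-checking a test database, the replication state can be inspected directly: pg_stat_replication on the master shows one row per connected standby, and pg_is_in_recovery() on a slave should return true:

```shell
# On the master: one row per attached standby, state should be 'streaming'
sudo -u postgres psql -c "select client_addr, state, sync_state from pg_stat_replication;"

# On each slave: 't' means the node is replaying WAL as a standby
sudo -u postgres psql -c "select pg_is_in_recovery();"
```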
With this we have completed setting up Postgres replication, similar to the MySQL replication we covered in another blog post. Choosing the right technology for the right dataset has become more crucial than ever before; there are good writeups that clearly explain the critical comparison between MySQL and PostgreSQL. Data grows at such a massive pace that by the time people realize they should have picked one over the other, the dataset has grown into GBs or TBs, and from that point keeping the business up and running becomes as important as migrating to the appropriate data structure. Please reach out to us @ support@genexdbs.com if you need help with improving your database systems. We support a wide range of database and data streaming software; visit genexdbs.com to learn more about us.