PostgreSQL DBA part 6 :-PostgreSQL high availability setup using EFM

PostgreSQL high availability setup using EFM

 

A.Introduction

This document provides you the overview of High Availability Setup for EDB Postgres 9.5 database.

An EDB Postgres Failover Manager (EFM) cluster is comprised of Failover Manager Processes that reside on the following hosts on a network:

  1. A Master node – The Master node is the primary database server that is servicing database clients.
  2. One or more Standby nodes – A Standby node is a streaming replication server associated with the Master node.
  3. Witness Node (optional)- The Witness node confirms assertions of either the Master or a Standby in a fail over scenario. A cluster does not need a dedicated witness node if the cluster contains three or more nodes; if you do not have a third cluster member that is a database host, you can add a dedicated Witness node.

 

B.Architecture(For best practice)

The Architecture for this environment shown in the below picture. It has 2 application nodes (VM1 &VM2) in primary site which are connected to VIP of EDB .  Application nodes are running on JBOSS 6.4 application server and both of the servers are running in standalone mode, pointing to the same metadata deployed at the same time (Which is the VIP : 172.XX.XX.5:5444/edb).

For EDB it has two nodes (VM3 and VM4) as active and standby mode , which is pointed by VIP. VM3 and VM4 are connected in Sync mode. For DR site , two DB nodes are present which are VM5 and VM6 . Both VM5 and VM6 are connected in ASYNC mode with main DB node on primary site.

The below configuration is made on EDB Test instance and will consist of 1 Master server and 1 standby server. The standby will be in sync with the primary to ensure no data loss happens during failover.

But for our test case,We have simplified installation where We have one site in primary and read replica or standby site in secondary.The whole test was done in AWS.

Below are the host details where the installation will be made:

NODE IP ROLE
edb1tst 172.31.20.93 Primary
edb2tst 172.31.20.82 Standby

C.Install PostgreSQL in both primary and standby nodes using below :-

https://clouddba.co/postgresql-dba-part1-installation-of-postgresql-9-6-in-linux/

D.Install EFM 2.1 RPM which is package provided by EDB for HA setup

[root@edb1tst home]# rpm -ivh ./edb_tools_redhat_rhel-6-x86_64/efm21-2.1.1-1.rhel6.x86_64.rpm

warning: ./edb_tools_redhat_rhel-6-x86_64/efm21-2.1.1-1.rhel6.x86_64.rpm: Header V4 RSA/SHA1 Signature, key ID 7e30651c: NOKEY

Preparing...                ########################################### [100%]

1:efm21                  ########################################### [100%]

E.Modify the /opt/edb/as9.6/data/pg_hba.conf  to include all the entries of Primary and standby nodes:

For Primary

# "local" is for Unix domain socket connections only
host all all 0.0.0.0/0 trust
local all all trust
# IPv4 local connections:
host edb all 172.31.20.93/32 trust
# IPv6 local connections:
##host all all ::1/128 trust
# Allow replication connections from localhost, by a user with the
# replication privilege.
local replication enterprisedb trust
host prod all 172.31.20.93/32 trust
host prod all 172.31.20.82/32 trust
host replication enterprisedb 172.31.20.93/32 trust
host replication enterprisedb 172.31.20.82/32 trust

For Standby

# "local" is for Unix domain socket connections only
host all all 0.0.0.0/0 trust
local all all trust
# IPv4 local connections:
host edb all 172.31.20.82/32 trust
# IPv6 local connections:
##host all all ::1/128 trust
# Allow replication connections from localhost, by a user with the
# replication privilege.
local replication enterprisedb trust
host prod all 172.31.20.93/32 trust
host prod all 172.31.20.82/32 trust
host replication enterprisedb 172.31.20.93/32 trust
host replication enterprisedb 172.31.20.82/32 trust

Please restart DB

pg_ctl -D /opt/edb/as9.6/data stop
pg_ctl -D /opt/edb/as9.6/data start
pg_ctl -D /opt/edb/as9.6/data status

F.Check and turn off the FIREWALL on the servers (both Primary and Standby)

 

G.Encrypt efm password on both nodes:For my case username is enterprisedb and password is enterprisedb

/usr/efm-2.1/bin/efm encrypt efm
This utility will generate an encrypted password for you to place in your
EFM cluster property file.
Please enter the password and hit enter:
Please enter the password again to confirm:
The encrypted password is: 2ae8d8d6fbfb742863990dc444e9434f
Please paste this into your cluster properties file.
        db.password.encrypted=2ae8d8d6fbfb742863990dc444e9434f

Troubleshoot if in case it fails:-

### Error encoutered

[root@edb2prod home]# efm encrypt efm
This utility will generate an encrypted password for you to place in your
EFM cluster property file.
Please enter the password and hit enter:
Please enter the password again to confirm:
Could not encode password: Illegal key size or default parameters
If full-strength encryption has not been installed in the current version of Java, search for “<vendor name> java full strength encryption” for more information.
### Solution
For this download “Java Cryptography Extension (JCE) Unlimited Strength Jurisdiction Policy Files 8” for JDK 1.8 from below link and copy that to /usr/java/jdk1.8.0_121/jre/lib/security location:
http://www.oracle.com/technetwork/java/javase/downloads/jce8-download-2133166.html
[root@edb2test security]# ll *jar*
-rw-r--r--. 1 root root 3035 Dec 20  2013 local_policy.jar
-rw-r--r--. 1 root root 3405 Apr 11 15:47 local_policy.jar_bkp
-rw-r--r--. 1 root root 3023 Dec 20  2013 US_export_policy.jar
-rw-r--r--. 1 root root 2920 Apr 11 15:47 US_export_policy.jar_bkp

[root@edb2test security]# efm encrypt efm
This utility will generate an encrypted password for you to place in your
EFM cluster property file.
Please enter the password and hit enter:
Please enter the password again to confirm:
The encrypted password is: 2ae8d8d6fbfb742863990dc444e9434f
Please paste this into your cluster properties file.
        db.password.encrypted=2ae8d8d6fbfb742863990dc444e9434f
H.STEP-BY-STEP IMPLEMENTATION PROCEDURE

Steps in PRIMARY server:

1. Add the below contents to pgpass

su – enterprisedb -c “echo \”*:*:*:enterprisedb:enterprisedb\” > ~/.pgpass”
su – enterprisedb -c “chmod 600 ~/.pgpass”

2.Set permissions and ownerships

 -bash-4.1$ chown enterprisedb:enterprisedb data

-bash-4.1$ chmod -R 700 data

3.Check the status of pg_ctl process as below:

 

-bash-4.1$ /opt/edb/as9.6/bin/pg_ctl -D /opt/edb/as9.6/data status

pg_ctl: server is running (PID: 683)

/opt/edb/as9.6/bin/edb-postgres "-D" "/opt/edb/as9.6/data"

NOTE: Here in this installation we already had a single node instance running on the system, and application connectivity was established. Since we’re only changing the single installation to HA hence we just checked the above status.

If this is a fresh installation then execute the below command:

 [root@rac4 as9.6]# su - enterprisedb -c "/opt/edb/as9.6/bin/initdb -D /opt/edb/as9.6/data"
 And then check the status:

/opt/edb/as9.6/bin/pg_ctl -D /opt/edb/as9.6/data start

4. Modify the efm.nodes to include the standby server entries:

 5.Modify the bash profile as below and run encrypt the EFM password as below:

6.Export the below variables on the terminal as root user and set the password for enterprisedb user

 

export YUM_USER=XXXXX
export YUM_PASSWORD=test#123
export MASTER_IP=172.31.20.93
export STDBY_IP=172.31.20.82
export YOUR_EMAIL=XX.XX@gmail.com
export PWD_ENCRYPT=2ae8d8d6fbfb742863990dc444e9434f
export PGDATA=/opt/edb/as9.6/data

 

su - enterprisedb -c "/opt/edb/as9.6/bin/psql -p 5444 -d edb -c \"alter user enterprisedb identified by enterprisedb\""

 7.Add the below entries at the end of conf file

vi /opt/edb/as9.6/data/postgresql.conf

 

wal_level = hot_standby
max_wal_senders = 3
hot_standby = on
archive_mode = on
archive_command = 'cd .'
logging_collector = on    (Check is it already on or not)

8.Restart the database again

pg_ctl -D /opt/edb/as9.6/data stop
pg_ctl -D /opt/edb/as9.6/data start
pg_ctl -D /opt/edb/as9.6/data status

9.Add the following entries in efm.properties file:

 

sed -i "s/db.user=/db.user=enterprisedb/" /etc/efm-2.1/efm.properties
sed -i "s/db.password.encrypted=/db.password.encrypted=$PWD_ENCRYPT/" /etc/efm-2.1/efm.properties
sed -i "s/db.port=/db.port=5444/" /etc/efm-2.1/efm.properties
sed -i "s/db.database=/db.database=edb/" /etc/efm-2.1/efm.properties
sed -i "s/admin.port=/admin.port=7809/" /etc/efm-2.1/efm.properties
sed -i "s/user.email=/user.email=$YOUR_EMAIL/" /etc/efm-2.1/efm.properties
sed -i "s/bind.address=/bind.address=$MASTER_IP:7800/" /etc/efm-2.1/efm.properties
sed -i "s/is.witness=/is.witness=false/" /etc/efm-2.1/efm.properties
sed -i "s/db.service.owner=/db.service.owner=enterprisedb/" /etc/efm-2.1/efm.properties
sed -i "s;db.recovery.conf.dir=;db.recovery.conf.dir=/opt/edb/as9.6/data;g" /etc/efm-2.1/efm.properties
sed -i "s/db.bin=/db.bin=\/opt\/edb\/as9.6\/bin/" /etc/efm-2.1/efm.properties
sed -i "s/auto.allow.hosts=false/auto.allow.hosts=true/" /etc/efm-2.1/efm.properties

 10.Execute the below command to check whether we’re getting the edb prompt:

 

NOTE: We received the above highlighted error during starting the services. For this edit the pingServerIp property in efm.properties file to the local hostname: pingServerIp=172.19.18.95 (this is because the server is not having internet connectivity presently)

11.Restart the pg_ctl services

pg_ctl -D /opt/edb/as9.6/data stop
pg_ctl -D /opt/edb/as9.6/data start
pg_ctl -D /opt/edb/as9.6/data status

NOTE: We have got errors in the mentioned file:

Resolution:

vi “/opt/edb/as9.6/data/postgresql.conf”

logging_collector = on    #(Check is it already on or not) (**comment it as highlighted)

Steps in STANDBY server:

 

1.Export the variables below in terminal as root user:

export YUM_USER=XXXXX
export YUM_PASSWORD=test#123
export MASTER_IP=172.31.20.93
export STDBY_IP=172.31.20.82
export YOUR_EMAIL=XX.XX@gmail.com
export PWD_ENCRYPT=2ae8d8d6fbfb742863990dc444e9434f
export PGDATA=/opt/edb/as9.6/data

 2.Modify the ownership and permission same as primary

 su - enterprisedb -c "echo \"*:*:*:enterprisedb:enterprisedb\" > ~/.pgpass"
 su - enterprisedb -c "chmod 600 ~/.pgpass"

 3.Take backup of data directory if present earlier:

Then execute the below commands:

[root@edb2tst as9.6]# mv data_bkp data

[root@edb2tst as9.6]# mkdir data

[root@edb2tst as9.6]# chown enterprisedb:enterprisedb data

[root@edb2tst as9.6]# chmod 0700 data

[root@edb2tst as9.6]# su - enterprisedb -c "/opt/edb/as9.6/bin/pg_basebackup -h 172.31.20.93 -p 5444 -R -D /opt/edb/as9.6/data"

NOTICE:  pg_stop_backup complete, all required WAL segments have been archived

4.Start Server

[root@edb2tst data]# su – enterprisedb -c ” /opt/edb/as9.6/bin/pg_ctl -w -D /opt/edb/as9.6/data start”

5.Add the following entries in conf file:

[root@edb2tst data]# echo trigger_file=\'/opt/edb/as9.6/trigger.file\' >> /opt/edb/as9.6/recovery.conf

[root@edb2tst data]# echo recovery_target_timeline='latest' >>/opt/edb/as9.6/recovery.conf

 6.copy efm.proporties file from node 1(primary) to standby

 7.Edit the bind.address to the below entry on standby

bind.address=172.31.20.82:7800

Add the below entry to efm.nodes file

172.31.20.93:7800

I.Final steps to start EFM and check .These steps needs to be done both nodes

 

1.Edit both the efm.nodes and put the IP of all nodes in the cluster:

In primary

vi /etc/efm-2.1/efm.nodes

172.31.20.93:7800
172.31.20.82:7800

 In Standby

vi /etc/efm-2.1/efm.nodes

172.31.20.93:7800
172.31.20.82:7800

2.Stop/Start both the DB and efm service

su - enterprisedb -c "/opt/edb/as9.6/bin/pg_ctl -D /opt/edb/as9.6/data  stop"

su - enterprisedb -c "/opt/edb/as9.6/bin/pg_ctl -D /opt/edb/as9.6/data  stop"

3.Stop and start EFM services

service efm-2.1 stop

service efm-2.1 start
4.Now you can check cluster status
[root@ip-172-31-20-93 ~]# /usr/efm-2.1/bin/efm cluster-status efm
Cluster Status: efm
VIP:

Agent Type Address Agent DB Info
--------------------------------------------------------------
Standby 172.31.20.82 UP UP
Master 172.31.20.93 UP UP

Allowed node host list:
172.31.20.93 172.31.20.82

Membership coordinator: 172.31.20.82

Standby priority host list:
172.31.20.82

Promote Status:

DB Type Address XLog Loc Info
--------------------------------------------------------------
Master 172.31.20.93 0/700DDE0
Standby 172.31.20.82 0/700DDE0

Standby database(s) in sync with master. It is safe to promote.

5.How to establish SYNC 

For establishing SYNCHRONOUS (ie SYNC) connectivity :

Modify the below parameter in the below file:

In primary:

-bash-4.1$ vi postgresql.conf

-bash-4.1$ pwd

/opt/edb/as9.6/data 

 

In standby:

-bash-4.1$ vi recovery.conf

standby_mode = 'on'

primary_conninfo = 'user=enterprisedb password=enterprisedb host=172.19.18.95 port=5444 sslmode=prefer sslcompression=1 krbsrvname=postgres application_name = eocapp'

trigger_file='/opt/edb/as9.6/data/trigger.file' recovery_target_timeline=latest