PostgreSQL high availability setup using EFM
A.Introduction
This document provides you the overview of High Availability Setup for EDB Postgres 9.5 database.
An EDB Postgres Failover Manager (EFM) cluster is comprised of Failover Manager Processes that reside on the following hosts on a network:
- A Master node – The Master node is the primary database server that is servicing database clients.
- One or more Standby nodes – A Standby node is a streaming replication server associated with the Master node.
- Witness Node (optional)- The Witness node confirms assertions of either the Master or a Standby in a fail over scenario. A cluster does not need a dedicated witness node if the cluster contains three or more nodes; if you do not have a third cluster member that is a database host, you can add a dedicated Witness node.
B.Architecture(For best practice)
The Architecture for this environment shown in the below picture. It has 2 application nodes (VM1 &VM2) in primary site which are connected to VIP of EDB . Application nodes are running on JBOSS 6.4 application server and both of the servers are running in standalone mode, pointing to the same metadata deployed at the same time (Which is the VIP : 172.XX.XX.5:5444/edb).
For EDB it has two nodes (VM3 and VM4) as active and standby mode , which is pointed by VIP. VM3 and VM4 are connected in Sync mode. For DR site , two DB nodes are present which are VM5 and VM6 . Both VM5 and VM6 are connected in ASYNC mode with main DB node on primary site.
The below configuration is made on EDB Test instance and will consist of 1 Master server and 1 standby server. The standby will be in sync with the primary to ensure no data loss happens during failover.
But for our test case,We have simplified installation where We have one site in primary and read replica or standby site in secondary.The whole test was done in AWS.
Below are the host details where the installation will be made:
NODE | IP | ROLE |
edb1tst | 172.31.20.93 | Primary |
edb2tst | 172.31.20.82 | Standby |
C.Install PostgreSQL in both primary and standby nodes using below :-
https://clouddba.co/postgresql-dba-part1-installation-of-postgresql-9-6-in-linux/
D.Install EFM 2.1 RPM which is package provided by EDB for HA setup
[root@edb1tst home]# rpm -ivh ./edb_tools_redhat_rhel-6-x86_64/efm21-2.1.1-1.rhel6.x86_64.rpm warning: ./edb_tools_redhat_rhel-6-x86_64/efm21-2.1.1-1.rhel6.x86_64.rpm: Header V4 RSA/SHA1 Signature, key ID 7e30651c: NOKEY Preparing... ########################################### [100%] 1:efm21 ########################################### [100%]
E.Modify the /opt/edb/as9.6/data/pg_hba.conf to include all the entries of Primary and standby nodes:
For Primary
# "local" is for Unix domain socket connections only host all all 0.0.0.0/0 trust local all all trust # IPv4 local connections: host edb all 172.31.20.93/32 trust # IPv6 local connections: ##host all all ::1/128 trust # Allow replication connections from localhost, by a user with the # replication privilege. local replication enterprisedb trust host prod all 172.31.20.93/32 trust host prod all 172.31.20.82/32 trust host replication enterprisedb 172.31.20.93/32 trust host replication enterprisedb 172.31.20.82/32 trust
For Standby
# "local" is for Unix domain socket connections only host all all 0.0.0.0/0 trust local all all trust # IPv4 local connections: host edb all 172.31.20.82/32 trust # IPv6 local connections: ##host all all ::1/128 trust # Allow replication connections from localhost, by a user with the # replication privilege. local replication enterprisedb trust host prod all 172.31.20.93/32 trust host prod all 172.31.20.82/32 trust host replication enterprisedb 172.31.20.93/32 trust host replication enterprisedb 172.31.20.82/32 trust
Please restart DB
pg_ctl -D /opt/edb/as9.6/data stop
pg_ctl -D /opt/edb/as9.6/data start
pg_ctl -D /opt/edb/as9.6/data status
F.Check and turn off the FIREWALL on the servers (both Primary and Standby)
G.Encrypt efm password on both nodes:For my case username is enterprisedb and password is enterprisedb
/usr/efm-2.1/bin/efm encrypt efm This utility will generate an encrypted password for you to place in your EFM cluster property file. Please enter the password and hit enter: Please enter the password again to confirm: The encrypted password is: 2ae8d8d6fbfb742863990dc444e9434f Please paste this into your cluster properties file. db.password.encrypted=2ae8d8d6fbfb742863990dc444e9434f
Troubleshoot if in case it fails:-
### Error encoutered
[root@edb2prod home]# efm encrypt efm This utility will generate an encrypted password for you to place in your EFM cluster property file. Please enter the password and hit enter: Please enter the password again to confirm: Could not encode password: Illegal key size or default parameters
[root@edb2test security]# ll *jar* -rw-r--r--. 1 root root 3035 Dec 20 2013 local_policy.jar -rw-r--r--. 1 root root 3405 Apr 11 15:47 local_policy.jar_bkp -rw-r--r--. 1 root root 3023 Dec 20 2013 US_export_policy.jar -rw-r--r--. 1 root root 2920 Apr 11 15:47 US_export_policy.jar_bkp [root@edb2test security]# efm encrypt efm This utility will generate an encrypted password for you to place in your EFM cluster property file. Please enter the password and hit enter: Please enter the password again to confirm: The encrypted password is: 2ae8d8d6fbfb742863990dc444e9434f Please paste this into your cluster properties file. db.password.encrypted=2ae8d8d6fbfb742863990dc444e9434f
Steps in PRIMARY server:
1. Add the below contents to pgpass
su – enterprisedb -c “echo \”*:*:*:enterprisedb:enterprisedb\” > ~/.pgpass”
su – enterprisedb -c “chmod 600 ~/.pgpass”
2.Set permissions and ownerships
-bash-4.1$ chown enterprisedb:enterprisedb data -bash-4.1$ chmod -R 700 data
3.Check the status of pg_ctl process as below:
-bash-4.1$ /opt/edb/as9.6/bin/pg_ctl -D /opt/edb/as9.6/data status pg_ctl: server is running (PID: 683) /opt/edb/as9.6/bin/edb-postgres "-D" "/opt/edb/as9.6/data"
NOTE: Here in this installation we already had a single node instance running on the system, and application connectivity was established. Since we’re only changing the single installation to HA hence we just checked the above status.
If this is a fresh installation then execute the below command:
[root@rac4 as9.6]# su - enterprisedb -c "/opt/edb/as9.6/bin/initdb -D /opt/edb/as9.6/data" And then check the status: /opt/edb/as9.6/bin/pg_ctl -D /opt/edb/as9.6/data start
4. Modify the efm.nodes to include the standby server entries:
5.Modify the bash profile as below and run encrypt the EFM password as below:
6.Export the below variables on the terminal as root user and set the password for enterprisedb user
export YUM_USER=XXXXX export YUM_PASSWORD=test#123 export MASTER_IP=172.31.20.93 export STDBY_IP=172.31.20.82 export YOUR_EMAIL=XX.XX@gmail.com export PWD_ENCRYPT=2ae8d8d6fbfb742863990dc444e9434f export PGDATA=/opt/edb/as9.6/data
su - enterprisedb -c "/opt/edb/as9.6/bin/psql -p 5444 -d edb -c \"alter user enterprisedb identified by enterprisedb\""
7.Add the below entries at the end of conf file
vi /opt/edb/as9.6/data/postgresql.conf
wal_level = hot_standby max_wal_senders = 3 hot_standby = on archive_mode = on archive_command = 'cd .' logging_collector = on (Check is it already on or not)
8.Restart the database again
pg_ctl -D /opt/edb/as9.6/data stop
pg_ctl -D /opt/edb/as9.6/data start
pg_ctl -D /opt/edb/as9.6/data status
9.Add the following entries in efm.properties file:
sed -i "s/db.user=/db.user=enterprisedb/" /etc/efm-2.1/efm.properties sed -i "s/db.password.encrypted=/db.password.encrypted=$PWD_ENCRYPT/" /etc/efm-2.1/efm.properties sed -i "s/db.port=/db.port=5444/" /etc/efm-2.1/efm.properties sed -i "s/db.database=/db.database=edb/" /etc/efm-2.1/efm.properties sed -i "s/admin.port=/admin.port=7809/" /etc/efm-2.1/efm.properties sed -i "s/user.email=/user.email=$YOUR_EMAIL/" /etc/efm-2.1/efm.properties sed -i "s/bind.address=/bind.address=$MASTER_IP:7800/" /etc/efm-2.1/efm.properties sed -i "s/is.witness=/is.witness=false/" /etc/efm-2.1/efm.properties sed -i "s/db.service.owner=/db.service.owner=enterprisedb/" /etc/efm-2.1/efm.properties sed -i "s;db.recovery.conf.dir=;db.recovery.conf.dir=/opt/edb/as9.6/data;g" /etc/efm-2.1/efm.properties sed -i "s/db.bin=/db.bin=\/opt\/edb\/as9.6\/bin/" /etc/efm-2.1/efm.properties sed -i "s/auto.allow.hosts=false/auto.allow.hosts=true/" /etc/efm-2.1/efm.properties
10.Execute the below command to check whether we’re getting the edb prompt:
NOTE: We received the above highlighted error during starting the services. For this edit the pingServerIp property in efm.properties file to the local hostname: pingServerIp=172.19.18.95 (this is because the server is not having internet connectivity presently)
11.Restart the pg_ctl services
pg_ctl -D /opt/edb/as9.6/data stop
pg_ctl -D /opt/edb/as9.6/data start
pg_ctl -D /opt/edb/as9.6/data status
NOTE: We have got errors in the mentioned file:
Resolution:
vi “/opt/edb/as9.6/data/postgresql.conf”
logging_collector = on #(Check is it already on or not) (**comment it as highlighted)
Steps in STANDBY server:
1.Export the variables below in terminal as root user:
export YUM_USER=XXXXX export YUM_PASSWORD=test#123 export MASTER_IP=172.31.20.93 export STDBY_IP=172.31.20.82 export YOUR_EMAIL=XX.XX@gmail.com export PWD_ENCRYPT=2ae8d8d6fbfb742863990dc444e9434f export PGDATA=/opt/edb/as9.6/data
2.Modify the ownership and permission same as primary
su - enterprisedb -c "echo \"*:*:*:enterprisedb:enterprisedb\" > ~/.pgpass" su - enterprisedb -c "chmod 600 ~/.pgpass"
3.Take backup of data directory if present earlier:
Then execute the below commands:
[root@edb2tst as9.6]# mv data_bkp data [root@edb2tst as9.6]# mkdir data [root@edb2tst as9.6]# chown enterprisedb:enterprisedb data [root@edb2tst as9.6]# chmod 0700 data [root@edb2tst as9.6]# su - enterprisedb -c "/opt/edb/as9.6/bin/pg_basebackup -h 172.31.20.93 -p 5444 -R -D /opt/edb/as9.6/data" NOTICE: pg_stop_backup complete, all required WAL segments have been archived
4.Start Server
[root@edb2tst data]# su – enterprisedb -c ” /opt/edb/as9.6/bin/pg_ctl -w -D /opt/edb/as9.6/data start”
5.Add the following entries in conf file:
[root@edb2tst data]# echo trigger_file=\'/opt/edb/as9.6/trigger.file\' >> /opt/edb/as9.6/recovery.conf [root@edb2tst data]# echo recovery_target_timeline='latest' >>/opt/edb/as9.6/recovery.conf
6.copy efm.proporties file from node 1(primary) to standby
7.Edit the bind.address to the below entry on standby
bind.address=172.31.20.82:7800 Add the below entry to efm.nodes file 172.31.20.93:7800
I.Final steps to start EFM and check .These steps needs to be done both nodes
1.Edit both the efm.nodes and put the IP of all nodes in the cluster:
In primary
vi /etc/efm-2.1/efm.nodes
172.31.20.93:7800
172.31.20.82:7800
In Standby
vi /etc/efm-2.1/efm.nodes
172.31.20.93:7800
172.31.20.82:7800
2.Stop/Start both the DB and efm service
su - enterprisedb -c "/opt/edb/as9.6/bin/pg_ctl -D /opt/edb/as9.6/data stop" su - enterprisedb -c "/opt/edb/as9.6/bin/pg_ctl -D /opt/edb/as9.6/data stop"
3.Stop and start EFM services
service efm-2.1 stop service efm-2.1 start
[root@ip-172-31-20-93 ~]# /usr/efm-2.1/bin/efm cluster-status efm Cluster Status: efm VIP: Agent Type Address Agent DB Info -------------------------------------------------------------- Standby 172.31.20.82 UP UP Master 172.31.20.93 UP UP Allowed node host list: 172.31.20.93 172.31.20.82 Membership coordinator: 172.31.20.82 Standby priority host list: 172.31.20.82 Promote Status: DB Type Address XLog Loc Info -------------------------------------------------------------- Master 172.31.20.93 0/700DDE0 Standby 172.31.20.82 0/700DDE0 Standby database(s) in sync with master. It is safe to promote.
5.How to establish SYNC
For establishing SYNCHRONOUS (ie SYNC) connectivity :
Modify the below parameter in the below file:
In primary:
-bash-4.1$ vi postgresql.conf -bash-4.1$ pwd
/opt/edb/as9.6/data
In standby:
-bash-4.1$ vi recovery.conf standby_mode = 'on' primary_conninfo = 'user=enterprisedb password=enterprisedb host=172.19.18.95 port=5444 sslmode=prefer sslcompression=1 krbsrvname=postgres application_name = eocapp' trigger_file='/opt/edb/as9.6/data/trigger.file' recovery_target_timeline=latest