Revision history
Rev. | Resp | Date | Note. |
PA1 | 07.10.2015 | Initial Version |
1 Environment
1.1 Hardware
In below table you can find information about hardware on which Oracle components will be installed:
-
Hardware details
Parameter | Value | Value | Value | Value |
Hostname | Kolkata1 | Kolkata2 | Delhi1 | Delhi2 |
Server type | Physical | Physical | Physical | Physical |
Role | primary | primary | standby | standby |
Physical ip | 10.51.11.7 | 10.51.11.8 | 10.51.11.36 | 10.51.11.37 |
Dataguard physical ip | 10.51.20.69 | 10.51.20.70 | 10.75.20.101 | 10.75.20.102 |
Instance name | ORCLPD11 | ORCLPD12 | ORCLDR11 | ORCLDR12 |
1.2 Software
In below table you can find information about software installed:
-
Hardware details
Parametr | Value | Value | Value | |
Hostname | Kolkata1 | Kolkata2 | Delhi1 | Delhi2 |
Operating system | RHEL 7.2 | RHEL 7.2 | RHEL 7.2 | RHEL 7.2 |
Oracle | 12.1.0.2.0 | 12.1.0.2.0 | 12.1.0.2.0 | 12.1.0.2.0 |
1.3 Networking
(This step is only required if you have separate replication network for DR traffic.It is not necessary if there is no separate DR replication network)
1.3.1 Primary Site
Network in Primaryfor DR(primary database) |
su – root
. oraenv +ASM1 oifcfg iflist -p -n 10.51.20.64 is the subnet IP which we got from above command. 10.51.20.71 and 10.51.20.72 are VIP address. srvctl add network -k 2 -S 10.51.20.64/255.255.255.240/bond3.161 -w static -v srvctl add vip -n Kolkata1 -A 10.51.20.71/255.255.255.240/bond3.161 -k 2 crsctl status res -t srvctl start vip -n Kolkata1
|
1.3.2 Disaster recovery Site
Network in Secondary for DR(standby database) |
su – root
. oraenv +ASM1 oifcfg iflist -p -n 10.51.20.96 is the subnet IP which we got from above command. 10.51.20.103 and 10.51.20.104 are VIP address. srvctl add network -k 2 -S 10.75.20.96/255.255.255.240/bond3.161 -w static -v srvctl add vip -n Delhi1 -A 10.75.20.103/255.255.255.240/bond3.161 -k 2 crsctl status res -t srvctl start vip -n Delhi1
|
Please use netca to create listener LISTENER_DG dedicated to handle DR traffic
1.4.1 Listener Configuration Kolkata1
Oracle Network Configuration Kolkata1 |
SID_LIST_LISTENER_DG =
(SID_LIST =
(SID_DESC =
(GLOBAL_DBNAME = ORCLPD1.tdeprdcl.internal)
(ORACLE_HOME = /u01/app/oracle/product/12.1.0/db_1)
(SID_NAME = ORCLPD11)
)
)
[grid@Kolkata1 admin]$ lsnrctl start LISTENER_DG LSNRCTL for Linux: Version 12.1.0.2.0 – Production on 28-MAR-2017 09:13:21
Copyright (c) 1991, 2014, Oracle. All rights reserved.
Starting /u01/app/product/12.1.0/grid/bin/tnslsnr: please wait…
TNSLSNR for Linux: Version 12.1.0.2.0 – Production
System parameter file is /u01/app/product/12.1.0/grid/network/admin/listener.ora
Log messages written to /u01/app/grid/diag/tnslsnr/Kolkata1/listener_dg/alert/log.xml
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER_DG)))
connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER_DG)))
STATUS of the LISTENER
————————
Alias LISTENER_DG
Version TNSLSNR for Linux: Version 12.1.0.2.0 – Production
Start Date 28-MAR-2017 09:13:21
Uptime 0 days 0 hr. 0 min. 0 sec
Trace Level off
Security ON: Local OS Authentication
SNMP OFF
Listener Parameter File /u01/app/product/12.1.0/grid/network/admin/listener.ora
Listener Log File /u01/app/grid/diag/tnslsnr/Kolkata1/listener_dg/alert/log.xml
Listening Endpoints Summary…
(DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER_DG)))
Services Summary…
Service “ORCLPD1.tdeprdcl.internal” has 1 instance(s).
Instance “ORCLPD11”, status UNKNOWN, has 1 handler(s) for this service…
The command completed successfully
|
1.4.2 Listener Configuration Kolkata2
Oracle Network Configuration Kolkata2 |
SID_LIST_LISTENER_DG =
(SID_LIST =
(SID_DESC =
(GLOBAL_DBNAME = ORCLPD1.tdeprdcl.internal)
(ORACLE_HOME = /u01/app/oracle/product/12.1.0/db_1)
(SID_NAME = ORCLPD12)
)
)
[grid@Kolkata2 ~]$ lsnrctl start LISTENER_DGLSNRCTL for Linux: Version 12.1.0.2.0 – Production on 28-MAR-2017 09:15:24
Copyright (c) 1991, 2014, Oracle. All rights reserved.
Starting /u01/app/product/12.1.0/grid/bin/tnslsnr: please wait…
TNSLSNR for Linux: Version 12.1.0.2.0 – Production
System parameter file is /u01/app/product/12.1.0/grid/network/admin/listener.ora
Log messages written to /u01/app/grid/diag/tnslsnr/Kolkata2/listener_dg/alert/log.xml
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER_DG)))
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER_DG)))
STATUS of the LISTENER
————————
Alias LISTENER_DG
Version TNSLSNR for Linux: Version 12.1.0.2.0 – Production
Start Date 28-MAR-2017 09:15:24
Uptime 0 days 0 hr. 0 min. 0 sec
Trace Level off
Security ON: Local OS Authentication
SNMP OFF
Listener Parameter File /u01/app/product/12.1.0/grid/network/admin/listener.ora
Listener Log File /u01/app/grid/diag/tnslsnr/Kolkata2/listener_dg/alert/log.xml
Listening Endpoints Summary…
(DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER_DG)))
Services Summary…
Service “ORCLPD1.tdeprdcl.internal” has 1 instance(s).
Instance “ORCLPD12”, status UNKNOWN, has 1 handler(s) for this service…
The command completed successfully
|
1.4.3 Listener Configuration Delhi1
Oracle Network Configuration Delhi1 |
SID_LIST_LISTENER_DG =
(SID_LIST =
(SID_DESC =
(GLOBAL_DBNAME = ORCLDR1.tdeprdcl.internal)
(ORACLE_HOME = /u01/app/oracle/product/12.1.0/db_1)
(SID_NAME = ORCLDR11)
)
)
[grid@Delhi1 admin]$ lsnrctl start LISTENER_DGLSNRCTL for Linux: Version 12.1.0.2.0 – Production on 28-MAR-2017 09:20:34
Copyright (c) 1991, 2014, Oracle. All rights reserved.
Starting /u01/app/product/12.1.0/grid/bin/tnslsnr: please wait…
TNSLSNR for Linux: Version 12.1.0.2.0 – Production
System parameter file is /u01/app/product/12.1.0/grid/network/admin/listener.ora
Log messages written to /u01/app/grid/diag/tnslsnr/Delhi1/listener_dg/alert/log.xml
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER_DG)))
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER_DG)))
STATUS of the LISTENER
————————
Alias LISTENER_DG
Version TNSLSNR for Linux: Version 12.1.0.2.0 – Production
Start Date 28-MAR-2017 09:20:34
Uptime 0 days 0 hr. 0 min. 0 sec
Trace Level off
Security ON: Local OS Authentication
SNMP OFF
Listener Parameter File /u01/app/product/12.1.0/grid/network/admin/listener.ora
Listener Log File /u01/app/grid/diag/tnslsnr/Delhi1/listener_dg/alert/log.xml
Listening Endpoints Summary…
(DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER_DG)))
Services Summary…
Service “ORCLDR1.tdeprdcl.internal” has 1 instance(s).
Instance “ORCLDR11”, status UNKNOWN, has 1 handler(s) for this service…
The command completed successfully
|
1.4.4 Listener Configuration Delhi2
Oracle Network Configuration Delhi1 |
SID_LIST_LISTENER_DG =
(SID_LIST =
(SID_DESC =
(GLOBAL_DBNAME = ORCLDR1.tdeprdcl.internal)
(ORACLE_HOME = /u01/app/oracle/product/12.1.0/db_1)
(SID_NAME = ORCLDR12)
)
)
[grid@Delhi2 ~]$ lsnrctl start LISTENER_DGLSNRCTL for Linux: Version 12.1.0.2.0 – Production on 28-MAR-2017 09:23:40
Copyright (c) 1991, 2014, Oracle. All rights reserved.
Starting /u01/app/product/12.1.0/grid/bin/tnslsnr: please wait…
TNSLSNR for Linux: Version 12.1.0.2.0 – Production
System parameter file is /u01/app/product/12.1.0/grid/network/admin/listener.ora
Log messages written to /u01/app/grid/diag/tnslsnr/Delhi2/listener_dg/alert/log.xml
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER_DG)))
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER_DG)))
STATUS of the LISTENER
————————
Alias LISTENER_DG
Version TNSLSNR for Linux: Version 12.1.0.2.0 – Production
Start Date 28-MAR-2017 09:23:40
Uptime 0 days 0 hr. 0 min. 0 sec
Trace Level off
Security ON: Local OS Authentication
SNMP OFF
Listener Parameter File /u01/app/product/12.1.0/grid/network/admin/listener.ora
Listener Log File /u01/app/grid/diag/tnslsnr/Delhi2/listener_dg/alert/log.xml
Listening Endpoints Summary…
(DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER_DG)))
Services Summary…
Service “ORCLDR1.tdeprdcl.internal” has 1 instance(s).
Instance “ORCLDR12”, status UNKNOWN, has 1 handler(s) for this service…
The command completed successfully
|
1.5 TNS Configuration
1.5.1 TNS Configuration for kolkata1 and kolkata2
Configure /u01/app/oracle/product/12.1.0/db_1/network/admin/tnsnames.ora as below
Oracle TNS Configuration For Kolkata1 and Kolkata2 |
ORCLDR1 =
(description=
(load_balance=on)
(address=(protocol=tcp)(host=10.75.20.103)(port=1592))
(address=(protocol=tcp)(host=10.75.20.104)(port=1592))
(connect_data=
(service_name=ORCLDR1.tdeprdcl.internal)))
ORCLPD1 =
(description=
(load_balance=on)
(address=(protocol=tcp)(host=10.51.20.71)(port=1592))
(address=(protocol=tcp)(host=10.51.20.72)(port=1592))
(connect_data=
(service_name=ORCLPD1.tdeprdcl.internal)))
|
1.5.2 TNS Configuration for delhi1 and delhi2
Oracle TNS Configuration For For delhi1 and delhi2 |
ORCLPD1 =
(description=
(load_balance=on)
(address=(protocol=tcp)(host=10.51.20.71)(port=1592))
(address=(protocol=tcp)(host=10.51.20.72)(port=1592))
(connect_data=
(service_name=ORCLPD1.tdeprdcl.internal)))
ORCLDR1 =
(description=
(load_balance=on)
(address=(protocol=tcp)(host=10.75.20.103)(port=1592))
(address=(protocol=tcp)(host=10.75.20.104)(port=1592))
(connect_data=
(service_name=ORCLDR1.tdeprdcl.internal)))
|
2 Dataguard Setup Detail
2.1 Architecture
On the orclprod primary database, Data Guard uses the following processes:
-
Log writer (LGWR) – it collects transaction redo information and updates the online redo In synchronous mode, it ships redo information directly to the remote file server (RFS) process on the physical standby database and waits for a confirmation before proceeding. In asynchronous mode, it ships the redo information directly but doesn’t wait before proceeding. In asynchronous mode, LGWR submits the network I/O request to the Log-write Network Server (LNSn) process for that destination
-
Archiver (ARCHn) – ARCH1 creates copy of the online redo logs locally for use in a primary database The ARCH2 process also ships the redo stream to the RFS process while simultaneously archiving the online log. ARCH2 is also responsible for proactively detecting and resolving gaps on the physical standby database.
-
Fetch archive log (FAL) – exists only on the physical standby database only: FAL provides a client/server mechanism for resolving gaps detected in the range of archived redo logs that are generated at the primary database and received at the standby This process is started only when needed and shuts down as soon as it is finished. It is very likely you will not see this process running.
Note: You can configure a primary database to ship redo information to a single standby database by using either LGWR or ARCn, but not both.
On the standby database bscsproddr, Data Guard uses the following processes:
-
Remote file server (RFS) – RFS receives redo information from the primary RFS can write the redo into standby redo logs or directly to archived redo logs. Each LNSn and ARCHn process from the primary database has its own RFS process.
-
Archiver (ARCHn) – ARCH1 process archives the standby redo
-
Managed recovery process (MRP) – exists only for physical standby database only, MRP applies archived redo log information to the physical standby If you start the managed recovery with the ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT sql statement, this foreground session performs the recovery. If you use the optional DISCONNECT FROM SESSION clause, the MRP background process starts. If you use Data Guard broker to manage your standby databases, the broker always starts the MRP background process for a physical standby database.
2.2 Enable forced logging
In order to implement Standby Database I enable ‘Forced Logging’. This option ensures that even in the event that a ‘nologging’ operation is done, force logging takes precedence and all operations are logged into the redo logs.
SQL> alter database force logging;
Database altered.
2.3 Configure a standby redo log in Primary
A Standby Redo log is added to enable Data Guard Maximum Availability and Maximum Protection modes. It is important to configure the Standby Redo Logs (SRL) with the same size as the online redo logs.
Standby redolog should be 1 greater than no of primary redo log groups per thread.If You have 2 node RAC and thread has 4 redo log groups,then number of standby redo log will be 5 per thread.Standby redo log should never be multiplexed.
alter system set standby_file_management=manual scope=both sid=’*’;
alter database add standby logfile thread 1 group 9 ‘+FRA’ size 512m;
alter database add standby logfile thread 1 group 10 ‘+FRA’ size 512m;
alter database add standby logfile thread 1 group 11 ‘+FRA’ size 512m;
alter database add standby logfile thread 1 group 12 ‘+FRA’ size 512m;
alter database add standby logfile thread 1 group 13 ‘+FRA’ size 512m;
alter database add standby logfile thread 2 group 14 ‘+FRA’ size 512m;
alter database add standby logfile thread 2 group 15 ‘+FRA’ size 512m;
alter database add standby logfile thread 2 group 16 ‘+FRA’ size 512m;
alter database add standby logfile thread 2 group 17 ‘+FRA’ size 512m;
alter database add standby logfile thread 2 group 18 ‘+FRA’ size 512m;
alter system set standby_file_management=auto scope=both sid=’*’;
2.4 Enable archive mode in Primay
Check whether archive mode is enabled in database if it is not enabled:
SQL> archive log list
If archive mode was not enabled please execute below commands keeping one node of RAC down:
SQL> shutdown immediate
2.5 Create directories adump in Standby
2.6 Copy password file using ASM command from primary to standby.
In Primary
In Standby
2.7 Add database and instance in CRS using srvctl in standby
In Standby,to add password file
To modify password file
2.8 Copy pfile and start instance in each node of standby with pfile
copy pfile from /tmp to $ORACLE_HOME/dbs and rename the parameter file to init$ORACLE_SID.ora in each node and startup nomount mode for each node of DR.
Please change db_unique_name and SID name in pfile from DR side.
[oracle@Delhi1 ~]$ sqlplus / as sysdba
If you face permission denied issue in ASM,please include oracle under asmadmin
2.9 Please check If you connect with SYS user to RMAN in each node individually
2.10 Now We need to take RMAN backup either in ASM diskgroup or any other mount point.
{
sql “alter system switch logfile”;
allocate channel ch1 type disk format ‘+DATA/ORCLPD1/BKP/Primary_bkp_for_stndby_%U’;
allocate channel ch2 type disk format ‘+DATA/ORCLPD1/BKP/Primary_bkp_for_stndby_%U’;
BACKUP AS COMPRESSED BACKUPSET DATABASE;
backup current controlfile for standby;
sql “alter system archive log current”;
}
Non-ASM Backup
{
sql “alter system switch logfile”;
allocate channel ch1 type disk format ‘/U01/ORCLPD1/BKP/Primary_bkp_for_stndby_%U’;
allocate channel ch2 type disk format ‘/U01/ORCLPD1/BKP/Primary_bkp_for_stndby_%U’;
BACKUP AS COMPRESSED BACKUPSET DATABASE;
backup current controlfile for standby;
sql “alter system archive log current”;
}
2.11 Manual backup and duplicate for standby(This is not active duplicate)
If you use NFS as share mount point for backup,please use it.No need to copy ASM files as above.
Login to DR dataase
export ORACLE_SID=ORCLDR11
sqlplus / as sysdba
2.12 Automated backup and restoration to DR (This is active duplicate)
Log file:-
Note:- If you have ASM diskgroup name different from primary,then you need to add below parameters in above script
2.13 Re-create standby logfiles
2.14 Please create spfile add instance for RAC
Please check the restore log file to identify where control file has been created.
Now you need to change control_files parameter after startup nomount.
2.15 DGMGRL configuration
In Primary:-
ADD DATABASE ORCLDR1 AS CONNECT IDENTIFIER IS ORCLDR1;
2.16 Change spfile if required to reflect DR related change in DR site (Delhi) .(Not required if you used DGMGRL)
In Primary:-
In Standby:-
2.17 Start the managed recovery process if not already started.
If you want to setup active dataguard,Active dataguard require seperate license cost.
1.Mount the database.
2.Open the database.
3.Start log apply using following command:-
alter database recover managed standby database using current logfile disconnect;
2.18 Check the status of redo log apply in standby database. This should be executed in primary
FROM
(SELECT THREAD# ,SEQUENCE# FROM V$ARCHIVED_LOG WHERE (THREAD#,FIRST_TIME ) IN (SELECT THREAD#,MAX(FIRST_TIME) FROM V$ARCHIVED_LOG GROUP BY THREAD#)) ARCH,
(SELECT THREAD# ,SEQUENCE# FROM V$LOG_HISTORY WHERE (THREAD#,FIRST_TIME ) IN (SELECT THREAD#,MAX(FIRST_TIME) FROM V$LOG_HISTORY GROUP BY THREAD#)) APPL
WHERE
ARCH.THREAD# = APPL.THREAD#
ORDER BY 1;
2.19 Check the alert log files in both standby and primary.
2.20 The process to switchover and failover dataguard
During actual DR failover when prod primary will not be available permanently then DR standby will become primary and one standby database to be created on DR side node2.
- Switchover:
Before performing switchover, kindly verify the state of data guard on both the instances by following SQL queries:
Note: This command will give you appropriate message about the data guard current status.
After getting confirmation on data guard smooth activity, we can instantiate switchover activity by issuing following sort of commands:
On Primary database:
Step-1>>
Connect to Primary database and convert primary database to standby.
Step-2>>
Shutdown primary database:
Step-3>>
Startup nomount old primary database as new standby database:
Verify database role on old primary database:
On Standby database:
Step-4>>
On original standby database, Convert old standby database to primary database:
Step-5>>
Convert old standby database as primary and shutdown database:
Step-6>>
Startup old standby database as primary database:
Verify database role on old standby database:
Switch over activity have completed successfully, our old primary database has become standby and old standby database has become primary database.
Note
To test above switchover activity, generate multiple archive logs on primary database and verify those archive logs being transferred on standby database.
Switchback:
To switchback, you have to follow same above mentioned 6 steps.
-
FAILOVER:
In short, the failover is the deformation of the production (primary) database and activating standby database as the primary. It is not reversible. When enabled, re-create the standby database . What to do in case of failover:
(Important note: Kolkata is the primary server and Delhi is the standby server)
-
[Kolkata] If the primary database is accessible and running, then it must provide to send redo buffer to the standby database.
If you don’t receive an error, you can continue with step 5th. In this case, the system can be opened by zero data loss. If you receive an error, We continue with step 2 to open the system at least data loss.
-
[Delhi] We must run the following query to learn last applied archive log sequence number.
-
[Delhi] If you can access archive logs which are not copied to standby then copy archives to standby. After copy archive log files we must register them to standby database. This operation must be done for every thread.
-
[Delhi] Check the standby database for redo gap. If there is a gap then we must copy archive log files and register.
As a result of the above query until it returns to zero.
-
[Delhi] Stop the redo apply process in standby database.
-
[Delhi] Finish to apply archive logs copied from primary.
If you get an error, it means there are redo logs not applied. Consider 2th and 4th steps. You can also continue with following command;
In this situation you can open database in 8th step. If you get no error, continue with 7th step.
-
[Delhi] Switch standby database to primary database.
-
[Delhi] Open database.