RMAN Incremental Backups to Roll Forward a Physical Standby Database
What Is an Archive Gap:-
An archive gap is a range of archived redo logs created whenever the standby system is unable to receive the next archived redo log generated by the primary database. For example, an archive gap occurs when the network becomes unavailable and automatic archiving from the primary database to the standby database stops. When the network is available again, automatic transmission of the redo data from the primary database to the failed standby database resumes.
Causes:-
The steps in this section can used to resolve problems if a physical standby database has
1) Loss of Archive log
2) Corrupted Archive Redo Data
3) Unresolvable Archive GAP’s
4) Out of Retention policy
WorkArounds:-
1) If archive exist on primary Troubleshoot the issue, It will automatically shipped if the archive is not corrupted.
2) If archive is corrupted at OS level, if we do have backup, Then Restore archivelogs from Backup.
3) Incremental SCN for standby/ Roll Forward.
Environment:-
DB Version:- 11.1.0.7.0
OS version: x86_64 GNU/Linux
In Archive GAP issues, familiarly we can see such general errors below in alert log file from standby database.
Analysis
Check the archives generated on Primary for each thread.
Check the archives applied on Standby for each thread.
Here we can see lot of difference in primary & standby around 1300 of archives for each thread, and caused because one of the archive is missing on primary database, i.e. Thread#1 and sequence#63590
We can query from v$archive_gap.
Of course, we have missed only one file, But standby is far behind than primary, Instead of copying those archives from primary & standby, Here Rollforward is very quick.
Here primary & standby redo difference is around
Thread #1 (50mb logfile * 1300) = 65000 mb
Thread # 2 (50mb logfile *1300) = 65000 mb
Its around 130gb of redo data need to apply on standby.
If the difference between primary and standby is more & more, recommended to take Incremental SCN/roll forward to reduce recovery window.
Stop the MRP process.
SQL> alter database recover managed standby database cancel;
Capture the CURRENT_SCN from Standby & Primary.
Compare the SCN in primary & standby databases.
SCN Difference = 8405028062 – 8276807480 = 128220582
Now take the backup of increments & controlfile.
1) From SCN 8276807480 until database.
– RMAN> backup incremental from scn 8276807480 database;
2) Backup of standby controlfile.
– RMAN> backup current controlfile for standby;
Backup incremental.
Backup current control file for standby.
Copy the backup’s to standby server.
Tasks before restore.
1) Bounce instance in “NOMOUNT” status.
2) Remove old control files as per the value of “control_files” in PFILE/SPFILE.
3) We can use either RMAN/SQL to create controlfile, I have been used RMAN , and restored using RMAN.
4) Mount the standby database.
Bounce in “NOMOUNT”
Restore standby control file from BACKUP.
Mount the database and check the status, database role.
Register backups with new control file.
Perform media recovery from RMAN as below.
Check the alert log file as each of the datafile is being recovered.
After successful recovery, Start MRP once again
Now Archives are transporting to standby and also performing media recovery with new archives.
Check the Archive sync in primary & standby databases.
Primary:-
Standby:-
Gaps:-
Useful Views for DataGuard:-
V$managed_standby v$archive_dest V$dataguard_status v$archive_dest_statusReference Links:-
Steps to perform for Rolling forward a standby database using RMAN incremental backup when primary and standby are in ASM filesystem [ID 836986.1]
10.2: http://download.oracle.com/docs/cd/B19306_01/server.102/b14239/scenarios.htm#CIHIAADC
11.1: http://download.oracle.com/docs/cd/B28359_01/server.111/b28294/rman.htm#SBYDB00759
11.2: http://download.oracle.com/docs/cd/E11882_01/server.112/e17022/rman.htm#CIHIAADC
Hi Nassyam,
Thanks for the detailed explaination. Can i know why we recover the database with NOREDO option
Thanks
Karthik
Hi Karthik,
Its good question, it bypass(not to use) redo logs during recovery. In short the NOREDO option is required if redo logs are not available.
Hope this helps.