Media recovery Terminated with ORA-00600
Recently worked on huge transactional production database of 11.2.0.2, It has Standby database with Real-time Apply, In part of journal archiving the data will be inserted into other tables and same data will be purged from the Huge Ledger tables.
Log History on 29th September between 1 AM to 3AM.
Date Day Total h00 h01 h02 h03
--------- ----- ----- --- --- --- ---
27-SEP-12 Thu 1207 0 5 0 12
28-SEP-12 Fri 728 1 84 21 97
29-SEP-12 Sat 668 90 372 99 1
30-SEP-12 Sun 83 1 2 1 1
Total number of rows archived are “19351374” and according to above log history on average total archives are 472 and each redo log file size is 100mb which is 472 * 100mb = 46 GB of redo in just 2 hours. This production database has standby database with Real-Time apply, During high redo unable to archive quickly on standby database of course all of the archives are transferred. But MRP went smooth up to archive log sequence “147613”, When MRP is trying to apply the next sequence “147614”, Process been terminated and returned internal errors as below.
Sat Sep 29 06:49:21 2012 Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE THROUGH ALL SWITCHOVER DISCONNECT USING CURRENT LOGFILE Media Recovery Log /u04/arch/ckpt/1_147614_751616314.dbf Errors in file /u00/app/oracle/diag/rdbms/ckptsby1/ckpt/trace/ckpt_pr00_21586.trc (incident=582898): ORA-00600: internal error code, arguments: [2619], [147614], [], [], [], [], [], [], [], [], [], [] Incident details in: /u00/app/oracle/diag/rdbms/ckptsby1/ckpt/incident/incdir_582898/ckpt_pr00_21586_i582898.trc
Whenever Internal errors occurred we usually check look up tool , if there is any listed bugs found we follow work around and submit SR to Oracle Support, But not easy to get response on weekends and need to fix this issue ASAP. Lets do some troubleshooting here. Initially bounced standby database still errors remain same and MRP been terminated as usual.
Total How many Archives been generated on Primary?
SQL> select max(sequence#) from v$archived_log; MAX(SEQUENCE#) -------------- 147729 SQL>
Now what is the latest sequence been applied on Standby?
SQL> select max(sequence#) from v$archived_log where applied='YES'; MAX(SEQUENCE#) -------------- 147613 SQL>
so around 116 of archives are not applied and still many of archives can be generated in primary database, Quickly most of them think to either to perform 1) Incremental Roll forward (or) 2) Rebuild standby database because of internal errors when there is no outstanding time to work. But in some cases still we can deal Internal errors(ORA-00600) depending on our analysis.
Now revise our findings, The maximum sequence applied on standby is “147613”, Wait am giving you small TIP 🙂 check what is the second argument in internal error? It’s “ORA-00600: internal error code, arguments: [2619], [147614]” So whatever the first argument either may it relates to the FRA (or) archive location size but second argument referring to the Next sequence which is not applied on standby.. BINGO!!! So the issue is with the archive log sequence “147614“.
So if the archive log is corrupted then how to resolve this GAP? Straight away answer is Incremental Roll forward, You can also have a try with Primary archive log(May be Primary archive also corrupted). Because there may be corruption in Archive log of standby may be during the transport over network or any physical corruption of Standby redo log files.
So first copy the primary archive log sequence to standby database and see if there is any differences in Sizes of both archives logs of primary and standby Archive logs.
[email protected]> ls -ltr 1_147614_751616314.dbf_primary 1_147614_751616314.dbf_standby -rw-r----- 1 oracle oinstall 69396480 Sep 29 10:53 1_147614_751616314.dbf_standby -rw-r----- 1 oracle oinstall 69396480 Sep 29 10:56 1_147614_751616314.dbf_primary [email protected]>
Here both archive logs permissions are same, File sizes are also same. Hmm. Will see any difference in both of the files by Linux commands
[email protected]> diff 1_147614_751616314.dbf_primary 1_147614_751616314.dbf_standby Binary files 1_147614_751616314.dbf_primary and 1_147614_751616314.dbf_standby differ [email protected]> echo $? 2 [email protected]>
So according to the above command, both the archives from primary and standby are differ. Now will perform more investigation using Log Miner.
SQL> EXECUTE DBMS_LOGMNR.ADD_LOGFILE('/u00/app/oracle/standby_arch/1_147614_751616314.dbf_standby' ,options => DBMS_LOGMNR.NEW); PL/SQL procedure successfully completed. SQL> EXECUTE DBMS_LOGMNR.START_LOGMNR(); PL/SQL procedure successfully completed. SQL> SELECT rbasqn, rbablk, rbabyte, operation, status, info FROM V$LOGMNR_CONTENTS; SELECT rbasqn, rbablk, rbabyte, operation, status, info FROM V$LOGMNR_CONTENTS * ERROR at line 1: ORA-00600: internal error code, arguments: [2619], [147614], [], [], [], [], [], [], [], [], [], [] SQL>
If analyzed primary archive redo log file, Log miner able to read without any errors.
SQL> SELECT rbasqn, rbablk, rbabyte, operation, status, info FROM V$LOGMNR_CONTENTS; RBASQN RBABLK RBABYTE OPERATION STATUS INFO ---------- ---------- ---------- -------------------------------- ---------- -------------------------------- 147614 3262 368 COMMIT 0 147614 3264 472 DELETE 2 Dictionary Mismatch 147614 3266 184 DELETE 2 Dictionary Mismatch 147614 3267 392 DELETE 2 Dictionary Mismatch 147614 3269 104 DELETE 2 Dictionary Mismatch 147614 3270 312 DELETE 2 Dictionary Mismatch 147614 3272 24 DELETE 2 Dictionary Mismatch 147614 3273 232 DELETE 2 Dictionary Mismatch 147614 3275 16 DELETE 2 Dictionary Mismatch 147614 3276 224 DELETE 2 Dictionary Mismatch 147614 3277 432 DELETE 2 Dictionary Mismatch
So there is no corruption also, But internal errors which is almost our out of Hands? Still you can perform manual recovery of standby with the only log which is corrupted and then lets start MRP and see how it goes.
SQL> recover standby database until cancel; ORA-00279: change 2209191282893 generated at 09/29/2012 01:58:45 needed for thread 1 ORA-00289: suggestion :/u04/arch/ckpt/1_147614_751616314.dbf ORA-00280: change 2209191282893 for thread 1 is in sequence #147614 Specify log: {<RET>=suggested | filename | AUTO | CANCEL} /u00/app/oracle/standby_arch/1_147614_751616314.db f ORA-00279: change 2209191284000 generated at 09/29/2012 01:59:00 needed for thread 1 ORA-00289: suggestion : /u04/arch/ckpt/1_147615_751616314.dbf ORA-00280: change 2209191284000 for thread 1 is in sequence #147615 ORA-00278: log file '/u00/app/oracle/standby_arch/1_147614_751616314.dbf' no longer needed for this recovery Specify log: {<RET>=suggested | filename | AUTO | CANCEL} /u00/app/oracle/standby_arch/1_147615_751616314.dbf
Now manually archive log sequence of primary “147614” Successfully applied and requesting for the next sequence. Alert log information as below
Sat Sep 29 07:41:56 2012
ALTER DATABASE RECOVER standby database until cancel
Media Recovery Start
started logmerger process
Sat Sep 29 07:41:56 2012
Managed Standby Recovery not using Real Time Apply
Parallel Media Recovery started with 32 slaves
ORA-279 signalled during: ALTER DATABASE RECOVER standby database until cancel ...
Sat Sep 29 07:42:24 2012
ALTER DATABASE RECOVER LOGFILE '/u00/app/oracle/standby_arch/1_147614_751616314.dbf'
Media Recovery Log /u00/app/oracle/standby_arch/1_147614_751616314.dbf
ORA-279 signalled during: ALTER DATABASE RECOVER LOGFILE '/u00/app/oracle/standby_arch/1_147614_751616314.dbf' ...
Now cancelled the recovery and started Broker, So that MRP process will be initiated.
SQL> alter system set dg_broker_start=true; System altered. SQL>
When started Broker(DMON) sucessfully MRP also started and applying archives from “147615” without any errors, All the archives applied and now standby is completely in SYNC with primary database.
Primary:- SQL> select max(sequence#) from v$archived_log; MAX(SEQUENCE#) -------------- 147849 SQL> Standby:- SQL> select max(sequence#) from v$archived_log where applied='YES'; MAX(SEQUENCE#) -------------- 147848 SQL>
Now the issue is resolved, But what is the issue with archive log sequence “147614” ? Still its unanswered? Aaah.. I just tried to delete the archive log using RMAN after performing manual copy and below is the issue.
Sat Sep 29 08:01:01 2012
archivelog header validation failure for file /u04/arch/ckpt/1_147614_751616314.dbf
***
Corrupt block seq: 147614 blocknum=1.
Bad header found during deleting archived log
Data in bad block - flag:1. format:34. bno:1. seq:147614
beg:0 cks:47474
calculated check value: 47474
archivelog header validation failure for file /u04/arch/ckpt/1_147614_751616314.dbf
Reread of seq=147614, blocknum=1, file=/u04/arch/ckpt/1_147614_751616314.dbf, found same corrupt data
archivelog header validation failure for file /u04/arch/ckpt/1_147614_751616314.dbf
Reread of seq=147614, blocknum=1, file=/u04/arch/ckpt/1_147614_751616314.dbf, found same corrupt data
archivelog header validation failure for file /u04/arch/ckpt/1_147614_751616314.dbf
Reread of seq=147614, blocknum=1, file=/u04/arch/ckpt/1_147614_751616314.dbf, found same corrupt data
archivelog header validation failure for file /u04/arch/ckpt/1_147614_751616314.dbf
Reread of seq=147614, blocknum=1, file=/u04/arch/ckpt/1_147614_751616314.dbf, found same corrupt data
archivelog header validation failure for file /u04/arch/ckpt/1_147614_751616314.dbf
Reread of seq=147614, blocknum=1, file=/u04/arch/ckpt/1_147614_751616314.dbf, found same corrupt data
archivelog header validation failure for file /u04/arch/ckpt/1_147614_751616314.dbf
***
But why Log Miner unable to show the Block corruption issues when this archive log is analyzed? But I think its issue more with the BAD HEADER… So in short there may be less control on Internal errors still sometimes we can escape from those errors some times.
— Happy Reading… Thank you —