Category: Oracle
2007-06-27 09:06:46
Redo Log File Management and Issues when using RMAN to do Automatic Recovery
PURPOSE
~~~~~~~~~~~~~~~~~~~
This discussion is the result of numerous customer requests made to Oracle Support Services regarding the management of the redo log files when using RMAN to automate recovery. Automated recovery works by auto-inspecting all destinations of the files that belong to the database, identifying the files that are missing, and choosing the recovery path accordingly.

This document shows when, and in which situations, RMAN is able to do this auto-inspection by itself, without manual intervention. It also discusses current limitations. It clarifies the manual intervention required of the DBA before recovery, i.e. how much work falls to the DBA and what the DBA needs to check before starting recovery. It also explains the errors resulting from this issue, so that customers can handle them without Oracle Support assistance. It should give DBAs a better understanding of the way RMAN works and help them automate recovery with RMAN.

It is the task of the DBA to pre-process RMAN recovery by writing customized OS shell scripts that auto-inspect the destinations of the database files after a media failure and dynamically create RMAN recovery scripts.

TEST ENVIRONMENT
~~~~~~~~~~~~~~~~
All tests were performed with 8.1.6 on Windows NT. The use of an RMAN catalog is a prerequisite.

SCOPE & APPLICATION
~~~~~~~~~~~~~~~~~~~
This document is intended to provide an understanding of the way RMAN manages the redo log files when testing recovery concepts. This information can also be used by customers who intend to automate the recovery process. Please note that this article does not currently discuss backup/recovery concepts and does not supply DBAs with shell scripts or SQL scripts for automatic backup and recovery operations.
It is intended to clarify some typical error situations encountered during recovery, and to help DBAs decide how far they can go in automating this operation. This article concentrates primarily on the way the log files (archived and online redo logs) are managed with RMAN, the related errors during recovery, and the manual intervention needed to handle these common errors. It is assumed that the reader is familiar with RMAN and has solid recovery knowledge.

This article is laid out as follows:

Part I    Explanation for the need for manual intervention on recover

Part II   Case studies and error explanation

          The following scenarios and related errors are analyzed and
          explained in this article:

          Case 1: Some archived log files are NOT BACKED UP, NOT CATALOGED,
                  but are ON DISK

                  RMAN-03013: command type: recover
                  RMAN-20000: abnormal termination of job step
                  RMAN-06054: media recovery requesting unknown log: thread 1 scn 822898

          Case 2: Some archived log files are NOT BACKED UP, are CATALOGED,
                  but are NOT ON DISK

                  RMAN-03013: command type: recover
                  RMAN-06053: unable to perform media recovery because of missing log
                  RMAN-06025: no backup of log thread 1 seq 6 scn 843036 found to restore

          Case 3: Online redo logs that are not current are lost, but CATALOGED
                  archived logs with the same seq# are ON DISK.

          Case 4: Only the current redo log is lost.

Part III  Sample RMAN scripts for backup and recover used in the tests

Part IV   9i enhancements related to automatic recovery with RMAN


Part I    Explanation for the need for manual intervention on recover

How RMAN manages the archived log files and the implications for recovery
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This section also explains the RMAN commands RESYNC and CATALOG ARCHIVELOG.

During recovery, RMAN (up to version 8.1.7) does not scan the disk, as might be expected, to search automatically for unknown archived log files.
The destination of the archived log files is scanned only to identify the known archived log files. The following explains this mechanism: when archived log files are 'known' and when they are not.

RMAN relies on two information sources (repositories), which are used in the following order:

  1. the information recorded in the current controlfile
  2. the information recorded in the RMAN catalog at the time of media failure

The most up-to-date information about files belonging to the database is recorded automatically only in the CURRENT CONTROLFILE, immediately after log sequence X has been archived to disk. When archiving completes, a new record is added to the controlfile to log this action (we could say the controlfile 'knows' immediately about the archived log file). At the time the log is archived, the RMAN catalog has no 'knowledge' of this file, nor of any other files archived after it and available on disk.

The information is transferred from the CURRENT CONTROLFILE to the RMAN CATALOG only when the DBA starts RMAN, connects to the database and to the RMAN catalog, and runs the RMAN command 'RESYNC' (or any other RMAN command that does an implicit 'RESYNC'). No process in the database does this automatically.
The command for doing a manual complete RESYNC is:

eg: RMAN> resync catalog;

(for more information about the RESYNC command please see the documentation)

eg:
                          before next RESYNC         after RESYNC
arch seq#               : 3  4  5  6  7  8  9        3  4  5  6  7  8  9
on disk                 : |-----------------|        |-----------------|
recorded in controlfile : |-----------|-----|        |-----------------|
                                         |             information from controlfile
                                         V                       |
                              NOT CATALOGED ('unknown')          V  transferred to catalog
recorded in catalog     : |--------|                 |-----------------|
                              |                               |
                          CATALOGED                  CATALOGED ('known')

NOTE: Archived log files that are recorded in the catalog are called
      CATALOGED archived log files; only CATALOGED files are 'known' to RMAN.

This means there is only one scenario in which the CATALOG has the most up-to-date information about the log files archived to disk: RMAN is started and a RESYNC is done after every log switch! In all other situations, only the current database controlfile has current information about the log sequences that are archived on disk. This has implications for the recovery process when the current controlfile is lost.

RMAN limitations regarding automatic recovery of archived log files
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
When a crash happens between two RESYNC operations, some archived log files available on disk are NOT CATALOGED. If the CURRENT CONTROLFILE is also lost in the crash, it is no longer possible to transfer the information about the archived logs available on disk into the CATALOG using RESYNC (manual or implicit). A BACKUP CONTROLFILE does not have this current information. As already explained, RMAN releases before 9.0.0.0 do not scan the archive log destinations during recovery to search for 'unknown' archived log files and automatically 'catalog' the ones found. In this case, recovery will fail, and manual intervention is needed.
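The set of 'unknown' archived logs is simply the difference between what is on disk and what the catalog has recorded. The following is a minimal shell sketch of that comparison, not something supplied by the original note. The function name `uncataloged_seqs` is hypothetical, and the sample values are an assumption read from the diagram above (seq# 3 to 9 on disk, seq# 3 to 6 cataloged).

```shell
#!/bin/sh
# Sketch only: list sequence numbers that are on disk but not cataloged.
# Both arguments are whitespace-separated lists of log sequence numbers.
# uncataloged_seqs is a hypothetical helper, not an RMAN or Oracle command.
uncataloged_seqs() {
    on_disk=$1
    cataloged=$2
    for seq in $on_disk; do
        known=no
        for c in $cataloged; do
            if [ "$seq" -eq "$c" ]; then
                known=yes
                break
            fi
        done
        if [ "$known" = no ]; then
            printf '%s\n' "$seq"
        fi
    done
}

# Assumed values from the diagram: seq# 3-9 on disk, seq# 3-6 cataloged.
uncataloged_seqs "3 4 5 6 7 8 9" "3 4 5 6"
```

In a real pre-processing script, the first list would come from inspecting the archive destinations and the second from a query on the RMAN catalog.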
In addition to RESYNC, there is another RMAN command that records information about the archived log files available on disk in the RMAN CATALOG. This command is 'catalog archivelog'. DBAs need to run this command manually for each UNCATALOGED archived log file that is available on disk and needed for recovery. The following example illustrates this:

RMAN> catalog archivelog 'D:\ARC00007.001';

(for more information about the CATALOG command please see the documentation)

                      situation on disk after    after manual intervention
                      crash:                     RMAN> CATALOG archivelog ...
arch seq#           : 3  4  5  6  7  8  9        3  4  5  6  7  8  9
on disk             : |-----------|-----|        |-----------------|
recorded in catalog : |--------|     |           |-----------------|
                          |          |                    |
                      CATALOGED  NOT CATALOGED   CATALOGED ARCH FILES

RMAN RECOVERY stops at seq# 6 and errors
                      -------->|
after manual 'catalog archivelog'
RMAN RECOVERY can apply all archived log         ----------------->|
files available on disk if needed for recovery

Manual intervention is not needed when recovery is started using the CURRENT CONTROLFILE, because an implicit RESYNC is done on recover. In this way all 'unknown' archived log files available on disk are cataloged automatically.

It is important to understand that RMAN basically works only with CATALOGED ('known') log files. This differs somewhat from the way recovery is done with Server Manager, which is why this issue needs explaining.

NOTE: Always catalog all archived logs available on disk before starting
      recovery using a backup controlfile.

How RMAN manages the ONLINE log files and the implications for recovery
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Regarding the management of the online redo logs on recovery, RMAN can automate the process in some situations, more than Server Manager can. RMAN knows the location and the names of the online redo logs. This location is recorded in the catalog in the RC_REDO_LOG view.
As the online redo logs are reused in circular order, their sequence numbers change after each log switch. From the RMAN point of view, the seq# of the online redo logs needs to be known only in recovery situations. A RESYNC is needed ONLY if a new redo log member is added to the database. Thus, in recovery situations, you need to catalog online redo logs ONLY if a failure occurs after a new online redo log was added and no RESYNC was done in between. In this case the new redo log is completely 'unknown' to RMAN. Hence, manual intervention is needed in the same way as for the 'unknown' archived redo logs: the 'unknown' online redo logs needed for recovery must be cataloged. This is an exceptional case, but a very important one.

In some recovery situations, RMAN searches for the 'known' online redo logs in the log destination on disk and records the seq# of all redo logs found. In this way, RMAN 'knows' the seq# of the redo logs available on disk after a system failure and can pass them to the recovery process when the related sequence is needed. In other words, RMAN can catalog the available online redo logs automatically by auto-inspecting the log destination during recovery.

NOTE: This is not the way RMAN handles the archived redo logs at the moment.

Furthermore, if the online redo log searched for cannot be found on disk, but the archived redo log with the same seq# is cataloged and available on disk, RMAN is able to apply the archived log file instead of the missing redo log for the requested seq# (see illustrations below).

On the other hand, if an online redo log is not found on disk and no archived log file for this seq# exists, RMAN will report it as UNKNOWN. So a cataloged, known online redo log can become 'unknown' during recovery if it cannot be found on disk and no archived log exists that could replace it. This mainly happens when the CURRENT redo log file is lost and is requested for recovery.
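The selection rule described above (use the online log if it is on disk, otherwise the cataloged archived log with the same seq#, otherwise report the log as unknown) can be written down as a small decision function. This is only an illustrative shell sketch of the behaviour as this note describes it, not RMAN's actual implementation. The name `log_source` and the sample sequence numbers are hypothetical.

```shell
#!/bin/sh
# Sketch only: which copy of a log would be used for a requested seq#.
# Usage: log_source SEQ "ONLINE_SEQS_ON_DISK" "CATALOGED_ARCHIVED_SEQS_ON_DISK"
# log_source is a hypothetical helper that models the rule in the text.
log_source() {
    wanted=$1
    for s in $2; do
        if [ "$s" -eq "$wanted" ]; then
            echo online
            return 0
        fi
    done
    for s in $3; do
        if [ "$s" -eq "$wanted" ]; then
            echo archived
            return 0
        fi
    done
    echo unknown
}

# Hypothetical situation: online logs 9, 10 and 12 survive (11 is lost),
# archived logs 3 to 11 are cataloged and on disk.
for n in 10 11 12 13; do
    printf 'seq# %s -> %s\n' "$n" "$(log_source "$n" "9 10 12" "3 4 5 6 7 8 9 10 11")"
done
```

With these sample values, seq# 11 resolves to the archived copy and seq# 13 to 'unknown', which corresponds to the situation in which recovery stops.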
To show how RMAN handles the online redo logs on recovery, we need to analyze two situations: (1) recovery using the current controlfile and (2) recovery using a backup controlfile. Because online redo logs are mainly needed for complete recovery, we illustrate that situation. We assume that all ARCHIVED REDO LOGS are CATALOGED and available on disk.

situation on disk after crash

NOTE: For seq# 8 and 9 there are two versions of logs on disk:
      the online log and the archived log for each seq#

log seq#                 :  3  4  5  6  7  8  9 10 11 12
ARCHIVED seq# on disk    : |--------------------------|
ONLINE redo logs on disk :                |-------------|
                                                 V
                                         CURRENT LOG:seq#10
logs recorded in catalog : |--------------------------|
                                     |    |-------------|
                                     V           V
CATALOGED:                    ARCHIVED LOGS  ONLINE LOGS

RMAN limitations regarding automatic recovery of online redo log files
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The limitations in this area depend on the following situations:

COMPLETE RECOVERY using CURRENT CONTROLFILE

                                                3  4  5  6  7  8  9  10 11   seq# archived
applies all CATALOGED ARCH LOGS up to the last  ---------------->\
seq before the oldest ONLINE log, here seq# 8                     \  9  10    12   seq# online
then it requests all ONLINE logs                                   \------>|
                                                                           |
                                                                           V
IMPLICATION of this behaviour:                  online redo seq# 11 is lost
                                                so, recovery stops with an ERROR

In the case above, RMAN does not auto-inspect the log destination to search for the online redo logs and catalog their seq#. The recovery process always requests the online redo logs, not the archived logs with the same seq#. As a result, recovery is aborted when one of the online redo logs is not available, even if the archived version of this log exists and is 'known' to RMAN.
COMPLETE RECOVERY using BACKUP CONTROLFILE

                                                      3  4  5  6  7  8  9  10 11   seq# archived
applies all CATALOGED ARCH LOGS up to the last        ---------------->\        /--\
sequence before the oldest ONLINE log FOUND ON DISK,                    \------/    \->|
then switches to the available log with the                                9  10    12   seq# online
sequence choosing between online and archived logs                               |
                                                                                 V
IMPLICATION of this behaviour:                        online redo seq# 11 is lost
                                                      arch log seq# 11 is applied instead
                                                      recovery completes successfully

In this case, RMAN auto-inspects the log destination to search for the online redo logs and automatically catalogs the seq# of the logs found. If the online redo log searched for cannot be found on disk, but the archived redo log with the same seq# is cataloged and available on disk, RMAN is able to apply the archived log file instead of the missing redo log for the requested seq#. Here, RMAN automates recovery as far as possible and is more capable than Server Manager.

NOTE: Be aware that you can use RMAN's automatic handling of the online log
      files on recover only if you start recovery using a backup controlfile.

Situations that need to be handled and what to check to identify gaps in the log sequence
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Before starting recovery we need to evaluate the situation of the log files available on disk after the system failure. The following general situations and related common RMAN errors can occur.

NOTE: We do not need to be concerned with the backed up files.
      They can be restored from the backup.

eg: 1.

log seq#     : 3  4  5  6  7  8  9  10  11  12  13  14  15  16  17
ON DISK:
 ARCH        : |---------------------------------------------|
 ONLINE logs :                                          |--------|
BACKED UP    : |-----|
CATALOGED    : |---------------------|                  |--------|
                                        |<------------------>|
                                                  |
                              NOT BACKED UP, NOT CATALOGED, ON DISK
                                                  |
                                                  V
These are 'unknown' logs and RMAN cannot recover them; it fails with:

   RMAN-20000: abnormal termination of job step
   RMAN-06054: media recovery requesting unknown log

SOLUTION: CATALOG archived logs from seq# 11 to seq# 16
          Start COMPLETE RECOVERY

eg: 2.

log seq#     : 3  4  5  6  7  8  9  10  11  12  13  14  15  16  17
ON DISK:
 ARCH        : |-----|GAP on disk|---------------------------|
 ONLINE logs :                                          |--------|
BACKED UP    : |-----|
CATALOGED    : |-------------------------------------|  |--------|
                        |<--->|
                           |
               NOT BACKED UP, CATALOGED, NOT ON DISK
                           |
                           V
These logs are 'missing'. RMAN searches for a backup of them and, as none can be found, recovery fails with:

   RMAN-06053: unable to perform media recovery because of missing log
   RMAN-06025: no backup of log thread 1 seq 6 scn 843036 found to restore

NOTE: RMAN can identify gaps in the sequence of the 'known' archived log files

SOLUTION: In this case we have a gap in the seq# of the archived log files.
          The state of the archived log files after that first gap is
          irrelevant, because we cannot recover past it.
          Start INCOMPLETE RECOVERY UNTIL the first missing SEQUENCE
          (seq# 6 in this diagram)

If you want to automate recovery, you first need to evaluate the situation on disk. You need to search for gaps in the sequence of the archived log files and online log files before starting recovery.

eg: 3.

archived logs : in backup on disk
                |-----|  |----------------------------------------|
log seq#      : 3  4  5  6  7  8  9  10  11  12      14  15      17
                                                 13  14      16
                                                 |------------|  online redo logs
                                                             V
                                                          current
                         |--------------------|  |----------------|
                       |            |                    |
                       V            V                    V
Search for gaps:   gap here?  gaps in this area?  gap in this area?

You need to check whether at least one version (arch log file or redo log file) exists for each seq#. If there is one sequence where both are lost, you have a gap.

Gap found:
   1. catalog all archived logs up to the gap
   2. start incomplete recovery with or without backup controlfile

      at seq 11 --- recover until seq 11
      at seq#15 --- recover until seq 15
      at seq#17 --- recover until seq 17   (all current logs lost)

No gap found, but some online logs are lost:
   1. first catalog all archived logs on disk
   2. start complete recovery using backup controlfile

   As discussed above, RMAN takes the archived log if no online log is
   found for the needed seq#. In this way you can use RMAN's automatic
   handling in this situation.
   NOTE: The current controlfile needs to be saved first.

Identifying the gaps means you need to find the sequence numbers of the archived logs in the backup, the sequence numbers of the archived logs on disk, and the sequence numbers of the redo log files that were current before the media failure occurred.

   - The sequence numbers of the BACKED UP archived log files can be found
     in the RMAN catalog.
   - The sequence numbers of the CATALOGED archived log files can also be
     found in the RMAN catalog.
   - The sequence numbers of the archived log files on disk are retrieved by
     inspecting all archived log destinations.
   - The available online redo log files are retrieved by inspecting all log
     file destinations, and the related sequences can be found by querying
     the controlfile views or from the alert log.

If the current controlfile is available you can mount it and join V$LOG and V$LOGFILE to find out the sequences of the online log files. If you have lost the current controlfile you cannot query the database before recovery.
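The gap search described above can be sketched in a few lines of shell: merge every sequence number for which at least one version exists (backup, archived on disk, or online) and report the first sequence in the range that has none. This is a minimal sketch under stated assumptions, not a script from the original note. The function name `first_gap` and the sample inventory are hypothetical, and the sketch assumes the archived logs on disk have already been cataloged.

```shell
#!/bin/sh
# Sketch only: find the first log sequence for which no version exists.
# Usage: first_gap FIRST LAST "AVAILABLE_SEQS"
# Prints the first missing seq# in FIRST..LAST, or nothing if there is no gap.
# first_gap is a hypothetical helper, not an RMAN or Oracle command.
first_gap() {
    n=$1
    last=$2
    while [ "$n" -le "$last" ]; do
        present=no
        for s in $3; do
            if [ "$s" -eq "$n" ]; then
                present=yes
                break
            fi
        done
        if [ "$present" = no ]; then
            echo "$n"
            return 0
        fi
        n=$((n + 1))
    done
}

# Hypothetical inventory (not taken from the cases in this article):
backed_up="3 4 5"
archived_on_disk="6 7 8 9 10 12 13"
online_on_disk="14 15 16"

gap=$(first_gap 3 16 "$backed_up $archived_on_disk $online_on_disk")
if [ -n "$gap" ]; then
    echo "gap at seq# $gap: incomplete recovery until seq $gap"
else
    echo "no gap: complete recovery is possible"
fi
```

The three lists correspond to the three sources named in the text; how they are collected (catalog queries, directory listings, alert log) is left to the surrounding script.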
Without the current controlfile, the only way to get the sequence numbers of the online logs is to scan the alert log from the bottom up to find the last group of log switches (in order to see the last completed log switch) for all members of the redo log groups you have.

eg: 4. example of entries if you have 3 redo log groups (one member in each group)

    Thread 1 opened at log sequence 20
      Current log# 1 seq# 20 mem# 0: D:\816\ORADATA\ORA816\REDO03.LOG
    ...                                ---> oldest online redo log: seq# 20
    Tue Mar 27 20:35:59 2001
    Thread 1 advanced to log sequence 21
      Current log# 2 seq# 21 mem# 0: D:\816\ORADATA\ORA816\REDO02.LOG
    ...                                ---> next online redo log: seq# 21
    Tue Mar 27 20:36:15 2001
    Thread 1 advanced to log sequence 22
      Current log# 3 seq# 22 mem# 0: D:\816\ORADATA\ORA816\REDO01.LOG
                                       ---> CURRENT redo log: seq# 22
    Tue Mar 27 20:36:15 2001
    ARCH: Beginning to archive log# 2 seq# 21
    ARCH: Completed archiving log# 2 seq# 21
                                       ---> last ARCHIVED log: seq# 21

    ARCHIVED logs    : 15 16 17 18 19 20 21    seq#
                       -------------------|
    ONLINE redo logs :                20 21 22
                                      |------|
                                      V  |   V
        oldest redo <= REDO03.LOG        |   REDO01.LOG => CURRENT
                                         V
                          next redo <= REDO02.LOG

@ Another possibility is to scan the redo log file headers, but this is not intended for customers.


Part II   Case analysis and error explanation

This section reproduces the error situations described in Part I, using worked examples. We use simple queries on the RMAN catalog and inspect the log file destinations using OS commands to evaluate the gaps in the sequence numbers. All steps are performed manually.

The steps performed in the analysis of each case are:

1. Collect the information you need about the archived and online redo log files.
   1.1 Find the database ID and the current database INCARNATION
       (needed to scan the catalog)
   1.2 Find the seq# of the CATALOGED archived log files
   1.3 Find the seq# of the BACKED UP archived log files
   1.4 Find the seq# of the archived log files available on disk
   1.5 Find the seq# of the online redo logs available on disk
2. Evaluate the collected information
3. Explain the errors on recovery and interpret the RMAN errorstack
4. Handle the reproduced error accordingly
5. Present the solutions for the analyzed case

Case 1
======
Some archived log files are NOT BACKED UP, NOT CATALOGED, but are ON DISK

Case 2
======
Some archived log files are NOT BACKED UP, are CATALOGED, but are NOT ON DISK

Case 3
======
Online redo logs that are not current are lost, but CATALOGED archived logs with the same seq# are ON DISK.

Case 4
======
Only the current redo log is lost.


Case 1
======
Some archived log files are NOT BACKED UP, are NOT CATALOGED, but are available ON DISK

This situation occurs primarily when you lose the current controlfile. During recovery using a backup controlfile the following errors can be raised:

   RMAN-03013: command type: recover
   RMAN-20000: abnormal termination of job step
   RMAN-06054: media recovery requesting unknown log: thread 1 scn 822898

Below is the worked example that explains the situation behind this error and how to evaluate and solve it.

1. Collect the information you need to evaluate the situation

1.1 Find the database ID and the current database INCARNATION

Query RMAN catalog:

svrmgrl>select * from rc_database_incarnation;

DB_KEY     DBID       DBINC_KEY  NAME     RESETLOGS_ RESETLOGS CUR PARENT_DBI
---------- ---------- ---------- -------- ---------- --------- --- ----------
         1 1519956463         12 UNKNOWN      782197 14-DEC-00 NO
         1 1519956463          2 ORA816       782306 14-DEC-00 YES

We only have one database registered in the RMAN catalog.
The current incarnation is DBINC_KEY = 2

1.2 Find the seq# of the CATALOGED archived log files

Query the RMAN catalog:

svrmgrl> select i.DBID,a.DB_KEY,a.DBINC_KEY,a.DB_NAME,SEQUENCE#,a.FIRST_CHANGE#,
                a.NEXT_CHANGE#,a.COMPLETION_TIME,a.STATUS
         from RC_ARCHIVED_LOG a, rc_database_incarnation i
         where a.DBINC_KEY = i.DBINC_KEY
           and i.CURRENT_INCARNATION='YES'
           and i.DBID=1519956463
         order by SEQUENCE#;

DB_KEY     DBINC_KEY  DB_NAME  SEQUENCE#  FIRST_CHAN NEXT_CHANG COMPLETIO
---------- ---------- -------- ---------- ---------- ---------- ---------
         1          2 ORA816           13     802821     802825 27-MAR-01
         1          2 ORA816           14     802825     802828 27-MAR-01
         1          2 ORA816           15     802828     802831 27-MAR-01
         1          2 ORA816           16     802831     802834 27-MAR-01
         1          2 ORA816           17     802834     802874 27-MAR-01
         1          2 ORA816           18     802874     822898 27-MAR-01

The last cataloged archivelog has seq# 18

1.3 Find the seq# of the BACKED UP archived log files
    (we assume that all archived log files needed to make the last backup
    consistent were backed up)

Query RMAN catalog:

svrmgrl> select i.DBID,b.DB_KEY,b.DBINC_KEY,b.DB_NAME,SEQUENCE#,b.FIRST_CHANGE#,
                b.NEXT_CHANGE#,b.COMPLETION_TIME,b.STATUS
         from RC_BACKUP_REDOLOG b, rc_database_incarnation i
         where b.DBINC_KEY = i.DBINC_KEY
           and i.CURRENT_INCARNATION='YES'
           and i.DBID=1519956463
         order by SEQUENCE#;

DB_KEY     DBINC_KEY  DB_NAME  SEQUENCE#  FIRST_CHAN NEXT_CHANG COMPLETIO
---------- ---------- -------- ---------- ---------- ---------- ---------
         1          2 ORA816            1     782306     802355 26-FEB-01
         1          2 ORA816            2     802355     802429 26-FEB-01
         1          2 ORA816            3     802429     802431 26-FEB-01
         1          2 ORA816            4     802431     802483 26-FEB-01
         1          2 ORA816            5     802483     802488 26-FEB-01
         1          2 ORA816            6     802488     802787 26-MAR-01
         1          2 ORA816            7     802787     802789 26-MAR-01
         1          2 ORA816            8     802789     802794 26-MAR-01
         1          2 ORA816            9     802794     802812 27-MAR-01
         1          2 ORA816           10     802812     802817 27-MAR-01
         1          2 ORA816           11     802817     802819 27-MAR-01
         1          2 ORA816           12     802819     802821 27-MAR-01
         1          2 ORA816           13     802821     802825 27-MAR-01
         1          2 ORA816           14     802825     802828 27-MAR-01
         1          2 ORA816           15     802828     802831 27-MAR-01

The last backed up archivelog has seq# 15

1.4 Find the seq# of the archived log files available on disk

Inspect all archived log destinations and search for possible gaps in the sequence numbers

D:\816\ORADATA\ora816\archive>ls -lrt
-rw-rw-rw-   1 user     group        1024 Mar 27 19:26 ARC00013.001  --> first seq# on disk
-rw-rw-rw-   1 user     group        1024 Mar 27 19:26 ARC00014.001
-rw-rw-rw-   1 user     group        1024 Mar 27 19:26 ARC00015.001  --> last seq# backed up
-rw-rw-rw-   1 user     group        1024 Mar 27 19:02 ARC00016.001
-rw-rw-rw-   1 user     group       17408 Mar 27 19:18 ARC00017.001
-rw-rw-rw-   1 user     group       19968 Mar 27 19:27 ARC00018.001
-rw-rw-rw-   1 user     group       20480 Mar 27 20:35 ARC00019.001
-rw-rw-rw-   1 user     group       16896 Mar 27 20:36 ARC00020.001
-rw-rw-rw-   1 user     group        2048 Mar 27 20:36 ARC00021.001

==> the first seq# on disk is seq# 13
==> the last seq# on disk is seq# 21
==> there are no gaps in the sequence numbers on disk
==> last backed up seq# was 15
==> there are no gaps in the seq# between last backed up seq# and first seq# on disk

1.5 Find the seq# of the online redo logs available on disk

Inspect all online redo log destinations

D:\816\ORADATA\ora816>ls -l|grep REDO
-rw-rw-rw-   1 user     group     1049088 Mar 27 20:36 REDO01.LOG
-rw-rw-rw-   1 user     group     1049088 Mar 27 20:36 REDO02.LOG
-rw-rw-rw-   1 user     group     1049088 Mar 27 20:36 REDO03.LOG

We have 3 redo log groups (each having one member)

==> all redo log members can be found on disk
==> if the last sequence number of the archived logs available on disk was
    the last one successfully archived before the crash, then the seq# of
    the current logfile should be seq# 22
==> we check the alert log file as described in Part I

    REDO03.LOG -> seq# 20
    REDO02.LOG -> seq# 21
    REDO01.LOG -> seq# 22

==> COMPLETE RECOVERY CAN BE DONE!!!

2. Evaluate this information to be able to understand the recovery errors

Above we found the following situation on disk after the crash:

* last backed up seq# was 15
* last cataloged archivelog was seq# 18
* last archived log on disk was seq# 21
* no gaps found
* evaluation:
* seq# 13 14 15                    -> in backupset (RC_BACKUP_REDOLOG)
* seq# 13 14 15 16 17 18           -> CATALOGED archived log (RC_ARCHIVED_LOG)
* seq# 13 14 15 16 17 18 19 20 21  -> archived on disk
* seq#                      20 21 22 -> CATALOGED online redo logs on disk (RC_REDO_LOG)
*                        ^^ sequence 19 is UNKNOWN to RMAN
*                        ^^ sequence 19 is on disk, but not CATALOGED and NOT BACKED UP
*                        ^^ sequence 19 is recorded only in the current controlfile

3. Explain the errors on recovery and interpret the RMAN errorstack

In this scenario, a complete recovery is possible because all data needed is available on disk or can be restored from the backup. But RMAN can do this without manual intervention only if recovery is started using the current controlfile. If the current controlfile was lost in the crash, you need to restore a backup controlfile. If you start complete recovery using a backup controlfile, RMAN fails with the following error:

RMAN>run {
       allocate channel d1 type disk;
       restore controlfile;
       restore database;
       sql 'alter database mount';
       recover database;
       sql 'alter database open resetlogs';
     }

**** interpreting the RMAN errorstack (read from bottom up)
**** interpreting the related message before the stack and the real error in the stack:
****
RMAN-08060: unable to find archivelog
RMAN-08510: archivelog thread=1 sequence=19
                                ^^^^^^^^^^^
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-20000: abnormal termination of job step
RMAN-06054: media recovery requesting unknown log: thread 1 scn 822898
                                      ^^^^^^^
* ^^ sequence 19 is UNKNOWN to RMAN
* ^^ sequence 19 is on disk, but not CATALOGED and NOT BACKED UP
* ^^ sequence 19 was recorded only in the current controlfile

The scn 822898 in RMAN-06054 is the NEXT_CHANGE# of the last cataloged log, seq# 18 (see step 1.2), i.e. the first change that belongs to seq# 19. RMAN recovered up to the 'known' log seq# 18 and then raised the error.
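Given the last cataloged sequence number and the file names found in the archive destination, the missing 'catalog archivelog' commands can be generated mechanically. The following shell sketch illustrates the idea and is not taken from the original note. The function names `seq_from_name` and `catalog_commands` are hypothetical, and the sketch assumes the ARCnnnnn.001 naming seen in the listing in step 1.4, which depends on the archive format configured for the database. It emits a command for every file newer than seq# 18, in line with the earlier NOTE to catalog all archived logs on disk before recovering with a backup controlfile.

```shell
#!/bin/sh
# Sketch only: emit RMAN 'catalog archivelog' commands for archived logs
# whose sequence number is higher than the last cataloged one.
# Assumes file names of the form ARCnnnnn.001 (as in the listing above).

# seq_from_name FILE: print the sequence number encoded in the file name.
# Hypothetical helper.
seq_from_name() {
    name=${1#ARC}                                   # drop the "ARC" prefix
    name=${name%.001}                               # drop the ".001" suffix
    name=$(printf '%s' "$name" | sed 's/^0*//')     # drop leading zeros
    printf '%s\n' "${name:-0}"
}

# catalog_commands DIR LAST_CATALOGED_SEQ FILE...
# Hypothetical helper; DIR is used verbatim as a Windows-style path prefix.
catalog_commands() {
    dir=$1
    last=$2
    shift 2
    for f in "$@"; do
        if [ "$(seq_from_name "$f")" -gt "$last" ]; then
            printf "catalog archivelog '%s';\n" "$dir\\$f"
        fi
    done
}

# Values from this case: last cataloged seq# is 18, files up to seq# 21 on disk.
catalog_commands 'D:\816\ORADATA\ORA816\ARCHIVE' 18 \
    ARC00017.001 ARC00018.001 ARC00019.001 ARC00020.001 ARC00021.001
```

The generated lines could be written into the run block of a recovery script between the mount and the recover step.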
Check the alert log to see how far recovery progressed.

4. Handle the reproduced error accordingly

As described in Part I, manual intervention is required if you use a backup controlfile. You need to catalog the 'unknown' archived log files available on disk:

RMAN> alter database mount;
RMAN> catalog archivelog 'D:\816\ORADATA\ORA816\ARCHIVE\ARC00019.001';

NOTE: The backup controlfile needs to be mounted to be able to catalog the
      archived log files. Remember that every action on any file needs to be
      recorded in the controlfile first.

After this operation you can restart the recovery step.

5. Solutions for Case 1.

The solution depends on these 2 situations:

CURRENT CONTROLFILE USED:
   ==> RMAN can handle this situation automatically

   Start complete recovery using current controlfile

BACKUP CONTROLFILE USED:
   ==> RMAN up to 8.1.7 cannot handle this situation automatically - see Part IV
   ==> The DBA has to automate the process by pre-processing RMAN recovery

   Start a customized complete recovery using backup controlfile

   run {
     allocate channel d1 type disk;
     restore controlfile;
     restore database;
     sql 'alter database mount';
     catalog archivelog 'D:\816\ORADATA\ORA816\ARCHIVE\ARC00019.001'; # dynamically coded
     recover database;
     sql 'alter database open resetlogs';
   }

   This script could be created dynamically via an OS shell script.


Case 2
======
Some archived log files are NOT BACKED UP, are CATALOGED, but are NOT ON DISK

Errors can occur regardless of whether we use a backup controlfile or the current controlfile on recover.

   RMAN-03013: command type: recover
   RMAN-06053: unable to perform media recovery because of missing log
   RMAN-06025: no backup of log thread 1 seq 6 scn 843036 found to restore

Below is the worked example that explains the situation behind this error and how to evaluate and resolve it.

1. Collect the information you need to evaluate the situation

1.1 Find the database ID and the current database INCARNATION

Query RMAN catalog:

svrmgrl>select * from rc_database_incarnation;

DB_KEY     DBID       DBINC_KEY  NAME     RESETLOGS_ RESETLOGS CUR PARENT_DBI
---------- ---------- ---------- -------- ---------- --------- --- ----------
         1 1519956463          2 ORA816       782306 14-DEC-00 NO
         1 1519956463         12 UNKNOWN      782197 14-DEC-00 NO
         1 1519956463        258 ORA816       842983 27-MAR-01 YES          2

We only have one database registered in the RMAN catalog.
The current incarnation is DBINC_KEY = 258

1.2 Find the seq# of the CATALOGED archived log files

Query RMAN catalog:

svrmgrl> select i.DBID,a.DB_KEY,a.DBINC_KEY,a.DB_NAME,SEQUENCE#,a.FIRST_CHANGE#,
                a.NEXT_CHANGE#,a.COMPLETION_TIME,a.STATUS
         from RC_ARCHIVED_LOG a, rc_database_incarnation i
         where a.DBINC_KEY = i.DBINC_KEY
           and i.CURRENT_INCARNATION='YES'
           and i.DBID=1519956463
         order by SEQUENCE#;

DBID       DB_KEY     DBINC_KEY  DB_NAME  SEQUENCE#  FIRST_CHAN NEXT_CHANG COMPLETIO S
---------- ---------- ---------- -------- ---------- ---------- ---------- --------- -
1519956463          1        258 ORA816            1     842983     843024 28-MAR-01 A
1519956463          1        258 ORA816            2     843024     843027 28-MAR-01 A
1519956463          1        258 ORA816            3     843027     843032 28-MAR-01 A
1519956463          1        258 ORA816            4     843032     843034 28-MAR-01 A
1519956463          1        258 ORA816            5     843034     843036 28-MAR-01 A
1519956463          1        258 ORA816            6     843036     843038 28-MAR-01 A
1519956463          1        258 ORA816            7     843038     843040 28-MAR-01 A
1519956463          1        258 ORA816            8     843040     843042 28-MAR-01 A

The last cataloged archivelog has seq# 8

1.3 Find the seq# of the BACKED UP archived log files
    (we assume that all archived log files needed to make the last backup
    consistent were backed up)

Query RMAN catalog:

svrmgrl> select i.DBID,b.DB_KEY,b.DBINC_KEY,b.DB_NAME,SEQUENCE#,b.FIRST_CHANGE#,
                b.NEXT_CHANGE#,b.COMPLETION_TIME,b.STATUS
         from RC_BACKUP_REDOLOG b, rc_database_incarnation i
         where b.DBINC_KEY = i.DBINC_KEY
           and i.CURRENT_INCARNATION='YES'
           and i.DBID=1519956463
         order by SEQUENCE#;

DBID       DB_KEY     DBINC_KEY  DB_NAME  SEQUENCE#  FIRST_CHAN NEXT_CHANG COMPLETIO S
---------- ---------- ---------- -------- ---------- ---------- ---------- --------- -
1519956463          1        258 ORA816            1     842983     843024 27-MAR-01 A
1519956463          1        258 ORA816            2     843024     843027 27-MAR-01 A

The last backed up archivelog has seq# 2

1.4 Find the seq# of the archived log files available on disk

Inspect all archived log destinations and search for possible gaps in the sequence numbers

D:\816\ORADATA\ora816\archive>ls -lrt|grep ARC
-rw-rw-rw-   1 user     group        1024 Mar 27 21:53 ARC00003.001
-rw-rw-rw-   1 user     group        1024 Mar 27 21:53 ARC00004.001
-rw-rw-rw-   1 user     group        1024 Mar 27 21:53 ARC00005.001
                                                       -> ARC00006.001 lost
-rw-rw-rw-   1 user     group        1024 Mar 27 21:53 ARC00007.001
-rw-rw-rw-   1 user     group        1024 Mar 27 21:53 ARC00008.001

In this example:

==> the first seq# on disk is seq# 3
==> the last seq# on disk is seq# 8
==> the first seq# missing on disk is seq# 6
==> there are gaps in the sequence numbers on disk
==> last backed up seq# was 2
==> there are no gaps in the seq# between last backed up seq# and first seq# on disk
==> the missing seq# 6 is not in the backup
==> to see if there is a real gap that cannot be skipped on recover we need
    to first check the seq# of the oldest online redo log:
    if the seq# of the oldest online redo log is <= seq# 6, then this is not
    a real gap, because the online redo logs can be applied, if available!
    ==> we still need to do the next step

1.5 Find the seq# of the online redo logs available on disk

    Inspect all online redo log destinations:

    D:\816\ORADATA\ora816>ls -l|grep REDO
    -rw-rw-rw- 1 user group 1049088 Mar 27 21:53 REDO01.LOG
    -rw-rw-rw- 1 user group 1049088 Mar 27 21:53 REDO02.LOG
    -rw-rw-rw- 1 user group 1049088 Mar 27 21:53 REDO03.LOG

    Here, we have 3 redo log groups (each having one member)
    ==> all redo log members can be found on disk
    ==> we check the alert log file as described in Part I, and find the
        following:

        REDO03.LOG -> seq# 7
        REDO02.LOG -> seq# 8
        REDO01.LOG -> seq# 9 --> current

    ==> now we compare the seq# of the missing archived log with the seq# of
        the oldest redo log found on disk:
        seq# of the oldest online redo log is 7
        seq# of the lost archived log is 6
    ==> the seq# of the oldest online redo log is greater than that of the
        missing log

    Only at this point can we say that we found a real gap that cannot be
    skipped on recovery.
    ==> COMPLETE RECOVERY CANNOT BE DONE!!!
    ==> the last seq# that can be applied is seq# 5

2. Evaluate this information to be able to understand the recovery errors

    As listed above, we found the following situation on disk after the crash:

    * last backed up seq# was 2
    * last cataloged archivelog was seq# 8
    * last archived log on disk before the gap was seq# 5
    * oldest online redo log on disk was 7
    * gap found
    * evaluation:
    * seq# 1 2                -> in backupset (RC_BACKUP_REDOLOG)
    * seq# 1 2 3 4 5 6 7 8    -> CATALOGED archived log (RC_ARCHIVED_LOG)
    * seq#     3 4 5   7 8    -> archived on disk
    * seq#           | 7 8 9  -> CATALOGED online redo logs on disk (RC_REDO_LOG)
    *                V
    *             real gap
    * seq#           6        -> archived, CATALOGED, BUT MISSING (lost)
    *                ^^ sequence 6 is KNOWN to RMAN
    *                ^^ sequence 6 was archived on disk, was CATALOGED but was
    *                   NOT BACKED UP
    *                ^^ sequence 6 IS LOST and is the GAP in the sequence numbers

3. Explain the errors on recovery and interpret the RMAN errorstack

    A complete recovery is NOT possible, and the reason is not related to
    RMAN limitations.
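The evaluation in steps 1.3 to 1.5 and in section 2 comes down to one question: is there a sequence number that no backup, no archived log on disk and no online redo log on disk can supply? The following is a minimal Python sketch of that check, not part of the original tests. The function name `first_real_gap` is hypothetical, a single redo thread is assumed, and the three lists are assumed to have been collected by hand as described above.

```python
from typing import Iterable, Optional


def first_real_gap(backed_up: Iterable[int],
                   archived_on_disk: Iterable[int],
                   online_on_disk: Iterable[int]) -> Optional[int]:
    """Return the first seq# that no available log covers, or None.

    Hypothetical helper: assumes one redo thread and that the inputs were
    gathered as in steps 1.3 (backup), 1.4 (archive dest) and 1.5 (online).
    """
    covered = set(backed_up) | set(archived_on_disk) | set(online_on_disk)
    if not covered:
        return None
    for seq in range(min(covered), max(covered) + 1):
        if seq not in covered:
            return seq  # nothing can supply this sequence: a real gap
    return None


# Values taken from the Case 2 example.
gap = first_real_gap(backed_up=[1, 2],
                     archived_on_disk=[3, 4, 5, 7, 8],
                     online_on_disk=[7, 8, 9])
print("first real gap:", gap)                     # 6
print("last seq# that can be applied:", gap - 1)  # 5
```

If the oldest online redo log on disk had been seq# 6 instead of 7, the function would return None, which matches the rule that a missing archived log is not a real gap when an online redo log with the same seq# is still available.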
    The situation is handled in the same way whether or not the current
    controlfile is lost. If you start complete recovery, RMAN identifies the
    gap and fails with the following error. We reproduce this using a backup
    controlfile, but as mentioned before, the same error is raised using the
    current controlfile.

    RMAN> run {
            allocate channel d1 type disk;
            restore controlfile;
            restore database;
            sql 'alter database mount';
            recover database;
            sql 'alter database open resetlogs';
          }

    **** interpreting the RMAN errorstack (read from bottom up) ****

    RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
    RMAN-03013: command type: recover(4)
    RMAN-06053: unable to perform media recovery because of missing log
                                                            ^^^^^^^^^^
    RMAN-06025: no backup of log thread 1 seq 6 scn 843036 found to restore
    ****                                  ^^^^^^^
    **** ^^ sequence 6 is known to RMAN, this is why RMAN can identify the gap
    ****    and reports the file as 'missing'
    **** ^^ sequence 6 was archived on disk, was CATALOGED, but NOT BACKED UP
    **** ^^ RMAN searches for a backup for seq# 6 but cannot find any

    We check the alert log for errors:

    ORA-279 signalled during: alter database recover if needed start using back...
    Wed Mar 28 10:52:26 2001
    alter database recover cancel
    Recovery was cancelled.

    RMAN inspected the disk to see if the known archived logs are available,
    identified the gap at this step, went to the backup to check whether the
    log could be restored, and because no backup was found for this sequence,
    aborted recovery.

4. Handle the reproduced error accordingly

    The only way to handle this is to start INCOMPLETE recovery until the
    missing sequence number. We do not need to 'catalog archived logs'
    because all archived log files up to the missing sequence are already
    known to RMAN.

5. Solutions for Case 2.

    We have the same solution independent of the controlfile used (current or
    backup controlfile).
    The following example uses a backup controlfile, and with the 'set until'
    clause directs RMAN to start incomplete recovery. The last log seq#
    applied will be 5:

    run {
      SET UNTIL logseq = 6 thread = 1;    # dynamic coded
      allocate channel d1 type disk;
      restore controlfile;
      restore database;
      sql 'alter database mount';
      recover database;
      sql 'alter database open resetlogs';
    }

    This script could be created dynamically via an OS shell script.

Case 3
======

Online redo logs that are not current are lost, but CATALOGED archived logs
with the same seq# are ON DISK.

We can get the following errors only if we use the current controlfile:

    RMAN-11001: Oracle Error: ORA-00283: recovery session canceled due to errors
    ORA-00313: open failed for members of log group 3 of thread 1
    ORA-00312: online log 3 thread 1: 'D:\816\ORADATA\ORA816\REDO01.LOG'
    ORA-27041: unable to open file
    OSD-04002: unable to open file

Below is the worked example that explains the situation for this error, and
how to evaluate and resolve it.

1. Collect the information you need to evaluate the situation

1.1 Find the database ID and the current database INCARNATION

    Query the RMAN catalog:

    SVRMGR> select * from rc_database_incarnation;

        DB_KEY       DBID  DBINC_KEY NAME     RESETLOGS_ RESETLOGS CUR PARENT_DBI
    ---------- ---------- ---------- -------- ---------- --------- --- ----------
             1 1519956463          2 ORA816       782306 14-DEC-00 NO
             1 1519956463         12 UNKNOWN      782197 14-DEC-00 NO
             1 1519956463        258 ORA816       842983 27-MAR-01 NO           2
             1 1519956463        364 ORA816       843037 28-MAR-01 NO         258
             1 1519956463        423 ORA816       843119 30-MAR-01 NO         364
             1 1519956463        474 ORA816       843170 30-MAR-01 NO         423
             1 1519956463        535 ORA816       863254 30-MAR-01 NO         474
             1 1519956463        565 ORA816       883341 30-MAR-01 YES        535

    We only have one database registered in the RMAN catalog.
    The current incarnation is DBINC_KEY = 565.

1.2 Find the seq# of the CATALOGED archived log files

    This is not needed because we assume all archived logs on disk are
    cataloged.
    We should always catalog all archived logs from disk up to the gap (if
    any) to simplify the process of automatization. The errors raised when
    this is not done are as described in Case 1.

1.3 Find the seq# of the BACKED UP archived log files
    (we assume that all archived log files needed to make the last backup
    consistent were backed up)

    Query the RMAN catalog:

    svrmgrl> select i.DBID, b.DB_KEY, b.DBINC_KEY, b.DB_NAME, SEQUENCE#,
                    b.FIRST_CHANGE#, b.NEXT_CHANGE#, b.COMPLETION_TIME, b.STATUS
             from   RC_BACKUP_REDOLOG b, rc_database_incarnation i
             where  b.DBINC_KEY = i.DBINC_KEY
             and    b.DBINC_KEY = 565
             and    i.DBID = 1519956463
             order by SEQUENCE#;

          DBID     DB_KEY  DBINC_KEY DB_NAME   SEQUENCE# FIRST_CHAN NEXT_CHANG COMPLETIO S
    ---------- ---------- ---------- -------- ---------- ---------- ---------- --------- -
    1519956463          1        565 ORA816            1     883341     883379 31-MAR-01 A
    1519956463          1        565 ORA816            2     883379     883399 31-MAR-01 A
    1519956463          1        565 ORA816            3     883399     883401 31-MAR-01 A
    1519956463          1        565 ORA816            4     883401     883403 31-MAR-01 A
    1519956463          1        565 ORA816            5     883403     883405 31-MAR-01 A
    1519956463          1        565 ORA816            6     883405     883407 31-MAR-01 A
    1519956463          1        565 ORA816            7     883407     883409 31-MAR-01 A
    1519956463          1        565 ORA816            8     883409     903411 31-MAR-01 A
    1519956463          1        565 ORA816            9     903411     903450 31-MAR-01 A

    The last backed up archivelog has seq# 9.
1.4 Find the seq# of the archived log files available on disk

    Inspect all archived log destinations and search for possible gaps in
    the sequence numbers:

    D:\816\ORADATA\ora816\archive>ls -lrt|grep ARC
    -rw-rw-rw- 1 user group 1024 Mar 31 20:08 ARC00010.001
    -rw-rw-rw- 1 user group 1024 Mar 31 20:08 ARC00011.001
    -rw-rw-rw- 1 user group 1024 Mar 31 20:08 ARC00012.001
    -rw-rw-rw- 1 user group 1024 Mar 31 20:11 ARC00013.001
    -rw-rw-rw- 1 user group 1024 Mar 31 20:11 ARC00014.001

    In this example:
    ==> the first seq# on disk is seq# 10
    ==> the last seq# on disk is seq# 14
    ==> there are no gaps in the sequence numbers on disk
    ==> the last backed up seq# was 9
    ==> there are no gaps in the seq# between the last backed up seq# and the
        first seq# on disk
    ==> we need to check if online redo logs are missing

1.5 Find the seq# of the online redo logs available on disk

    Inspect all online redo log destinations:

    D:\816\ORADATA\ora816>ls -l|grep REDO
    -rw-rw-rw- 1 user group 1049088 Mar 27 21:53 REDO01.LOG
    -rw-rw-rw- 1 user group 1049088 Mar 27 21:53 REDO03.LOG

    We have 3 redo log groups (each having one member)
    ==> REDO02.LOG is missing
    ==> we check the alert log as described in Part I, and find the seq# for
        the redo logs:

        REDO03.LOG -> seq# 14
        REDO02.LOG -> seq# 13 --> oldest redo log, cannot be found on disk
        REDO01.LOG -> seq# 15 --> the current log is on disk

    ==> we check if we have gaps between the oldest available redo log and
        the last archived log:
        the seq# of the last archived log is 14 and is greater than the seq#
        of the oldest online redo log
    ==> we have no gaps here
    ==> for each missing online redo log we check if an archived log with the
        same seq# exists:
        for seq# 13 there is no redo log available on disk, but the archived
        log with the same seq# is available on disk
    ==> the current log is also found on disk
    ==> we have no real gaps in the seq#
    ==> COMPLETE RECOVERY CAN BE DONE!!!

2. Evaluate this information to be able to understand the recovery errors

    In the example listed above we found the following situation on disk
    after the crash:

    * last backed up seq# was 9
    * last archived log on disk was seq# 14
    * oldest online redo log was seq# 13 (REDO02.LOG, lost)

    ARCHIVED logs   : ...  9  10  11  12  13  14
    seq#              ---------------------------|
    ONLINE redo logs:                      |   14  15
                                           |        |
                                           V        V
                                         lost     CURRENT
                                     REDO02.LOG
                                      seq# 13

    We found NO GAPS, but one REDO LOG that is not current and IS MISSING.
    There is an archived log file with the same seq# available on disk.

3. Explain the errors on recovery and interpret the RMAN errorstack

    A complete recovery is possible because all data needed is available on
    disk or can be restored from the backup. But recovery will complete
    successfully only if it is started using the backup controlfile.

    RMAN behaves differently when using a backup controlfile or a current
    controlfile:

    When USING BACKUP CONTROLFILE, RMAN auto-inspects the online log
    destinations, searching for the available online redo logs, and registers
    the sequences for all of the logs found on disk. They then become 'known'
    to RMAN. RMAN also searches for all cataloged archived logs on disk and
    checks if they are available.
    This can be seen in the RMAN logfile:

    RMAN-03022: compiling command: recover(4)
    RMAN-06050: archivelog thread 1 sequence 11 is already on disk as file D:\816\ORADATA\ORA816\ARCHIVE\ARC00011.001
    RMAN-06050: archivelog thread 1 sequence 12 is already on disk as file D:\816\ORADATA\ORA816\ARCHIVE\ARC00012.001
    RMAN-06050: archivelog thread 1 sequence 13 is already on disk as file D:\816\ORADATA\ORA816\ARCHIVE\ARC00013.001
                ^^^^ archived version of REDO02.log found
    RMAN-06050: archivelog thread 1 sequence 14 is already on disk as file D:\816\ORADATA\ORA816\REDO03.LOG
                ^^^^ online redo for this seq found
    RMAN-06050: archivelog thread 1 sequence 15 is already on disk as file D:\816\ORADATA\ORA816\REDO01.LOG
                ^^^^ online redo for this seq found
    RMAN-03023: executing command: recover(4)

    This way RMAN has a list of available files for each sequence, and when
    the recovery process prompts for the next seq#, RMAN supplies one of the
    available log files: the online redo log or, if this is missing, the
    archived log file. Recovery completes successfully. This behaviour is
    more capable than the server manager recover, and it shows how far
    recovery can be automated with RMAN.

    When USING CURRENT CONTROLFILE, RMAN does not auto-inspect the online log
    destinations. Only the cataloged archived logs are checked. This can be
    seen in the RMAN logfile.
    RMAN-03022: compiling command: recover(4)
    RMAN-06050: archivelog thread 1 sequence 11 is already on disk as file D:\816\ORADATA\ORA816\ARCHIVE\ARC00011.001
    RMAN-06050: archivelog thread 1 sequence 12 is already on disk as file D:\816\ORADATA\ORA816\ARCHIVE\ARC00012.001
    RMAN-06050: archivelog thread 1 sequence 13 is already on disk as file D:\816\ORADATA\ORA816\ARCHIVE\ARC00013.001
                ^^^^ archived version of REDO02.log found
    RMAN-06050: archivelog thread 1 sequence 14 is already on disk as file D:\816\ORADATA\ORA816\ARCHIVE\ARC00014.001
    RMAN-03023: executing command: recover(4)

    If you start complete recovery using the current controlfile, the
    following errors are raised:

    RMAN> run {
            allocate channel d1 type disk;
            restore database;
            sql 'alter database mount';    # mount the current controlfile
            recover database;
          }

    **** interpreting the RMAN errorstack (read from bottom up)

    RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
    RMAN-00571: ===========================================================
    RMAN-03002: failure during compilation of command
    RMAN-03013: command type: recover
    RMAN-03006: non-retryable error occurred during execution of command: recover(4)
    RMAN-07004: unhandled exception during command execution on channel default
    RMAN-10032: unhandled exception during execution of job step 1:
    ORA-00283: recovery session canceled due to errors
    RMAN-11003: failure during parse/execution of SQL statement: alter database recover logfile 'D:\816\ORADATA\ORA816\ARCHIVE\ARC00012.001'
    RMAN-11001: Oracle Error: ORA-00283: recovery session canceled due to errors
    ORA-00313: open failed for members of log group 3 of thread 1
    ORA-00312: online log 3 thread 1: 'D:\816\ORADATA\ORA816\REDO02.LOG'
    ORA-27041: unable to open file
    OSD-04002: unable to open file

    RMAN does not auto-inspect the log destination to search for the online
    redo logs and catalog their seq#. The recovery process will always
    request the online redo logs and not the archived log with the same seq#.
    The last archived log applied is ARC00012.001. You can check this by
    looking in the alert log. If you get this error, you first need to check
    whether an archived log for seq# 13 is available on disk. The error does
    not mean you have a real gap in the sequences. If you find the archived
    log, you can restart recovery using server manager and manually apply
    this archived log. Another option is to restart RMAN recovery using the
    backup controlfile, as explained before.

4. Handle the reproduced error accordingly

    RMAN can handle this automatically if you start complete recovery using
    the backup controlfile.

5. Solutions for Case 3.

    The solution depends on which controlfile is used.

    CURRENT CONTROLFILE USED:
    ==> RMAN does not handle this situation automatically. Neither does
        server manager. You need to restart recovery from server manager and
        manually apply the archived log file instead of the missing online
        redo log.

    BACKUP CONTROLFILE USED:
    ==> RMAN can handle this situation automatically. Start complete recovery
        using the backup controlfile. RMAN will automatically apply the
        available archived log files in place of the missing online redo log
        files.

Case 4
======

Only the current redo log is lost.

We do not repeat all the steps of the previous cases because they do not
change. We assume that after all checks were done, we find the following
situation on disk:

    ==> the only log file missing is the current log file with seq# 15
    ==> all logs up to this seq# are archived on disk and cataloged
    ==> this is the only gap in the log sequence

We need to do incomplete recovery to handle this situation. The only reason
we discuss this case is to explain the errors that would be raised if you do
complete recovery by mistake. Again, the errors differ depending on whether
the current or the backup controlfile is used.
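The decision that separates Case 3 from Case 4 can be written down compactly. The following Python sketch is an illustration under stated assumptions, not something taken from the tests: the function name `plan_recovery` and its return labels are hypothetical, a single redo thread is assumed, and the sequence numbers of the online redo logs are assumed to have been read from the alert log as described in Part I.

```python
from typing import Iterable, Optional, Tuple


def plan_recovery(online_seqs: Iterable[int],
                  online_on_disk: Iterable[int],
                  archived_on_disk: Iterable[int],
                  current_seq: int) -> Tuple[str, Optional[int]]:
    """Classify the situation found on disk (hypothetical helper).

    online_seqs      seq# of every online redo log group (from the alert log)
    online_on_disk   seq# of the online redo logs whose files still exist
    archived_on_disk seq# of the archived logs found on disk
    current_seq      seq# of the current online redo log
    """
    missing = set(online_seqs) - set(online_on_disk)
    archived = set(archived_on_disk)
    # The current log was never archived, so losing it always leaves a gap.
    lost = sorted(s for s in missing
                  if s == current_seq or s not in archived)
    if lost:
        # Case 4 (or a real gap): recover until the first lost seq#.
        return ("incomplete", lost[0])
    if missing:
        # Case 3: an archived copy with the same seq# replaces each lost log.
        return ("complete_with_archived_copy", None)
    return ("complete", None)


# Case 3: REDO02.LOG (seq# 13) lost, ARC00013.001 is on disk.
print(plan_recovery([13, 14, 15], [14, 15], range(10, 15), 15))
# Case 4: only the current log (seq# 15) is lost.
print(plan_recovery([13, 14, 15], [13, 14], range(10, 15), 15))
```

For the "complete_with_archived_copy" outcome, the text above still applies: RMAN in 8.1.6 only substitutes the archived log by itself when recovery is started with a backup controlfile.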
When starting complete recovery USING BACKUP CONTROLFILE you get the
following errorstack:

    RMAN-08060: unable to find archivelog
    RMAN-08510: archivelog thread=1 sequence=15
                                    ^^^^^^^^^^^
    RMAN-03026: error recovery releasing channel resources
    RMAN-08031: released channel: d1
    RMAN-00571: ===========================================================
    RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
    RMAN-00571: ===========================================================
    RMAN-03002: failure during compilation of command
    RMAN-03013: command type: recover
    RMAN-03006: non-retryable error occurred during execution of command: recover(4)
    RMAN-07004: unhandled exception during command execution on channel default
    RMAN-20000: abnormal termination of job step
    RMAN-06054: media recovery requesting unknown log: thread 1 scn 903462
                                                                    ^^^^^^

    NOTE: this is the same error you get when an archived log file is needed
    for recovery that is NOT BACKED UP and NOT CATALOGED.

    RMAN searches for the online redo logs, but the current log file is not
    found on disk, so RMAN does not know about its existence. No archived log
    file for seq# 15 ever existed, so this seq# is completely unknown to
    RMAN. This is the reason for this error.

    You get this error for two reasons:
    - when an archived log file needed for recovery was not cataloged
    - when you do complete recovery using the backup controlfile and the
      current log is lost.
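Because the same few error codes recur across the cases, a pre-processing script could map an RMAN error stack to the checks described in this note. The sketch below is a hedged illustration in Python: the table `HINTS` and the function `diagnose` are hypothetical names, and the hint texts only summarize what this note says about RMAN-06025, RMAN-06054 and ORA-00313. They are not Oracle's official message explanations.

```python
import re
from typing import List, Tuple

# Summaries of the situations described in this note (not official texts).
HINTS = {
    "RMAN-06025": "log is known to RMAN (cataloged) but missing on disk and "
                  "not backed up; check for a real gap (Case 2)",
    "RMAN-06054": "log is unknown to RMAN: archived log not cataloged, or "
                  "current log lost while using a backup controlfile (Case 4)",
    "ORA-00313": "online redo log cannot be opened with the current "
                 "controlfile; look for an archived log with the same seq# "
                 "(Case 3) or do incomplete recovery (Case 4)",
}

_CODE = re.compile(r"\b(RMAN|ORA)-(\d{1,5})\b")


def diagnose(error_stack: str) -> List[Tuple[str, str]]:
    """Return (code, hint) pairs for the known codes found in the stack."""
    found = []
    seen = set()
    for prefix, number in _CODE.findall(error_stack):
        # Normalize RMAN-6025 and RMAN-06025 to the same five-digit form.
        code = f"{prefix}-{int(number):05d}"
        if code in HINTS and code not in seen:
            seen.add(code)
            found.append((code, HINTS[code]))
    return found


stack = (
    "RMAN-06053: unable to perform media recovery because of missing log\n"
    "RMAN-06025: no backup of log thread 1 seq 6 scn 843036 found to restore\n"
)
for code, hint in diagnose(stack):
    print(code, "->", hint)
```

A lookup like this only points to the right case; the on-disk checks from steps 1.3 to 1.5 are still needed to decide between complete and incomplete recovery.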
When starting complete recovery USING CURRENT CONTROLFILE you get the
following errorstack:

    RMAN-00571: ===========================================================
    RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
    RMAN-00571: ===========================================================
    RMAN-03002: failure during compilation of command
    RMAN-03013: command type: recover
    RMAN-03006: non-retryable error occurred during execution of command: recover(4)
    RMAN-07004: unhandled exception during command execution on channel default
    RMAN-10032: unhandled exception during execution of job step 3:
    ORA-00283: recovery session canceled due to errors
    RMAN-11003: failure during parse/execution of SQL statement: alter database recover logfile 'D:\816\ORADATA\ORA816\ARCHIVE\ARC00003.001'
    RMAN-11001: Oracle Error: ORA-00283: recovery session canceled due to errors
    ORA-00313: open failed for members of log group 2 of thread 1
    ORA-00312: online log 2 thread 1: 'D:\816\ORADATA\ORA816\REDO02.LOG'
    ORA-27041: unable to open file
    OSD-04002: unable to open file

    NOTE: This is the same error you get when you do complete recovery using
    the current controlfile and one of the online redo logs that was not
    current was lost.

    When the current controlfile is used, the online redo logs will be
    applied. The recovery can complete ONLY if the current redo log is
    applied. There is no archived version of this log. Recovery takes another
    code path when the current controlfile is used. This is the reason for
    the different errors.

5. Solutions for Case 4.

    Start incomplete recovery until the seq# of the missing online redo log
    file.

Part III Sample RMAN scripts for backup and recover used in the tests

NOTE: in version 8.0.x the 'restore controlfile' command needs to be followed
by the 'replicate controlfile' command. In 8.1.x the controlfile is
implicitly replicated with 'restore controlfile'.
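The 8.1.x recovery scripts in this part differ only in three places: the SET UNTIL clause, the 'restore controlfile' step and the final 'open resetlogs'. That makes them easy to generate once the evaluation has produced a sequence number. The following Python sketch assembles the script text from exactly the RMAN commands used in this note; the function name `build_recovery_script` and its parameters are hypothetical, and the channel name d1 is simply copied from the examples.

```python
from typing import Optional


def build_recovery_script(until_logseq: Optional[int] = None,
                          thread: int = 1,
                          backup_controlfile: bool = True) -> str:
    """Return the text of an RMAN run block (hypothetical generator).

    until_logseq       first seq# that must NOT be applied, or None for
                       complete recovery
    backup_controlfile True to restore and use a backup controlfile
    """
    lines = ["run {"]
    if until_logseq is not None:
        lines.append(f"  SET UNTIL logseq = {until_logseq} thread = {thread};")
    lines.append("  allocate channel d1 type disk;")
    if backup_controlfile:
        lines.append("  restore controlfile;")
    lines.append("  restore database;")
    lines.append("  sql 'alter database mount';")
    lines.append("  recover database;")
    # resetlogs is needed when using a backup controlfile or SET UNTIL
    if backup_controlfile or until_logseq is not None:
        lines.append("  sql 'alter database open resetlogs';")
    lines.append("}")
    return "\n".join(lines) + "\n"


# Case 2: seq# 6 is the real gap, so the last seq# applied will be 5.
print(build_recovery_script(until_logseq=6))
```

The generated text would still have to be passed to RMAN by the surrounding OS script; how that is done depends on the site and is not covered here.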
INCOMPLETE RECOVERY script using backup controlfile (8.1.x):

    run {
      SET UNTIL logseq = 6 thread = 1;      # step 0 ask for incomplete recovery, last seq applied 5
      allocate channel d1 type disk;
      restore controlfile;                  # step 1 restore a backup controlfile
      restore database;                     # step 2 restore the datafiles
      sql 'alter database mount';           # step 3 mount the backup controlfile and recover
      recover database;
      sql 'alter database open resetlogs';  # step 4 when using backup controlfile or set until
    }

    NOTE: This script will start the recovery process in the background with
    the following command (pasted from the alert log file):

    Wed Mar 28 20:57:18 2001
    alter database recover if needed start until cancel using backup controlfile
                                           ^^^^^^^^^^^^ ^^^^^^^^

COMPLETE RECOVERY script using backup controlfile (8.1.x):

    run {
      allocate channel d1 type disk;
      restore controlfile;                  # step 1 restore a backup controlfile
      restore database;                     # step 2 restore the datafiles
      sql 'alter database mount';           # step 3 mount the backup controlfile and recover
      recover database;
      sql 'alter database open resetlogs';  # step 4 when using backup controlfile or set until
    }

    NOTE: This script will start the recovery process in the background with
    the following command (pasted from the alert log file):

    Wed Mar 28 20:57:18 2001
    alter database recover if needed start using backup controlfile
                                           ^^^^^ ^^^^^^

COMPLETE RECOVERY script using current controlfile (same for 8.1.x and 8.0.x):

    run {
      allocate channel d1 type disk;
      restore database;                     # step 1 restore the datafiles
      sql 'alter database mount';           # step 2 mount the current controlfile and recover
      recover database;
    }

    NOTE: This script will start the recovery process in the background with
    the following command (pasted from the alert log file):

    Tue Mar 27 22:05:25 2001
    alter database recover if needed start
                                     ^^^^^^
    This is similar to svrmgrl> recover database;

BACKUP script for full database backup:

    The backup order is very important:

    run {
      allocate channel ch1 type disk;
      backup full database
        format 'D:\backup_%p_%s_%u.%d';             # step 1 includes current controlfile
      sql 'alter system archive log current';       # step 2
      # backup the archived log files needed to make above backup consistent
      backup archivelog all delete input
        format 'D:\backup\al_backup_%p_%s_%u.%d';
    }

Part IV 9i enhancements related to automatic recovery with RMAN

In 9i, on complete recovery RMAN will auto-inspect all known archived log
destinations, catalog the archivelogs found and continue with recovery. The
fix applies to both the backup and the current controlfile. If an archivelog
is not found in the controlfile, RMAN auto-inspects and catalogs it. This
enhancement request was reported in .

This auto-inspection is done in order to eliminate manual intervention
regarding the uncataloged archived log files available on disk. This is not
needed when the current controlfile is used (as on recover an implicit resync
can be done that catalogs all the archived logs).

Furthermore, when using a backup controlfile, RMAN is able to auto-inspect
the online log destinations to look for missing online redo logs, and can
apply the archived logs instead. So it seems that complete recovery using a
backup controlfile is 'fully automatized' in 9i. This enhancement request was
reported in new .

RELATED DOCUMENTS
~~~~~~~~~~~~~~~~~
RMAN-06026 RMAN-06025 restore archivelogs seperately
RMAN-6025 RMAN-6026 During Restoration of Archive Logs
RMAN-6026 RMAN-6023 during restore
RMAN-6023 when duplicating a databas