RMAN RECOVERY

Submitted by 孤街浪徒 on 2019-12-29 02:59:04

Data Recovery Advisor

Complete recovery from data file loss with RMAN

Incomplete recovery

Auto backup of control file

Using image copy for recovery

Block recovery

 

 

 

Data Recovery Advisor

DRA stands for Data Recovery Advisor. It can generate scripts to repair damage to data files and control files, but it cannot repair the spfile or the online redo logs. The DRA depends on the Health Monitor and the Automatic Diagnostic Repository (ADR): the information that the Health Monitor gathers is stored in the ADR, and the DRA uses this information to generate its scripts.

The health monitor and the ADR

The Health Monitor is a set of checks that run automatically when certain errors occur, or manually on the DBA's instruction. The check results are stored in the Automatic Diagnostic Repository (ADR). The ADR is not in the database; it is on the file system. Some errors can bring the database down, so keeping the check results outside the database means they remain accessible even when the database is not. The initialization parameter DIAGNOSTIC_DEST gives the location of the ADR.
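As a minimal sketch, the ADR location can be checked from SQL*Plus. It assumes an 11g or later instance, where the V$DIAG_INFO view is available; the values returned depend on your installation.

```sql
-- Show the ADR root configured for this instance
SHOW PARAMETER diagnostic_dest

-- List the ADR directories derived from it (ADR base, ADR home, trace directory, and so on)
SELECT name, value FROM v$diag_info;
```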

Different health monitor checks can run at different stages.

  • In nomount mode, only the "DB Structure Integrity Check" can run, and it can check only the integrity of the control file.
  • In mount mode, the "DB Structure Integrity Check" checks the integrity of the control file, the online redo log files, and the data file headers. The "Redo Integrity Check" checks the online and archived log files for accessibility and corruption.
  • In open mode, it is possible to scan every data block for corruption and to check the integrity of the data dictionary and the undo segments.

There are two interfaces for running the Health Monitor manually: invoking the DBMS_HM PL/SQL package from SQL*Plus, or using Database Control. Therefore, the database must be open to run the Health Monitor manually.
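A manual run through DBMS_HM might look like the sketch below. The check name is one of those listed above; the run name manual_struct_check is a hypothetical label chosen for this example.

```sql
-- List the checks that can be run manually
SELECT name FROM v$hm_check WHERE internal_check = 'N';

-- Run one check under a hypothetical run name
BEGIN
  DBMS_HM.RUN_CHECK(
    check_name => 'DB Structure Integrity Check',
    run_name   => 'manual_struct_check');
END;
/

-- Display the text report for that run
SET LONG 100000
SELECT DBMS_HM.GET_RUN_REPORT('manual_struct_check') FROM dual;
```

The findings are written to the ADR, so they are also available to the DRA afterwards.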

The capabilities and limitation of the DRA

The DRA can do nothing unless the instance is at least in nomount mode, so it cannot help if there is something wrong with the initialization parameter file. In the current release, 11g Release 2, the DRA works only on single-instance databases. If a RAC database needs repair, you can mount it in single-instance mode, repair it with the DRA, and then reopen it in RAC mode.

Use the DRA

Let's look at an example of using the DRA first.

In this exercise, you will cause a problem with the database, and use the
DRA to report on it.
1. From an operating system prompt, launch the RMAN executable:
rman target /
2. Confirm that there is a full backup of the SYSAUX tablespace:
list backup of tablespace sysaux;
If this does not return at least one backup set of type FULL, create one:
backup as backupset tablespace sysaux;
3. Shut down the instance and exit from RMAN:
shutdown immediate;
exit;
4. Using an operating system utility, delete the datafile(s) for the SYSAUX
tablespace that were listed in Step 2. If using Windows, you may have to
stop the Windows service under which the instance is running to release
the Windows file lock before the deletion is possible.
5. Connect to the database with SQL*Plus, and attempt a startup:
startup;
This will stop in mount mode, with an error regarding the missing file. If
using Windows, make sure the service has been started.
6. Launch the RMAN executable and connect, as in Step 1.
7. Diagnose the problem:
list failure;
This will return a message to the effect that one or more non-system datafiles
are missing.
8. Generate advice on the failure:
advise failure;
This will suggest that you should restore and recover the datafile, and generate
a repair script. Open the script with any operating system editor, and study its
contents.

The workflow for using the DRA is as follows:

  • Assess the data failure. The Health Monitor, running reactively or on demand, writes error information into the ADR.
  • List the failures. As in the example, use the LIST FAILURE command to list the errors.
  • Get advice on the repair. The DRA generates scripts to perform the repair.
  • Execute the scripts to carry out the repair.

One thing to be aware of: the third step, ADVISE FAILURE, generates advice only for the failures listed by the preceding LIST FAILURE, so you have to list first. In our example, we listed the error and generated advice with the two commands "list failure;" and "advise failure;". Now we can use "repair failure;" to run the scripts generated before.
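Put together, a DRA session in RMAN could look like the following sketch. REPAIR FAILURE PREVIEW is an optional extra step, not part of the exercise above, that displays the repair script without running it.

```rman
# Show the failures recorded in the ADR
list failure;

# Generate repair advice and a repair script for the listed failures
advise failure;

# Optionally display the repair script without executing it
repair failure preview;

# Execute the repair script (RMAN asks for confirmation)
repair failure;
```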

 

Complete recovery from data file loss with RMAN

Recovery in noarchivelog mode

Although this section falls under the heading of complete recovery, in this situation complete recovery may be difficult. In noarchivelog mode, if you lose a datafile, your only option is to restore the whole database. After that, if you are lucky and the online redo logs have not been overwritten since the last backup, you can go on to perform a complete recovery. But if the restored backup was made long ago and the redo needed to roll forward no longer exists, you will need to do the following:

  • Start the database in mount mode.
  • Run alter database clear logfile group <group number> for each log group.

After the second step, all the log groups have been re-created and you can open the database. The RMAN commands used here are as follows:

shutdown abort;
startup mount;
restore database;
alter database open resetlogs;

If you are using incremental backups, then although you may not be able to recover the database to the latest point, you can still roll it forward as far as possible with the following commands:

shutdown abort;
startup mount;
restore database;
recover database noredo;
alter database open resetlogs;

Recovery in archivelog mode

Recovery from loss of noncritical files

In an Oracle database, the datafiles that make up the SYSTEM tablespace and the currently active undo tablespace (as specified by the UNDO_TABLESPACE parameter) are considered to be “critical.” Damage to any of these will result in the instance terminating immediately. Damage to the other datafiles, which make up tablespaces for user data, will not as a rule result in the instance crashing. Oracle will take the damaged files offline, making their content inaccessible, but the rest of the database should remain open. If your backups were made with RMAN, the recovery is very simple.

For example, if one of the noncritical datafiles, datafile 5, is deleted, the whole restore and recovery process looks like this:

run
{
restore datafile 5;
recover datafile 5;
sql 'alter database datafile 5 online';
}
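Before running a restore like the one above, it can help to confirm which files actually need recovery. The queries below are a small sketch to be run from SQL*Plus; the file number 5 simply follows the example.

```sql
-- Files that currently need media recovery, with the reason
SELECT file#, online_status, error FROM v$recover_file;

-- Map the file number from the example to its file name and status
SELECT file#, name, status FROM v$datafile WHERE file# = 5;
```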

One internal detail worth knowing: if you have both incremental backups and archived logs available, RMAN will apply the incremental backups in preference to the archived redo logs, because that is faster.

Recovery from loss of critical files

Critical files should usually be placed on RAID storage, but they can still be damaged. The restore and recover commands are the same as before; the difference is that the database cannot stay open, so the work is done in mount mode:

  • startup mount
  • restore 
  • recover 
  • alter database open

Incomplete recovery

We assume you know why incomplete recovery is needed. One thing needs to be clear, though: incomplete recovery is not flashback. Flashback can be applied to individual tables, whereas incomplete recovery applies to the whole database. The process for incomplete recovery is as follows:

  • mount the database
  • restore all datafiles
  • recover the datafiles until a certain point
  • open the database with resetlogs

Let's go through these steps one by one. First, incomplete recovery must be done in mount mode, because you are recovering the whole database. Second, restore all the datafiles. They do not all need to be restored to the same point, but each must be from a point earlier than the one you want to recover to. The third step is simple: recover to the chosen point. The fourth step opens the database with resetlogs.

There are special situations in which you need to restore the control file before performing the incomplete recovery. One is that the control files are missing. Another is, for example, that you dropped a tablespace and now want to recover the database to a point in time before the drop. The problem is that the current control file already records the drop, so if you use it, RMAN will not know about the tablespace and cannot recover it. You therefore need an earlier copy of the control file.

Here is an example of incomplete recovery:

run {startup mount;
set until time = "to_date('27-10-08 10:00:00','dd-mm-yy hh24:mi:ss')";
restore database;
recover database;
alter database open resetlogs;}
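The target point does not have to be a time. As a sketch, the same recovery can be bounded by an SCN; the SCN value here is hypothetical and must be replaced with one taken from your own database.

```rman
run {
  startup mount;
  # Recover up to, but not including, this SCN (hypothetical value)
  set until scn 1234567;
  restore database;
  recover database;
  alter database open resetlogs;
}
```

A log sequence number works the same way, for example set until sequence 120 thread 1; (again with a hypothetical sequence number), which stops recovery just before that log.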

Auto backup and restore of controlfile

The control file is important not only to the database itself but also to RMAN, because the control file can serve as the RMAN repository. So we need to configure control file autobackup:

configure controlfile autobackup on;

Once this is configured, every RMAN backup operation will conclude with an automatic backup of the control file and the spfile into a backup set. You can use the following command to restore the control file from the autobackup:

restore controlfile from autobackup;

There is an important point here. The RMAN repository is the control file, and here we are trying to restore the control file itself, which looks logically impossible: we seem to need the control file in order to restore it. So how does this command work? It does not need the RMAN repository at all. It goes to a well-known location (usually the fast recovery area) and looks for a file with a well-known name, which is based on the DBID. That makes the DBID very important, and you should always record it in your most basic documentation. As long as we have the spfile, we can find the well-known location and, in it, the well-known backup set.
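The autobackup settings can be inspected and changed from RMAN. The sketch below assumes disk backups; the directory /u02/autobackup is a hypothetical path, while the %F substitution variable is required in any autobackup format because it produces the DBID-based file name.

```rman
# Show whether autobackup is on, and the current file name format
show controlfile autobackup;
show controlfile autobackup format;

# Send autobackups to a hypothetical directory; %F must be part of the format
configure controlfile autobackup format for device type disk to '/u02/autobackup/%F';
```

RMAN also prints the DBID each time it connects to the target database, which is a convenient moment to note it down.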

If the spfile has also been lost, start the instance with a dummy initialization file: a pfile with just one parameter, DB_NAME. Then connect with RMAN, and issue these commands, substituting your DBID number for that given:

set dbid 1234567890;
restore spfile from autobackup;

The spfile will be restored to the default location, $ORACLE_HOME/dbs on Unix. Then restart the instance in nomount mode, which will use the restored spfile, and restore the controlfile. The restore is possible now because we know both the well-known location (from the spfile) and the well-known name (from the DBID). Mount the controlfile, and RMAN will then have access to its repository and can locate and restore the datafile backups.

The commands restore controlfile from autobackup and restore spfile from autobackup can be executed in nomount mode, because they locate the backup by DBID rather than through the repository. All other RMAN commands can be executed only in mount or open mode.

This script accomplishes the complete restore and recovery of a database, assuming that everything was lost:

run{startup nomount pfile=dummy.pfile;
set dbid=1196323546;
restore spfile from autobackup;
shutdown abort;
startup nomount;
restore controlfile from autobackup;
alter database mount;
restore database;
recover database;
alter database open resetlogs;}

 

Using image copy for recovery

If your backup strategy includes creating image copies as well as (or instead of) backup sets, then you have another option available for restore operations: do not restore at all. As an image copy is byte-for-byte the same as the source datafile, it can be used immediately if it is still available on disk. All that is necessary is to tell the database the location of the image copy, and then recover the copy. This can result in massive time savings when compared with the delays involved in extracting a datafile from a backup set.
To facilitate the use of image copies, use the following RMAN command:

RMAN> backup as copy database;

This command will copy every datafile to the flash recovery area, as image copies.

To use an image copy, first take the original datafile offline (which will have happened automatically in many cases) and then update the controlfile to point
to the copy, and recover it. For example, if a copy has been made as follows:

RMAN> backup as copy datafile 4 format '/u02/df_copies/users.dbf';

Then it can be brought into use like this:

RMAN> run {sql 'alter database datafile 4 offline';
set newname for datafile 4 to '/u02/df_copies/users.dbf';
switch datafile 4;
recover datafile 4;
sql 'alter database datafile 4 online';}

This accomplishes a complete recovery without actually needing to restore. The SWITCH command is equivalent to the ALTER DATABASE RENAME FILE command that can be executed in SQL*Plus. There is one drawback: after the recovery, the datafile in use is the image copy, in the location where the copy was made (for BACKUP AS COPY DATABASE, the flash recovery area). This may not be what you want, in which case you still need to move it back to the original datafile disk.

If the whole database has been copied to the flash recovery area then the whole database can be “restored” with one command:

RMAN> switch database to copy;

Another use of image copies is to keep them up to date with incremental backups. The starting point is a complete copy of the database (which can be made while the database is open), followed by incremental backups created with syntax that permits them to be applied to the copy, rolling it forward. The following two commands accomplish the whole process:

run {
backup incremental level 1 for recover of copy with tag 'inc_copy' database;
recover copy of database with tag 'inc_copy' ;
}

The first time the script runs, it attempts to create a level 1 incremental backup, but since there is no level 0 backup to base it on, it performs a level 0 backup instead. The syntax makes this backup an image copy rather than a backup set. The RECOVER command has nothing to apply on the first run because there is no incremental backup yet. The second time the script runs, the first command performs a level 1 incremental backup and the second applies it to the copy.

A strategy based on incrementally updated backups can give very fast recovery times with a minimal backup workload on the database. If the script runs daily, the copy will be at worst one day behind the live database. One thing is very important, though: you also need to keep a backup of the older copy along with the archived logs. Once the copy has been rolled forward, it cannot be used for an incomplete recovery to a point earlier than its latest update, so without an older copy such a recovery is not possible.

 

Block recovery

The backup and recovery discussed so far has been about whole files. But it is also possible for just a range of blocks to be corrupted. In that case the datafile remains online, and users will not notice the corruption unless they access those blocks. If a session tries to access a corrupted block, it receives an error, and the error is also recorded in the alert log. RMAN can detect these corrupted blocks and repair them.

Detecting corrupted blocks

RMAN detects corrupted blocks as it performs a backup. Unless instructed otherwise, RMAN terminates the backup as soon as it hits a corrupted block. You can specify a tolerance for RMAN, in which case it does not terminate on encountering a corrupted block; instead it records the address of the block and continues the backup. If the number of corrupted blocks exceeds the specified tolerance, RMAN terminates the backup. The following command tells RMAN to continue with the backup of datafile 7 unless it finds more than 100 corrupted blocks:

RMAN> run {
set maxcorrupt for datafile 7 to 100;
backup datafile 7;}

The details of corrupt blocks are visible in two places. The view V$DATABASE_BLOCK_CORRUPTION shows the address of the cause of the problem: the datafile file number and block number. The address of the block in the backup is also visible in V$BACKUP_CORRUPTION for corruptions encountered by backup set backups, or in V$COPY_CORRUPTION if the backup were to an image copy.
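As a short sketch, the current list of corrupted blocks can be read from SQL*Plus like this:

```sql
-- One row per contiguous range of corrupted blocks
SELECT file#, block#, blocks, corruption_type
FROM   v$database_block_corruption
ORDER  BY file#, block#;
```

Here block# is the first block of the range and blocks is the number of blocks in it.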

By default, RMAN will always check for physical corruption. RMAN can also be instructed to check for logical corruption, also known as “software corruption,” as well. These checks will occur whenever a file is backed up, whether as an image copy or into a backup set. To override the defaults,

RMAN> backup nochecksum datafile 7;

will not check for physical corruption, but

RMAN> backup check logical datafile 6;

will check for logical as well as physical corruption.
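The same checks can be run without producing a backup at all. This is a sketch of a validation-only pass, assuming an 11g or later RMAN for the standalone VALIDATE command:

```rman
# Read every datafile and check for physical and logical corruption; no backup is written
backup validate check logical database;

# Validate a single datafile
validate datafile 7;
```

Any corrupted blocks found this way are recorded in V$DATABASE_BLOCK_CORRUPTION.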

RMAN block media recovery

The RMAN block media recovery mechanism is simple: RMAN takes a list of corrupted blocks, extracts good versions of them from a backup, and then applies archived redo to bring them up to date.

The commands for RMAN block media recovery are shown below. A single command can recover many blocks:

RMAN> blockrecover datafile 7 block 5,6,7 datafile 9 block 21,25;

You can name a specific backup set if you know it is good:

RMAN> blockrecover datafile 7 block 5 from backupset 1093;

You can also identify the backup by tag:

RMAN> blockrecover datafile 7 block 5 from tag monthly_whole;

If there are too many corrupted blocks to list individually, use the following command:

RMAN> blockrecover corruption list until time sysdate - 7;

The corruption list is the set of all corrupted blocks that RMAN has detected. The UNTIL TIME used here does not mean incomplete recovery; it tells RMAN to restore the blocks from a backup made before the specified time, here at least seven days ago.
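From 11g onward, the documented way to perform block media recovery is the RECOVER command with a BLOCK clause; the older BLOCKRECOVER command is kept for compatibility. A sketch of the equivalent commands, reusing the file and block numbers from the examples above:

```rman
# Recover specific blocks in one or more datafiles
recover datafile 7 block 5,6,7 datafile 9 block 21,25;

# Recover every block currently listed in V$DATABASE_BLOCK_CORRUPTION
recover corruption list;
```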
