
Database Forum / Oracle / Oracle Server / January 2006


Oracle: how to demonstrate successful restore?

Stefan - 25 Jan 2006 14:48 GMT
What are the various techniques to demonstrate a successful restore in
Oracle?

For instance: what kind of formal confirmation does Oracle provide?
Are there any restore reports? What kind of information do
they report? Are there any additional manual checks that a DBA can do,
such as looking at system change numbers (SCNs) or transaction times?

???????

thanks
Daniel Fink - 25 Jan 2006 19:19 GMT
There are many ways to do this; it depends on what you are trying to
test. Just being able to open the database is one measure!

Here is an example of a very simple restore test for an incomplete
recovery:
Database is running in archivelog mode.
1) Perform backup of database
2) Update test table with sysdate (make sure to record the exact date
including minutes/seconds) and commit
3) Wait a certain number of minutes
4) Record sysdate
5) Update test table with new sysdate and commit
6) Shutdown database
7) Restore backups and roll forward to sysdate in step 4
8) Query test table and make sure the last updated date is the value in
step 2 and not the value in step 5

For a complete recovery, verify that the test table has the last updated
date recorded in step 5.
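
The check in step 8 can be sketched in a few lines of Python. This is only an illustration of the comparison, not part of the original procedure: the function name, the `mode` argument and the timestamps are hypothetical, and the `observed` value is assumed to have been read from the test table by whatever client you normally use.

```python
from datetime import datetime

def recovery_test_passed(observed, first_update, second_update, mode):
    """Return True if the test table holds the value the recovery should leave.

    mode="incomplete": recovery stopped between the two updates (step 7),
                       so the table must still show the first update.
    mode="complete":   all changes were recovered, so the table must show
                       the second update.
    """
    if mode not in ("incomplete", "complete"):
        raise ValueError("mode must be 'incomplete' or 'complete'")
    expected = first_update if mode == "incomplete" else second_update
    return observed == expected

# Hypothetical timestamps recorded during the test.
step2 = datetime(2006, 1, 25, 10, 0, 5)    # first update, committed
step5 = datetime(2006, 1, 25, 10, 20, 40)  # second update, committed

# Value read back from the test table after the restore (assumed here).
observed = step2
print(recovery_test_passed(observed, step2, step5, "incomplete"))
```

If the incomplete-recovery test shows the step 5 value instead, the roll forward went past the intended point in time.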

Regards,
Daniel Fink
Tiff - 26 Jan 2006 00:02 GMT
Daniel,

What a great idea!  I have been on a DBA team for several years and we
perform backups of all our databases (primary and secondary) daily.  In
this time, we have yet to need a recovery (thankfully), but I told my
team lead that I won't feel confident of our recovery plan until I see it
in action.

What a great way to test a recovery.  Do you think it would be a good
test to create this test table, delete its datafile and try a
recovery to get back just this missing table?  Will deleting this one
datafile affect the rest of the database, or is this a test best
reserved for a development environment?

You can see I have no experience in disaster recovery.

Thanks,

Tiffany
Joel Garry - 26 Jan 2006 01:28 GMT
Tiffany:

You definitely need to have a test environment set up separate from
your production environment.  (Even then I've seen people screw up and
get fired because the front ends can look so similar).

If you have MetaLink access, type in "recovery scenarios" in the
knowledge browser.  Plenty of ideas are scattered about there, especially
if you run across the scenarios they use in the classes (and Note
62385.1 is a pretty decent list of basic things to try).  Recovery is
_the_ most significant DBA skill, and you are right to recognize that
you don't have valid backup procedures unless they've been tested.  You
need to know it cold for when you need it for real.

jg
Signature

@home.com is bogus.
http://www.rathergood.com/independent_woman/

Daniel Fink - 26 Jan 2006 04:42 GMT
Always, always, always test recoveries in a non-production environment.
It is a great idea to use a production backup to do so (kills two birds
with one stone). Need to refresh a development/test environment? How
about using the latest production backup?

Think about different recovery scenarios. Loss of a single table, loss
of a datafile, loss of a device, controller, system, data center, etc.
Work out how you would recover from each, how long it would take, how
much data would be lost, etc.
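
One rough way to put numbers on "how long it would take" and "how much data would be lost" for each scenario is sketched below. Everything in it is an assumption for illustration: the `Scenario` class, its field names and the figures are hypothetical, and the estimate covers only the time to copy files back, not the time to apply archived logs afterwards.

```python
from dataclasses import dataclass

@dataclass
class Scenario:
    name: str
    restore_gb: float        # data that must be copied back from backup
    throughput_mb_s: float   # sustained restore rate, ideally measured in a test
    log_gap_minutes: float   # age of the newest archived log that survives the failure

    def restore_hours(self) -> float:
        """Copy time only; applying archived logs afterwards adds to this."""
        return self.restore_gb * 1024 / self.throughput_mb_s / 3600

    def worst_case_loss_minutes(self) -> float:
        """Changes made after the last surviving archived log cannot be recovered."""
        return self.log_gap_minutes

# Hypothetical figures for illustration only.
scenarios = [
    # Redo and archived logs assumed intact, so no committed data is lost.
    Scenario("single datafile", 20, 80, 0),
    # Logs assumed to leave the site once an hour.
    Scenario("whole database from tape", 450, 40, 60),
]
for s in scenarios:
    print(f"{s.name}: ~{s.restore_hours():.1f} h to restore, "
          f"up to {s.worst_case_loss_minutes():.0f} min of data lost")
```

The throughput figure is the one worth measuring in a real test restore, since it is the number most often guessed wrong.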

Presented for your consideration
"The responsibility of a DBA is not to back up the database...the
responsibility of the DBA is to recover the database!" (paraphrase of
Tim Gorman).

I recall a discussion at a user group meeting where a DBA was telling
the story of a new tape drive in their backup system. It seems that there
was a slight miscalibration and the head would move a fraction of a
millimeter each time it wrote a new tape. Tapes would write
successfully, would be verified successfully...and could not ever be
read again!

I myself went through a situation where a bug caused the database to be
unrecoverable. Not fun!

Regards,
Daniel Fink
Tiff - 26 Jan 2006 16:20 GMT
Guys, thanks so much!

I will take the advice both of you have offered and will work on a
"Disaster Recovery" document and then take it to my lead to ask for the
chance to test out some of the scenarios.

I will make the Gorman quote my new mantra!

My goal is to finally get certified this year and I am certain these
hands-on exercises will prove invaluable.

Thanks again!

Tiff

P.S.  I will try using a backup from Production to load our Dev
environment.  I always just load using export or sqlldr.  This will be
a "fun" experiment.
Mladen Gogala - 27 Jan 2006 04:38 GMT
> You can see I have no experience in disaster recovery.

Companies normally have DR tests, just like fire drills. Typically, those
tests range from the primary and standby databases switching places all the
way to going to a remote location and restoring a 1.1TB database at a spare
site, which can be provided by companies like IBM or SunGard. It is
important to understand that the DR document should include the possibility of
failure of any component, such as a router, name server, firewall, application
server, web server, VPN server or LDAP server. Without the name server or
router, your clients will not be able to find the database server, even if
the latter is perfectly operational.

The first thing to decide when writing a DR document is how far you
want to go and what you want to protect the company from. Are we
talking about a major malfunction (like the big 2003 power outage) or the total
loss of a location, as after Katrina or 9/11? What is the cost of
downtime and what are the time constraints? If there is an outage, how
long can the company afford to stay down? If the allowed downtime is long
enough, you can get away with restoring from backup at the moment of a
problem or simply activating a standby, which can be located at the other
end of the same building.

The second thing to decide is how much data you can afford to lose. If it
is a dating database that keeps "compatibility points" and 10 million
addresses of ineligible and undesired bachelors, you can afford to lose
more data than if you are operating an on-line banking database.

Last but not least is how much the company wants to spend on data protection.
DR plans are frequently the first casualty of cost cutting. Suddenly,
non-technical people (the CFO, for instance) bring in this wonderful sales guy
from EMC telling you that RAID-5 is just fine and is just as fast as
RAID 1+0, or that those nasty old PA-RISC boxes can easily be replaced by
nice new thingys with a quad-Opteron motherboard running RH. When you
find unknown coffee bags in the kitchen in the place where the Maxwell
House coffee bags used to be, it's time to get your resume on Dice.
Maxwell House coffee is critical for DR plans as it enables your DBA to
perform database recovery at 03:30 AM. It's good to the last drop. The CIO is
usually extremely touchy when it comes to signing checks, especially for
disaster recovery. Cheap solutions are abundant and normally do not work
without extreme effort and many hours of pointless toiling for which
nobody will thank you. DR is primarily a business decision for which you
will need the support of the senior management of the company.

Why am I telling you all this?
1) You are a junior DBA person and obviously are in charge of the
  company's DR plan. That is usually a grave mistake and a sign of the
  company trying to save money on the wrong thing. Nothing personal,
  but as a long time DBA, I don't think that a DR plan should be
  tasked to a junior person. That is the job for your team lead.
2) You don't have a DR drill. Fire drills are required by law, but
  DR drills are not. To be effective, a DR drill must be performed at
  least once a year. A&E, a company that I had a consulting gig with
  for a few months, performed DR drills monthly. They have two locations
  (NYC and Stamford, CT) and they switch the roles of the primary and standby
  databases, name servers and routers, and go through the whole 9 yards.
  Now that's a company conscious of data security. You don't even
  know whether your backup can be restored. That leads me to the
  conclusion that RESTORE DATABASE VALIDATE is not a part of your weekly
  backup routine. (This RMAN command reads the latest backup and checks
  whether it can be restored.)

If your company is trying to save some money and if you, as a junior DBA
person, are in charge of DR plan, then you are what is commonly known as a
"scapegoat". Anything happens, and it's your job on the line. For a DBA,
an item like "I was fired when I lost company's production database"
doesn't look too nice on the resume and somewhat diminishes chances of
getting hired again as a DBA. Try clarifying the points mentioned above
with the management and if you don't get clear and satisfactory answers,
run like hell.

I humbly apologize if this sounds harsh and cynical, but I assure you,
it's a cold world out there. You are free to hate me if you so desire.

Signature

http://www.mgogala.com

Frank van Bortel - 27 Jan 2006 20:38 GMT
Hear! Hear!

But you will be surprised how few companies actually
test their DR plan...

Reminds me of a former employer, that put great effort
in designing a monkey-proof DRP, but never tested it,
only to find out no tapes could be read when hell broke
loose.

Took us (5 programmers) a week to get some 98% of
the data back. Go figure: marketing, planning,
sales, billing and MRP not available for a week.
Signature

Regards,
Frank van Bortel

Top-posting is one way to shut me up...

Mladen Gogala - 28 Jan 2006 05:05 GMT
> But you will be surprised how few companies actually
> test their DR plan...

It's a business decision. What surprises me is how few people
actually verify their backups. Here is all it takes:
RMAN> restore database validate;

Starting restore at 27-JAN-06
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=64 devtype=DISK

channel ORA_DISK_1: starting validation of datafile backupset
channel ORA_DISK_1: reading from backup piece /oradata/back/back_01h9uj59_1_1.bkp
channel ORA_DISK_1: restored backup piece 1
piece handle=/oradata/back/back_01h9uj59_1_1.bkp tag=TAG20060127T232632
channel ORA_DISK_1: validation complete, elapsed time: 00:01:06
Finished restore at 27-JAN-06

My database wasn't actually restored, only the backup was validated.
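
If the validation runs from a scheduled script, the log can be checked automatically. Below is a minimal Python sketch, assuming the RMAN output has been captured as text; the function name is hypothetical, and the strings it looks for are taken from the output above plus the usual RMAN-nnnnn / ORA-nnnnn error prefixes.

```python
import re

# RMAN and Oracle errors are reported on lines starting with RMAN-nnnnn or ORA-nnnnn.
ERROR_LINE = re.compile(r"^(RMAN|ORA)-\d+", re.MULTILINE)

def validation_succeeded(log_text):
    """True if the log shows a completed validation and no error lines."""
    completed = "validation complete" in log_text
    finished = "Finished restore at" in log_text
    has_error = ERROR_LINE.search(log_text) is not None
    return completed and finished and not has_error

# Excerpt of the output shown above.
sample = """\
channel ORA_DISK_1: starting validation of datafile backupset
channel ORA_DISK_1: validation complete, elapsed time: 00:01:06
Finished restore at 27-JAN-06
"""
print(validation_succeeded(sample))
```

A check like this only confirms what RMAN reported; it says nothing about whether the restored database would open.
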
Here is the problem, visible in sar output while the validation is running:
$ sar -u 5 10
Linux 2.6.12-1.1381_FC3 (medo.noip.com)         01/27/2006

11:51:53 PM       CPU     %user     %nice   %system   %iowait     %idle
11:51:58 PM       all     99.00      0.00      1.00      0.00      0.00
11:52:03 PM       all     99.00      0.00      1.00      0.00      0.00
11:52:08 PM       all     99.00      0.00      1.00      0.00      0.00
11:52:13 PM       all     99.00      0.00      1.00      0.00      0.00
11:52:18 PM       all     16.03      0.00      1.40     82.57      0.00
11:52:23 PM       all      2.59      0.00      1.60     95.81      0.00
11:52:28 PM       all      1.61      0.00      0.60     97.79      0.00
11:52:33 PM       all      0.40      0.00      0.80     98.80      0.00
11:52:38 PM       all      0.40      0.00      0.20     99.40      0.00
11:52:43 PM       all      2.20      0.00      1.40     96.40      0.00
Average:          all     41.94      0.00      1.00     57.06      0.00

Validation was active during the first few snapshots. It's an extremely
expensive operation, especially if the backupset is compressed. Without
compression it takes approximately 40% of the CPU, which is still
expensive, even on machines much bigger than my measly PC.
Perhaps people think that not having a backup, or having a bad backup, is
cheaper than verifying it? You are a DBA in the Big Easy: no snow, no
ice storms, what could ever happen to your database? It's unlikely that
it will ever be sleeping with the fishes, to use the term from "The Godfather".
Why verify?
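
To see which samples were CPU-bound without reading the table by eye, the `sar -u` output can be parsed with a few lines of Python. This is a sketch under assumptions: the function name is hypothetical, and the parser expects the 12-hour time format with a separate AM/PM column, as in the output above.

```python
def parse_sar_cpu(text):
    """Return (%user, %iowait) pairs for the 'all' rows of `sar -u` output."""
    samples = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows: time, AM/PM, CPU, %user, %nice, %system, %iowait, %idle
        if len(fields) == 8 and fields[2] == "all":
            samples.append((float(fields[3]), float(fields[6])))
    return samples

# Excerpt of the output above: header, two samples and the Average line.
sar_output = """\
11:51:53 PM       CPU     %user     %nice   %system   %iowait     %idle
11:51:58 PM       all     99.00      0.00      1.00      0.00      0.00
11:52:18 PM       all     16.03      0.00      1.40     82.57      0.00
Average:          all     41.94      0.00      1.00     57.06      0.00
"""
samples = parse_sar_cpu(sar_output)
cpu_bound = [s for s in samples if s[0] > 90.0]
print(f"{len(cpu_bound)} of {len(samples)} samples were CPU-bound")
```

The header and the Average line are skipped because they do not have eight fields with "all" in the third position.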

Signature

http://www.mgogala.com

JEDIDIAH - 30 Jan 2006 17:08 GMT
>> But you will be surprised how few companies actually
>> test their DR plan...
[quoted text clipped - 15 lines]
>
> My database wasn't actually restored, only the backup was validated.

    You trust Oracle far too much. The backup isn't validated until that
database is running again and the users' applications are successfully using it.
There's nothing like actually doing something to prove that it can be done.

[deletia]

Signature

    If you think that an 80G disk can hold HUNDRENDS of           |||
hours of DV video then you obviously haven't used iMovie either.  / | \

Joel Garry - 30 Jan 2006 22:02 GMT
> >> But you will be surprised how few companies actually
> >> test their DR plan...
[quoted text clipped - 19 lines]
> database is running again and the users applications are successfuly using it.
> There's nothing like actually doing something to prove that it can be done.

The backup is validated.  What is needed is for the _restore_ to be
validated, and clarity on whether we are talking about validating
procedures or actual restores.

Validating the backup shows that you have something that could be used
in a restore.  Even better is using the backup to restore to an
off-production environment.  There's always going to be some difference
between what you can test and reality.  The amount of difference is
directly related to cost, and that's where service level agreements and
management decision making come into play.

I think we've all seen questionable management decisions.  Technical
advisors need to be sure they have input to make these decisions more
reasonable.

jg
--
@home.com is bogus.
Game Over.
http://www.signonsandiego.com/uniontrib/20060130/news_lz1b30ac2.html
rcyoung - 28 Jan 2006 18:04 GMT
I find that most companies "fail" to do "real life" DR scenarios. Oh,
they do what looks fine "on paper" to fulfill some mandated
requirement, but nothing like a real-life recovery.  You really need to
run through the whole process, including recalls of media that may be
stored "off site", using alternate tape drives, etc.
Vince Laurent - 27 Jan 2006 16:47 GMT
The Oracle class on Backup and Recovery was one of the best I have
taken.  Not only do you learn how to deal with nearly 20 different
scenarios, the lab actually tests this knowledge.  Our instructor would
cause a failure and you would have to figure out which of the 20 it
was.  Good labs.

>Daniel,
>
[quoted text clipped - 15 lines]
>
>Tiffany

-----------------------------------------------------
Come race with us!
http://www.mgpmrc.org
Mladen Gogala - 27 Jan 2006 01:12 GMT
> What are various techniques to demonstrate successful restore in
> Oracle?
>
> for instance:  what kind of formal confirmation does oracle provide?

You can call Oracle support and ask them to come and certify your
database. If your database is well protected, they'll give you
Backup Secured Enterprise or BSE certification. Call Oracle
support and ask them about BSE certification of your database.

Signature

http://www.mgogala.com

dominica@gmail.com - 27 Jan 2006 02:46 GMT
Actually, I agree with Joel Garry, Daniel and the others.
We can always back up the database to tape, but we might NEVER get it
back.
Sometimes a tape goes bad after sitting there for a while.
I usually recommend that every company do at least one recovery test
every 5 months, even on a small DB.
And normally a DBA should test recovery for different recovery
scenarios (complete and incomplete recovery, full-DB recovery or partial
recovery).

Guess what?
I just had to do disaster recovery of my 3 production databases
last week (the largest one is 450 GB; the smallest is still 300 GB).
Actually I did a partial recovery: I only recovered the tablespace that I
wanted, but that tablespace is still 100+ GB.
Basically, one of the applications has a bug that nullified one important
column, and people didn't realize the column was gone until a month later.

So I had to get the Oracle hot backup and archive logs from last month
back from tape.

I was very stressed about it and stayed up late 3 nights in a row
to recover all 3 databases in a TEMP environment, so that
the developer could restore that column in those 13-million-row
tables from the recovered DBs.

The funny thing is, I don't do recovery that often, but my current
job has lately needed a lot of recovery, even log mining to find
the deleted rows.
Still, it has become very good experience.

Dominica
Mladen Gogala - 27 Jan 2006 06:45 GMT
> Actually, I agree with Joel Garry and Daniel..and so on...
> We could always backup the database in tape, but we might NEVER get it
> back.

That's called a black hole backup. It's essentially equivalent to
doing backup to /dev/null. On the plus side, you will never run
out of space, but you might have a problem with restoring it.

> Sometimes, tape could be bad after leaving there for a while.
> And I usually recommend every 5 months, every company should do at
> least one
> recovery test , even on a small DB .

5 MONTHS? Are you sure that the tape is ripe enough for a recovery
test after only 5 months? You must let solar flares and electromagnetic
storms take their natural course, as Simon Travaglia would say on
certain occasions. Restoring a 5-month-old backup makes a lot of business
sense; I'm sure that numerous business analysts would be grateful to you
for providing them with a database less than half a year old, but I'd rather
let that tape mature for a few more months before attempting to verify it.

> And normally, an DBA should do/test recovery for different recovery
> scenario.

What is a "recovery scenario"?  DBA has to be able to restore the
database. DBA does backup of the database, DBA does backup of archivelogs,
DBA does a full export of RMAN catalog database afterward, preferably by
using scheduled scripts. Tapes then go to the tape duplicator where they
are duplicated and stored in two different locations. If and when the need
arises, DBA restores the database from a backup that is less than 24
hours old, if possible, and if it isn't, then from the newest available
backup. Testing recovery is done from the complete set of tapes, that's
it. In case of a failure, DBA has to know what to restore and where. If
you have a RAC with one or two standby databases, then you don't have a
failure. You lose one, you continue with another. The same database is
still available. Emergency restore to another server is done only when
everything is lost.
DR test is not a database recovery course to practice recovering
tablespace containing the precious EMP and DEPT tables. DR test is a
simulation of a disaster in which, typically, the whole data center is
lost. In other words, someone tells you that all your base are belong to
us and that you are on the way to destruction.

> (complete and in-complete recovery and full-db recovery or partial
> recovery).

I believe that management may have something to say about the partial
recovery. They might not be thrilled by missing data.

In addition to that, "BSE certification" was meant to bring a smile to
the faces of my Commonwealth friends, like Jonathan, Nuno and Niall. BSE
certification of a database would be an especially popular type of request in
the UK. A DBA asking for such certification would have to be an Oracle
Certifiable Person, OCP for short.

Signature

http://www.mgogala.com

dominica@gmail.com - 27 Jan 2006 20:34 GMT
Hi Mladen,

1)
Don't worry, I could do a full recovery (if I wanted to).
I have my own reasons for doing a partial recovery of only one tablespace.
I did not lose any data at all.

It is just because the developers want to see something in ONLY ONE
SINGLE TABLE that had one column nullified.
(There is a long story behind why it is a partial recovery.)

2) There is another reason why I restored a one-month-old backup
(this is a special requirement, long story again).
I recovered to another TEMP environment, not the production one.

And as for the "5 months" thing, I am not restoring a 5-month-old backup.
I am just saying that a test recovery once every few months (like
every 5 months) is good.
Of course, if you run a hot standby, you don't have that recovery
problem.
(I used to run a hot standby at another workplace, but not at my current
one.)

Dominica
Joel Garry - 27 Jan 2006 21:41 GMT
Dominica wrote:

>Of course, if you run hot-standby, you don't have that recovery problem.

Yes, you have _other_ recovery problems.  You need to test the standby
too.  Some people who have been burned switch it into readable mode
every day (or hour).  Beyond that, it needs to be fully tested
periodically with a full app switchover.

I found one place that was faithfully moving changed code to the
standby - but not changed shell scripts!

Another place, I discovered, was using a combination of backup software
and tape compression that would back up within the window, at the cost
of making it wayyyyy too slow to restore.  The software would make the
tape hunt all over for each [I don't know what, smaller than a file],
making the restore go on for days, rather than just blasting it all
back onto the DR machine.  Which is why to this day I'm kinda weird
about wanting the occasional cold dumb backup.

Mladen, we hate you _because_ we love you!  :-)  (You ought to take
that write-up, generalize it, and add it to a FAQ or the Dizwell wiki.)
Good point making the distinction between doing a DR plan right and
learning recovery techniques.  I have fallen into the "trying to answer
their question and oversimplifying the answer until it is wrong" trap
in this thread.

jg
--
@home.com is bogus.
"People are always going to find a way around us."
http://www.signonsandiego.com/uniontrib/20060127/news_1n27tunnel.html
Mladen Gogala - 28 Jan 2006 03:53 GMT
> Mladen, we hate you _because_ we love you!  :-)  (You ought to take
> that write-up, generalize it, and add it to a faq or the dizwell wiki).

This is actually a very good idea and I will try to write down something.
Thanks. I've also been experimenting with HTMLDB and I found it ideal for
creating a capacity plan. I will write down some facts about the DR plans
and a capacity plan.
I recently signed up for Howard's site and am delighted by the quality.
It's a shame that Howard is no longer active on this forum.

Signature

http://www.mgogala.com

 