Coping with Failure
An introduction to Disaster Recovery
40% - 75%
of businesses do not have a tested DR plan in place
Not all disasters would make
Data Corruption 2:
Obvious “disasters” aren’t the most likely…
"What was the cause(s) of your most significant disaster declaration(s) or major business disruption?"
Based on 94 global disaster recovery decision-makers and influencers
Source: Forrester/Disaster Recovery Journal November 2013 Global Disaster Recovery Preparedness Online Survey
A how to guide
4 steps to Disaster Recovery Planning
To recover you need to…
Know your business
Assess the risks
Document the plan
Test and review the plan
1. Know your business
What keeps the lights on in your business?
Understand the systems and processes that are critical to your business
Identify the key stakeholders responsible for these systems and processes, and engage with them
Determine your RTO and RPO for these critical business systems
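As a minimal sketch of that last point, the check below compares how often a system is backed up, and how long its last rehearsed recovery took, against its targets. It assumes the usual definitions (RTO, recovery time objective: the longest tolerable outage; RPO, recovery point objective: the largest tolerable window of lost data). The system name, field names and figures are all hypothetical.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class SystemTargets:
    name: str
    rto: timedelta              # longest tolerable outage
    rpo: timedelta              # largest tolerable window of lost data
    backup_interval: timedelta  # how often data is actually backed up
    tested_recovery: timedelta  # restore time measured in the last DR test


def gaps(system: SystemTargets) -> list[str]:
    """Return a description of every target the system currently misses."""
    problems = []
    # Worst case, everything since the last backup is lost.
    if system.backup_interval > system.rpo:
        problems.append("backups are too infrequent to meet the RPO")
    if system.tested_recovery > system.rto:
        problems.append("the tested recovery takes longer than the RTO")
    return problems


# Hypothetical example: an order system that is backed up nightly.
orders = SystemTargets(
    name="order system",
    rto=timedelta(hours=4),
    rpo=timedelta(hours=1),
    backup_interval=timedelta(hours=24),
    tested_recovery=timedelta(hours=3),
)
for problem in gaps(orders):
    print(f"{orders.name}: {problem}")
```

A system with an empty list of gaps is meeting both targets on paper; only a test of the plan shows whether it does so in practice.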
2. Assess the risks
Ask the “What if…” questions to tease out the risks to your business
Use available tools to assess the criticality of your business systems
The risk matrix is a useful guide for assessing risks
Assess the risks | The Risk Matrix
Very High – treat immediately
High – treat as soon as possible
Medium – treat when feasible
Low – do nothing
Very low – do nothing
3. Document the plan
A documented plan stops the panic
Makes sure people know their responsibilities
A documented plan will help you recover quicker
Keep the plan up-to-date
Keep a copy off site!
4. Test and review the plan
Put the plan into practice
Make sure it works as expected
Modify it accordingly
Review it regularly
Keep it up-to-date
49% of businesses have not validated the recovery readiness of critical suppliers, partners or other 3rd parties
Source: Forrester/Disaster Recovery Journal Business Continuity Preparedness Survey, Q4 2014. 175 BC decision makers and influencers
Make sure the readiness of your critical partners is factored into your DR plans
Typical disasters are more mundane than you think
Having a plan will allow you to handle disasters better
Ask the “What if…” questions when determining what’s critical to your business
Use tools like the probability matrix to identify the focus of your plan
Document the plan, share it, update it, and keep it safe!
Make sure that business critical partners have a DR plan, and that it works with yours.
We all have to be prepared to cope with failure, but when it comes to technology, failure is a hard thing to overcome.
For many businesses it can mean a loss of revenue and a damaged reputation. It doesn’t just affect websites; it’s about your applications, network, data and connectivity.
Any interruption of services for a business can affect revenue, reputation and productivity.
Depending on which research you look at – and there’s a lot – the proportion of businesses without a DR plan ranges from 40 to 70%.
Even at the lower end this is a baffling statistic.
I asked myself “Why aren’t businesses planning for disasters?”
I blame Hollywood for this inertia, as people’s perception of what a “disaster” is has been skewed by those classic disaster movies of the 1970s.
If you are from my generation, when you think of ‘Disaster’ you think of these classic movies.
So when you think about disaster recovery planning, you may be forgiven for thinking “What’s the point?” After all, what is the chance of your business being knocked out by an earthquake, an avalanche or even a meteor?
However, these aren’t the real disasters…
If Hollywood did movies about real business disasters, we’d be flocking to the cinemas to see titles like…
The reality is that a lot of businesses don’t have a DR plan because they don’t see disasters as something likely to affect them. However, they may be looking at the wrong disasters. The reality of disasters is different from the Hollywood view.
Most business disasters are attributed to more mundane events such as technical failures and the odd human error…
**Explain the table**
So what measures can you take to protect your business and plan for disasters?
There are many schools of thought around disaster recovery planning, but all of them boil down to four core areas, and I’ll discuss these in a little detail over the coming slides.
At a high level this four-step approach is about:
Knowing your business – identifying the key components which are important to the day-to-day operation of your business
Assessing the risks to these components
Documenting a plan which details these key components, the risks posed to them, and your plan of action to mitigate them
Testing and reviewing the plan to ensure that it works and that it remains current and up-to-date
Understand what’s critical to your business’s ongoing operations…
Identify the risks
This involves assessing the impact, causes and likelihood of disruption to the process. It is addressed by asking the following questions:
What would the impact be if this process could not be carried out for 4 hours?
Considerations should include:
Financial costs incurred e.g. penalties or fines; reimbursement or compensation; unfavourable credit terms; loss of business
Operational impact: additional time taken to carry out tasks; limited access to facilities or expertise; limited access or lower level of service from suppliers
Personnel impact: additional working hours or stressful situations experienced by staff; personal liabilities incurred by directors; health and safety
Reputation: bad press, loss of customers or goodwill, community
When we come to map out the risks (see next section), these impacts are rated on a scale of 1-5 as follows:
1 Very low (barely noticeable and easily absorbed by the business)
2 Low (noticeable but easily absorbed by the business)
3 Medium (disruptive for short term)
4 Sizeable (disruptive for medium term but can be absorbed long term)
5 Substantial (highly disruptive and long recovery time)
What could the causes be for such a failure? Considerations should include:
People - how skilled, knowledgeable or experienced are the people currently responsible for the tasks? What might cause them to be unavailable to do their tasks?
Tools – how easy is it to misuse the tools? What might cause the tools, systems or networks to be unavailable for use?
Information – what could prevent access to the information needed? What could cause it to become corrupted, lost or unavailable?
Processes – are there any processes (or suppliers) that it relies on or that trigger it?
Other – what environmental conditions could prevent the process being followed?
What are the risks to these critical business and IT systems?
The probability matrix is a useful guide for assessing risk
Evaluate current controls and residual risk
Now that we have the risks, we seek to understand their likelihood by asking the following questions:
What do we currently have in place to protect against this problem? Considerations should include:
People – how many people have the skills, knowledge or experience to carry out the tasks needed? How skilled, knowledgeable or experienced are the people currently responsible for the tasks?
Tools – how much redundancy is built into the tools, systems and networks needed for the process to work?
Information – how many places is it stored, how easily is it accessed, how reliable is it?
Processes – are there any processes (or suppliers) that it relies on and do they have any resilience or alternatives?
Other – how quickly could we identify a failure when it occurs? How often do we test and prove the current controls?
And so what is the likelihood (residual risk) for each of these possible causes? These are mapped out from 1-5 as follows:
1 Highly unlikely (not happened before, good protection in place, rare events needed for it to occur)
2 Unlikely (not happened before, good protection in place)
3 Possible (has happened before, adequate or improved protection in place)
4 Probable (has happened before, no changes in protection)
5 Inevitable (happens often or no protection in place)
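The two 1-5 scales can be combined into a single rating. The sketch below multiplies impact by likelihood and maps the score onto the five ratings of the risk matrix. The multiplication and the threshold values are assumptions made for illustration, as the matrix's actual cell boundaries are not given here.

```python
# Treatment for each rating, as listed on the risk matrix slide.
TREATMENT = {
    "Very High": "treat immediately",
    "High": "treat as soon as possible",
    "Medium": "treat when feasible",
    "Low": "do nothing",
    "Very low": "do nothing",
}

# ASSUMPTION: these score thresholds are illustrative, not taken from the matrix.
THRESHOLDS = [(20, "Very High"), (12, "High"), (6, "Medium"), (3, "Low")]


def rate(impact: int, likelihood: int) -> tuple[str, str]:
    """Map a 1-5 impact and a 1-5 likelihood to (rating, treatment)."""
    for value in (impact, likelihood):
        if not 1 <= value <= 5:
            raise ValueError("impact and likelihood must be between 1 and 5")
    score = impact * likelihood  # ASSUMPTION: simple product of the two scales
    for minimum, rating in THRESHOLDS:
        if score >= minimum:
            return rating, TREATMENT[rating]
    return "Very low", TREATMENT["Very low"]


# Hypothetical risk: substantial impact (5) that is probable (4).
print(rate(5, 4))
```

Whatever thresholds you settle on, the point is to apply them consistently so that every risk on the list is rated the same way.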
When you’ve looked at what’s important to the survival of your business, and identified the risks posed to them, the likelihood is that you’ll have a pretty long list of business critical components mapped against an equally long list of risk scenarios.
This can be quite daunting as all of a sudden you are faced with the makings of a behemoth BCP.
A useful way of regaining control over the creation of your plan and trimming it down to its essentials is to use the probability matrix.
This will allow you to really think about a risk to a particular business component and assess its likelihood against its impact.
The probability matrix splits into four sections, and by plotting the things you’ve identified as business critical into each of these areas you’ll soon identify what really is important to your business.
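One way to picture the four sections is as the quadrants formed by low or high likelihood against low or high impact. In the sketch below, the cut-off of 3 on each 1-5 scale, the quadrant labels and the sample entries are all assumptions for illustration.

```python
from collections import defaultdict

CUTOFF = 3  # ASSUMPTION: scores of 3 and above count as "high"


def quadrant(impact: int, likelihood: int) -> str:
    """Name the quadrant for a pair of 1-5 scores (labels are hypothetical)."""
    impact_band = "high impact" if impact >= CUTOFF else "low impact"
    likelihood_band = "high likelihood" if likelihood >= CUTOFF else "low likelihood"
    return f"{impact_band}, {likelihood_band}"


# Hypothetical entries: (component and risk, impact, likelihood).
risks = [
    ("email: server failure", 4, 3),
    ("website: data centre fire", 5, 1),
    ("printer: toner runs out", 1, 4),
    ("intranet: meteor strike", 2, 1),
]

by_quadrant = defaultdict(list)
for name, impact, likelihood in risks:
    by_quadrant[quadrant(impact, likelihood)].append(name)

# Entries that are high on both scales are where the plan should focus first.
for label, names in by_quadrant.items():
    print(f"{label}: {', '.join(names)}")
```

Grouping the long list this way makes it easier to see which entries belong in the plan and which can safely be left out.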
One of my earlier slides stated that between 40% and 75% of businesses don’t have a DR plan. That’s not strictly true: they do have a DR plan… it’s just not a very good one…
A documented plan stops the panic.
Having a DR plan in place is not enough. You need to constantly test it and review it to make sure that it is relevant and working.
Put the plan into practice to make sure it works as expected
Review it regularly and keep it up-to-date
It’s not just your DR plan that is important to your business; it’s the DR plan of your critical third-party partners too. Do the organisations that you rely on for your day-to-day business operations have a DR plan in place themselves?
49% of respondents have not investigated or validated the recovery readiness of their critical suppliers, partners and other 3rd parties
Use tools like the probability matrix to map your critical systems
Consider Cloud DR as a means of delivering better RTOs and RPOs
Make sure that any third party that supports your business has a DR plan, and that it works with yours.