Prepare for disaster

Governance | Jay Patel | 24 Jun 2010

Being properly prepared when disaster strikes can make all the difference - and save lives, says Jay Patel.   

One of the main hurdles many organisations face when implementing a disaster recovery plan (DRP) or business continuity plan (BCP) is quite simply how to start. Common sense tells us to break the work down into smaller, less daunting steps so that it can be completed more efficiently.

The most important, and quite often the most expensive, resource for any organisation to consider is its people. Assets such as property, materials and IT hardware or software can be replaced; people cannot. It is therefore advisable to ensure the safety of all employees before starting anything else. When putting a disaster recovery plan in place, it is important to realise that when a real disaster strikes, some of the key employees in charge of your business-critical areas may be away from work, on holiday or off sick. So you’ll have to take this into consideration and have a plan B.

Consider your people first and foremost and always expect the unexpected. 

BCM Programme Management

Getting started with a disaster recovery plan

Impact and risk analysis – Identify mission-critical systems first

Recognise which systems are critical to running the business and concentrate on protecting those as a matter of priority. Understand the level of risk and impact, as not all systems require equal levels of cover. To understand where a higher level of protection is needed, perform a risk assessment, then identify and prioritise the systems that are mission-critical. Ask questions such as: ‘What keeps your organisation going? Is it email? Accounts? Databases?’

Get all critical data off-site

In the event of a server failure or an unplanned outage you will be able to recover data if you have a copy stored off-site. Even if your data has to be restored to another location, at least it will be available. Remember that your data is your second most valuable asset, after your people.

Understand the cost implication of downtime

This will enable you to understand which areas of your business require what levels of protection. Remember that while the unavailability of some systems may not create a large commercial impact, there may be legal or reputational impact if they are not available or recoverable.
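As a rough illustration of the comparison described above, the sketch below estimates the cost of an outage for a few systems and ranks them. The function name `downtime_cost`, the system names and every figure are hypothetical, chosen only to show the arithmetic; real numbers would come from your own business impact analysis. The `penalties` parameter is an assumed stand-in for legal or contractual costs, and reputational impact is not modelled.

```python
def downtime_cost(lost_revenue_per_hour, idle_staff_cost_per_hour,
                  hours_down, penalties=0.0):
    """Estimate the cost of one outage (all inputs are illustrative)."""
    hourly = lost_revenue_per_hour + idle_staff_cost_per_hour
    return hourly * hours_down + penalties

# Hypothetical systems and figures, not taken from any real organisation.
systems = {
    "accounts": downtime_cost(500, 200, 8),
    "email": downtime_cost(100, 300, 4),
    # Little commercial impact per hour, but an assumed legal penalty.
    "regulatory reporting": downtime_cost(0, 50, 24, penalties=10000),
}

# Rank systems from most to least costly outage.
for name, cost in sorted(systems.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {cost:,.0f}")
```

In this made-up example the system with the smallest hourly commercial impact ends up at the top of the list once the penalty is included, which is the point the paragraph above makes about legal impact.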

Tape backup is not always adequate

Tape backup is by far the most common way of protecting and recovering data, but it may well not be adequate for all of your applications. Tape is acceptable for long-term archival and recovery, but it can take a long time to rebuild a system entirely from tape.

Options other than tape

When tape backup is not sufficient, consider alternatives such as real-time replication to an off-site facility, either managed by a third-party provider or implemented in-house if a secondary site is available. Note that off-site replication services do not remove the need to maintain your routine tape backup procedures. Real-time replication solutions can reduce data loss to near zero, which allows immediate system fail-over and data availability. Various flavours of replication software are available on the market. Hosted replication is often the most cost-effective and flexible option, as it can work with many types of storage technology, integrate with your IT infrastructure and provide an excellent disaster recovery solution.

Make your disaster recovery plan a part of your normal working routine

Plan for the different types of outage, including a simple hard disk failure, system hardware failure, a software malfunction, human error, a virus or spam attack, a building outage, a regional power failure, environmental disasters, and natural disasters such as hurricanes and river floods. Ensure procedures are well documented and made available to everyone. All staff members have an important responsibility in the event of a disaster, so it is important that they all understand their roles during a crisis.

Disaster recovery plan or business continuity plan – what’s the difference?

These two plans should not be confused. A disaster recovery plan (DRP) is specifically for IT systems within a specific location or a few locations. A business continuity plan (BCP) is normally thought of as the comprehensive corporate plan. It is a holistic management process that identifies potential threats to an organisation and the impacts on business operations that those threats, if realised, might cause. It also provides a framework for building organisational resilience with the capability for an effective response that safeguards the interests of the organisation’s key stakeholders, reputation, brand and value-creating activities.

BCM Overview

Creating a realistic business continuity plan can take anything up to nine months, so during this time a disaster recovery solution should be implemented to minimise any business disruption, such as off-site data replication, off-site server mirroring or off-site backup.

Disaster recovery planning

Each organisation will differ, so your disaster recovery plan should be tailored to your requirements and to the importance you place on your data. Performing a business impact analysis and risk assessment will identify the requirements of the organisation and steer it towards the creation of a disaster recovery plan. Try to visualise what is required at various levels of disaster or incident, as different situations will invoke completely different procedures. For example, a server or hard disk failure would invoke a different recovery procedure from a fire or explosion that could destroy an entire building.

Don’t be afraid to start implementing

Do not wait until you have a complete DRP or BCP in place to start protecting your organisation. Get your data off-site to a different physical location. Storing a copy of your critical data at a remote facility will allow you to recover and get back to business quickly. Your daily and weekly backup tapes can easily be stored off-site and remain accessible when recovery is needed.

Two key factors to understand when determining priorities are recovery point objective (RPO) and recovery time objective (RTO). RPO is the point in time to which data must be recoverable after an incident, in other words the maximum amount of recent data the organisation can afford to lose. For certain applications, recovering data from yesterday or even last week might be sufficient, so the RPO would be days or weeks. For applications and data where any loss is unacceptable, an RPO of minutes or less is applicable.

While RPO defines how much data loss is tolerable, RTO defines how quickly that data and the systems that use it must be recovered. RTO is the maximum amount of time the application can be down and unavailable to users or customers.
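To make the two objectives concrete, here is a minimal sketch that checks whether a system's current protection meets its RPO and RTO. It assumes the worst case for data loss is one full interval between recoverable copies, and that an estimated restore time is known. The `System` class, the `gaps` function, the system names and all durations are hypothetical examples rather than recommended values.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class System:
    name: str
    rpo: timedelta              # maximum tolerable data loss
    rto: timedelta              # maximum tolerable downtime
    backup_interval: timedelta  # how often a recoverable copy is taken
    restore_time: timedelta     # estimated time to restore service


def gaps(system):
    """Return the objectives that the current protection fails to meet."""
    problems = []
    # Worst case: the failure happens just before the next copy is taken,
    # so one full backup interval of data is lost.
    if system.backup_interval > system.rpo:
        problems.append("RPO")
    if system.restore_time > system.rto:
        problems.append("RTO")
    return problems


# Hypothetical systems: nightly tape for email, daily copies for an archive.
systems = [
    System("email", rpo=timedelta(hours=1), rto=timedelta(hours=4),
           backup_interval=timedelta(hours=24), restore_time=timedelta(hours=8)),
    System("archive", rpo=timedelta(days=7), rto=timedelta(days=2),
           backup_interval=timedelta(days=1), restore_time=timedelta(hours=12)),
]

for s in systems:
    print(s.name, gaps(s) or "meets objectives")
```

In this example the nightly backup fails both objectives for email, which is the kind of system where replication is worth considering, while the same regime is adequate for the archive.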

BCM disaster timeline

Testing saves lives

It is imperative to test your plan: a plan is only as good as it proves to be when it is actually invoked. Once the plan is completed, it is crucial that it is tested adequately to ensure that it actually works in a real disaster. Testing the DR and BC plans also provides excellent training for everyone in your organisation.

I recall the tragic events of 11 September 2001, when organisations’ disaster recovery plans were initiated. Suddenly staff had to recall the evacuation plan. For organisations that regularly practised the evacuation drill, evacuating the building was seamless, and some staff even made their way to their disaster workplace recovery offices.

For the handful of companies that didn’t have a disaster recovery plan or didn’t practise their fire evacuation drill, panic took over. Some staff made their way upward toward the roof of the Twin Towers in the hope of helicopter rescue, but the roof access doors were locked. No plan existed for helicopter rescues, and on 11 September the thick smoke and intense heat would have prevented helicopters from conducting them.

Regular testing is recommended so that people at all levels in the company know what to do in an emergency and are aware of the role they play in an invocation scenario.

Tests should be scheduled quarterly or at least every six months in order to take account of any new staff or system changes. For Morgan Stanley top executive Robert Scott, who helped his company survive the heavy toll from 11 September, one leadership lesson is particularly clear. “If you wait for a crisis to begin to lead, it’s too late,” said Scott.

Scott said that 32 years on Wall Street did little to prepare him for the terrorist attacks. But he found that a range of factors, from disaster contingency plans to the actions of well-trained managers, enabled Morgan Stanley – the largest tenant in the World Trade Centre – to come through the disaster with relatively little loss of life. Six of Morgan Stanley’s 3,700 employees died in the attacks.

In the 20 minutes between the first and second plane crashes, Morgan Stanley had implemented an evacuation plan put into place after the 1993 terrorist attack on the World Trade Centre. “It turned out we had most of our people off the high floors before the second plane hit,” Scott said.

Meanwhile, employees in charge of operations, having been drilled in what to do in the event of disaster, walked 22 blocks to Morgan Stanley’s backup site and turned on the computers. “By 9.20am, the backup site was activated,” said Scott. “By 9.30am senior management had relocated to another site that became our command facility.” Lessons from disaster: Preparedness counts.

Summary

The narrative above provides basic information for building a disaster recovery plan. It is important to realise that while a complete DRP or BCP may take considerable time, there are measures that can be put in place immediately, such as off-site data replication, off-site server replication and off-site data backup. These are extremely cost-effective these days and, if implemented correctly, can enhance the IT support function as well. In essence your DR plan is a portion of the larger BC plan. Do not leave your systems, and thus your organisation, at risk during the BC planning process; take immediate action to safeguard them and seek assistance from experts in this area.

Jay Patel is managing director at Newton IT
