Disaster recovery, downtime and high availability – how is your organisation affected?

Friday 2 October 2020 | Data Insight & Analytics, Project Management, Workforce Management

By Miles Reucroft

System downtime and outages are costly for businesses of all sizes. The biggest companies suffer eye-watering costs when their systems go down: in 2019 a 14-hour system outage cost Facebook $90m, and in 2016 a five-hour system outage cost Delta Airlines $150m, with 2,000 flights cancelled. With the business costs potentially so high, it is worth carefully considering the impact that any system outage may have on you.

How does your organisation organise its IT infrastructure? Some outages are inevitable, for example when your systems and software need upgrading, but with careful planning the impact of such downtime can be minimised. Planned downtime will be made clear in your service level agreement with your hardware or software provider. Unplanned outages are mitigated through the disaster recovery plans for the respective system.

Minimise disruption

Business continuity (BC) ensures that backups are in place, that data is backed up at regular, logical points, and even that physical backup locations are designed to mirror the current working environment. Disaster recovery (DR) covers unforeseen scenarios ranging from loss of power (anywhere from a single server to a complete datacentre), through hardware damage or loss caused by fire or theft, to prolonged system failure. Every organisation should have such disaster recovery scenarios covered in its business continuity planning.

Aligning your software solution(s) to your BCDR (business continuity and disaster recovery) plan minimises disruption by ensuring minimal changes to your existing tried and tested disaster recovery processes.

This is where recovery point objectives (RPOs) and recovery time objectives (RTOs) are factored into the BCDR and high availability equations.

RTO is the maximum acceptable time after an incident within which your system must be recovered and restored. This interlinks with the RPO, which defines the maximum period of data that it is tolerable to lose. For example, if a system with an RTO of three hours goes down, your organisation should expect to be down for no more than three hours. If the RPO is also three hours, then when the system is back up, the most that could be lost is the data from the three hours before the incident occurred.
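As a rough illustration of how the two objectives are checked after an incident, here is a minimal Python sketch. The function name, the timestamps and the three-hour targets are assumptions made up for this example; they do not describe any particular product or service level agreement.

```python
from datetime import datetime, timedelta


def check_recovery(last_backup, incident, restored, rto, rpo):
    """Compare one outage against its RTO and RPO targets (illustrative only)."""
    downtime = restored - incident             # how long the system was unavailable
    data_loss_window = incident - last_backup  # work done since the last good backup
    return {
        "downtime": downtime,
        "data_loss_window": data_loss_window,
        "rto_met": downtime <= rto,
        "rpo_met": data_loss_window <= rpo,
    }


# Hypothetical incident: last backup at 09:00, failure at 11:30, service restored at 14:00.
result = check_recovery(
    last_backup=datetime(2020, 10, 2, 9, 0),
    incident=datetime(2020, 10, 2, 11, 30),
    restored=datetime(2020, 10, 2, 14, 0),
    rto=timedelta(hours=3),
    rpo=timedelta(hours=3),
)
print(result)
```

In this made-up case the system was down for two and a half hours and two and a half hours of data were at risk, so both three-hour objectives are met.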

Keeping your business running

The shorter the time frames for RTOs and RPOs, the higher the associated costs. Having an RTO and RPO close to zero (i.e. no downtime and zero data loss) would effectively require ‘real-time backup’, with transactions replicated to an offsite system.

The same is true of high availability. Organisations require high availability so that system outages are kept to an absolute minimum. For certain organisations, such as public sector bodies, transport operators, healthcare and care service providers, and hospitals, as well as private companies specialising in data, high availability is imperative. The higher the availability, however, the higher the cost, so it is important to consider all factors when implementing such measures.

Availability is measured in nines, from one nine, or 90%, to nine nines, or 99.9999999%. One nine equates to 36.53 days of downtime per year; nine nines equates to 31.56 milliseconds. The level of availability your organisation requires depends on how much downtime it can tolerate and the impact of that downtime on your operations. With the possibility of conducting planned and scheduled upgrade works over weekends, keeping any disruption away from most of the workforce, many firms can accept a lower level of availability in return for reduced costs.
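The downtime allowed by a given number of nines is simple arithmetic: the unavailable fraction of the year multiplied by the length of the year. The short Python sketch below shows the calculation, assuming a 365.25-day year (the function and constant names are illustrative).

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # assumes a 365.25-day year


def downtime_seconds_per_year(nines: int) -> float:
    """Maximum downtime per year, in seconds, for an availability of `nines` nines."""
    unavailability = 10 ** -nines  # e.g. 3 nines -> 99.9% available -> 0.001 unavailable
    return SECONDS_PER_YEAR * unavailability


for n in (1, 2, 3, 4, 5, 9):
    seconds = downtime_seconds_per_year(n)
    print(f"{n} nine(s): {seconds / 86400:.4f} days = {seconds * 1000:.2f} ms per year")
```

On this basis one nine allows roughly 36.5 days of downtime a year, three nines allow under nine hours, and nine nines allow about 31.56 milliseconds.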

Therefore, it is important for software and service providers to engage with your overall BCDR strategy. For example, for the companies that we supply our Cygnum software to, we understand that we are only one component of their overall BCDR equation. It is, therefore, vital that we engage with the customer to understand their BCDR processes so that we can dovetail our solution and service level agreements with them. If BCDR processes are not aligned across a customer's technology solutions, then in the event that disaster recovery is required, inconsistent RPOs and RTOs will result in an inconsistent resumption of business as usual.
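One way to see why inconsistent objectives matter is to treat business as usual as depending on every component being restored. The Python sketch below makes that simplifying assumption, with components recovering in parallel; the component names and figures are hypothetical and are not taken from any real deployment.

```python
from datetime import timedelta

# Hypothetical components of one organisation's technology stack.
components = {
    "scheduling_app": {"rto": timedelta(hours=3), "rpo": timedelta(hours=1)},
    "database": {"rto": timedelta(hours=3), "rpo": timedelta(hours=1)},
    "payroll_export": {"rto": timedelta(hours=8), "rpo": timedelta(hours=24)},
}


def effective_objectives(components):
    """Worst-case RTO and RPO when every component is needed for business as usual."""
    rto = max(c["rto"] for c in components.values())  # slowest recovery sets the pace
    rpo = max(c["rpo"] for c in components.values())  # oldest recovery point sets the data gap
    return rto, rpo


effective_rto, effective_rpo = effective_objectives(components)
print(f"Effective RTO: {effective_rto}, effective RPO: {effective_rpo}")
```

Under these assumptions, two components with three-hour RTOs still leave the organisation waiting eight hours, and reconciling up to a day of data, because of the one component with looser objectives.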

What’s right for your business?

We have recently seen an increase in demand across our customer base for highly available solutions. Those operating in care and transport, for example, require higher availability than those operating in less sensitive areas. Cygnum 2020 can now be installed with real-time site-to-site transaction processing to give a true highly available solution.

It’s all about striking a balance and finding what is required in your business. Would losing an hour’s worth of data be acceptable? Would losing three? Or five? Similarly, does your system need to be fully functional all the time, or can you work around planned and scheduled downtime?

Working with you, we can propose the best solution for implementing, maintaining and upgrading Cygnum in line with your technology stack, your BCDR processes and how you run your business. As our high availability and hosting solutions have evolved in line with client demand, they have become more adaptable and scalable, meaning that higher availability solutions are more readily and affordably available than they have been in the past.

Depending on your needs, we can work with you to ensure that our service level agreement fits your requirements and budget.

