Business Recovery: The Forgotten IT Metric

Business recovery is a critical metric for measuring a data center's IT effectiveness. But it's often overlooked in favor of traditional KPIs.

Traditionally, KPIs such as rate of change, service quality, and the speed of updates have formed the backbone of measuring IT effectiveness. Today, I’m urging IT execs to consider another business-critical yet often overlooked IT metric: business recovery.

The CrowdStrike outage in July 2024, which impacted an estimated 8.5 million Microsoft Windows devices and resulted in over $1 billion in losses, exposed vulnerabilities in resilience planning for data centers worldwide.

Clearly, operational recovery has become more critical than ever. Yet, time to recover is often overlooked and can vary greatly depending on the technologies data centers have in place. In this article, we’ll cover key metrics to help evaluate business recovery effectiveness. You’ll also learn why the tools and strategies you have in place can dramatically affect time to recovery. 

But let’s back up for a minute and cover the basics.

What Is Business Continuity? 

Business continuity refers to an organization’s ability to maintain operations during and after a disaster or other disruption (e.g., a ransomware or other cyberattack). It should be a broad, organization-wide discipline that includes but is not limited to: 

  • Risk assessment and business impact analysis 

  • Data backup and disaster recovery (DR) technology and strategy 

  • Documentation of employee roles and responsibilities 

  • IT security technology and strategy 

  • Redundant infrastructure / power

  • Physical plant / HVAC 

  • Physical security

  • Vendor / supply chain management 

However, in a recent study by LevelBlue, formerly AT&T Cybersecurity, 69% of respondents said cyber resilience is not a whole-organization priority.

If this is true in your organization, engaging key stakeholders across the organization and involving them in business continuity and/or cyber resilience planning should be considered your first priority.

The scope of a data center business continuity plan will be dictated by an organization’s specific recovery requirements. However, backup and DR strategy and technology are fundamental to recovering normal business operations following an outage.

So, let's look at that piece in a little more detail. 

Three Key Business Recovery Metrics 

To minimize the business impact of IT downtime, you need to understand exactly how long it takes to restore business operations following an outage. The following metrics are essential for data center managers to accurately set recovery timelines: 

  1. Recovery Time Objective (RTO): The maximum acceptable time to restore systems after a disruption. 

  2. Recovery Point Objective (RPO): The maximum amount of tolerable data loss, expressed as a span of time (which in turn sets the maximum acceptable interval between backups). 

  3. Mean Time to Recover (MTTR): The average time taken to restore full system functionality. 
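
As a rough illustration of how the three metrics above relate, here is a minimal Python sketch that computes MTTR from past incidents and checks each incident against an RTO and RPO. The `Incident` record, its field names, the sample timestamps, and the target values are all hypothetical and chosen only for the example; real figures would come from your own incident logs and business impact analysis.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    """One outage record. Field names are illustrative assumptions."""
    outage_start: datetime  # when service was lost
    last_backup: datetime   # most recent usable recovery point before the outage
    restored_at: datetime   # when full functionality returned

    @property
    def recovery_time(self) -> timedelta:
        # Actual time to recover, to be compared against the RTO.
        return self.restored_at - self.outage_start

    @property
    def data_loss_window(self) -> timedelta:
        # Work done since the last backup is lost; compared against the RPO.
        return self.outage_start - self.last_backup


def mean_time_to_recover(incidents: list[Incident]) -> timedelta:
    """MTTR: the average recovery time across all recorded incidents."""
    total = sum((i.recovery_time for i in incidents), timedelta())
    return total / len(incidents)


def check_objectives(incident: Incident, rto: timedelta, rpo: timedelta) -> dict[str, bool]:
    """Report whether a single incident stayed within the RTO and RPO."""
    return {
        "rto_met": incident.recovery_time <= rto,
        "rpo_met": incident.data_loss_window <= rpo,
    }


# Made-up sample data and targets, for illustration only.
incidents = [
    Incident(datetime(2024, 3, 1, 9, 0), datetime(2024, 3, 1, 8, 30), datetime(2024, 3, 1, 11, 0)),
    Incident(datetime(2024, 6, 10, 14, 0), datetime(2024, 6, 10, 12, 0), datetime(2024, 6, 10, 18, 0)),
]
rto = timedelta(hours=3)
rpo = timedelta(hours=1)

mttr = mean_time_to_recover(incidents)
results = [check_objectives(i, rto, rpo) for i in incidents]

print("MTTR:", mttr)
for incident, result in zip(incidents, results):
    print(incident.outage_start.date(), result)
```

In this made-up data set, the first incident stays within both objectives while the second misses both, even though the average recovery time sits exactly at the RTO. That is one reason to track per-incident results alongside MTTR rather than relying on the average alone.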

Understanding these metrics is also the best way for data center managers to improve upon existing plans, justify investment in new technologies, prioritize workload recovery based on business requirements, and more. Note that these metrics will sometimes take different priority depending on your situation – it’s not always best to focus on RTO, despite common misconceptions.

Direct-to-Cloud Backup and DR 

As noted above, the backup and disaster recovery technologies you have in place can have a significant impact on time to recovery. Backup and DR technologies that take and store backups in a format that is easily mounted as a virtual machine have gained popularity over the last decade. These tools dramatically improve RTOs by giving data center managers a way to quickly and easily recover critical operations following an incident.

However, it is important to note that not all backup and disaster recovery tools are created equal.

For example, many of the earlier products that came to market rely on "image-based" backup. These tools get the job done, but they can be highly inefficient from both a storage capacity and data transfer standpoint.

That’s because image-based tools perform data deduplication at the hypervisor level, rather than the file (or even sub-file) level. As a result, they simply cannot deliver the data deduplication performance of other, more modern tools. We are not talking about a trivial difference, either. Image-based products can be up to 60x less efficient than other backup and disaster recovery tools.

This limitation is why image-based products require expensive on-site appliances to function properly. These appliances act as a staging area for backups before they are sent offsite, and they run recovery machines during an outage. This was standard operating procedure until threat actors began to target these devices specifically to limit organizations' ability to restore data and operations following a cyberattack. Not good.

Today, there are better options.

Modern Backup and DR Software

Modern backup and DR tools perform deduplication at the file or even sub-file level, delivering far better dedupe rates. This makes incremental backups extremely lightweight, enabling direct-to-cloud backups and eliminating the need for a local appliance. Increased dedupe performance also means you can take more frequent backups and store more recovery points to meet stringent RPOs without sending storage costs through the roof. 
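
To make the idea of sub-file deduplication concrete, here is a minimal Python sketch, not any vendor's implementation. It assumes fixed-size chunks and SHA-256 content hashes, and the `ChunkStore` class, the tiny 4-byte chunk size, and the sample data are hypothetical choices made for readability. Production tools typically use far larger chunks and more sophisticated chunking schemes.

```python
import hashlib

CHUNK_SIZE = 4  # bytes; deliberately tiny so the example is easy to follow


def split_into_chunks(data: bytes, size: int = CHUNK_SIZE) -> list[bytes]:
    """Cut the data into fixed-size pieces (the last piece may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


class ChunkStore:
    """Toy stand-in for a backup target that stores each unique chunk once."""

    def __init__(self) -> None:
        self.chunks: dict[str, bytes] = {}  # content hash -> chunk bytes

    def backup(self, data: bytes) -> tuple[list[str], int]:
        """Store data; return (ordered list of chunk hashes, bytes newly uploaded)."""
        recipe: list[str] = []
        uploaded = 0
        for piece in split_into_chunks(data):
            digest = hashlib.sha256(piece).hexdigest()
            if digest not in self.chunks:
                # Only chunks the store has never seen need to be transferred.
                self.chunks[digest] = piece
                uploaded += len(piece)
            recipe.append(digest)
        return recipe, uploaded

    def restore(self, recipe: list[str]) -> bytes:
        """Rebuild the original data from its ordered list of chunk hashes."""
        return b"".join(self.chunks[d] for d in recipe)


store = ChunkStore()
version_1 = b"AAAABBBBCCCCDDDD"
version_2 = b"AAAABBBBXXXXDDDD"  # one 4-byte region changed since version_1

recipe_1, sent_1 = store.backup(version_1)
recipe_2, sent_2 = store.backup(version_2)

print("first backup uploaded:", sent_1, "bytes")
print("second backup uploaded:", sent_2, "bytes")
print("restore matches:", store.restore(recipe_2) == version_2)
```

Here the first backup uploads all 16 bytes, while the second uploads only the 4 bytes that changed, yet both versions remain fully restorable. The same principle, at realistic scale, is what keeps incremental backups small enough to send straight to the cloud.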

When evaluating backup and DR tools, consider whether the product will make your life easier. For example, many backup products today offer immutable backup. However, the methods available to provide immutability vary widely. In some cases, additional hardware investment and ongoing retention management will be required. No bueno! Other products deliver always-on immutability with no need for extra management. Nice! 

Another critical consideration is the delivery model. Will you need to spend time installing updates and patching vulnerabilities, or will that happen for you in the background? Backup and DR tools delivered as software as a service (SaaS) take this stuff off technicians' plates, allowing them to focus on more critical tasks. There are many other considerations, but you get the idea. The backup and DR tools you choose can have a large impact on technicians’ day-to-day. Choose wisely, as technician efficiency = increased productivity = increased revenue. 

Remember: The IT landscape has changed – why hasn’t your backup and DR?

About the Author

Stefan Voss

VP of Product, Cove Data Protection by N-able

Stefan Voss is VP of Product at Cove Data Protection by N-able, where he helps internal IT customers and MSPs leverage cloud-first backup solutions for disaster recovery and business continuity.
