Data Center Downtime: (Expensive) Downer

Enterprises that run their own data centers do so at great financial risk.
Control has its costs — even above and beyond the hefty price of equipment, labor, infrastructure, power, maintenance, upgrades, spares, etc.
Indeed, the average cost of data center downtime is $7,900 per minute.
Ouch. Hope the uptime is worth it. If not, give AIS a call!
Below are a summary article by David Weldon in FierceCIO and the actual report findings from the Emerson Network Power website.
Emphasis in red added by me.
Brian Wood, VP Marketing

Downtime durations drop slightly, while costs rise steeply

Data center downtime is now costing organizations $7,900 per minute on average, according to new research.
The good news: downtime durations are dropping. The bad news: the average cost per incident is rising.
The Ponemon Institute has just released its latest study on the impact of downtime in the data center, and the findings show a 41 percent increase in cost impact since 2010. Ponemon surveyed 584 data center professionals for its latest study. Perhaps more disturbing than the financial toll of downtime: the reported number of data centers that have experienced unplanned outages in the past 24 months.
The study revealed that 91 percent of data centers have experienced outages, averaging two outages per organization with an average downtime of two hours. Still, the 91 percent figure does represent a slight improvement from three years ago, when 95 percent of surveyed data centers reported outages.
Based on the Ponemon Institute calculations, these two-hour outages cost organizations just over $900,000 on average, or more than $7,900 per minute, ZDNet reported.
In addition, partial outages (defined as the outage of one or more racks within the data center) had an average recovery time of less than one hour with an associated cost of approximately $350,000.
The Ponemon study said that the cost of data center downtime is growing due to the increased value of data center operations to companies.
“Given the fact that today’s data centers support more critical, interdependent devices and IT systems than ever before, most would expect a rise in the cost of an unplanned data center outage compared to 2010,” noted Larry Ponemon, chairman and founder of the Ponemon Institute. “However, the 41 percent increase was higher than expected. This increase in cost underscores the importance for organizations to make it a priority to minimize the risk of downtime that can potentially cost thousands of dollars per minute.”
Reporting on the Ponemon research findings, ZDNet noted that “it should come as no surprise that as cloud and network based services become more prevalent the impact on businesses when these services fail has become more pronounced.”
In addition to calculating the current cost of downtime per minute, other highlights of the study included:

  • The average reported downtime incident is 86 minutes, at an average cost of $690,020 (In the Ponemon Institute’s 2010 downtime report the numbers were 97 minutes for the average duration, and an average cost of $505,500).
  • A total data center outage now lasts for an average of 119 minutes, at an average cost of $901,500 (In the 2010 study the numbers were 134 minutes average duration at an average cost of $680,700).
  • A partial data center outage now lasts approximately 56 minutes at an average cost of $350,400 (in the 2010 study the numbers were 59 minutes in duration on average at an average cost of $258,000).
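The 2010 and 2013 figures in the list above can be compared with a little arithmetic. The following is a minimal Python sketch, not part of the study: the names `OUTAGES`, `pct_change` and `summarize` are made up for illustration, and the only data used are the durations and costs quoted in the bullets. Note that a simple cost-divided-by-minutes ratio need not match the study's headline per-minute figure, whose derivation is not given here.

```python
# Figures copied from the bullet list above: (duration in minutes, cost in USD).
# The dictionary layout and all names are illustrative assumptions.
OUTAGES = {
    "average incident": {"2010": (97, 505_500), "2013": (86, 690_020)},
    "total outage": {"2010": (134, 680_700), "2013": (119, 901_500)},
    "partial outage": {"2010": (59, 258_000), "2013": (56, 350_400)},
}


def pct_change(old, new):
    """Percentage change from old to new (negative means a decrease)."""
    return (new - old) / old * 100.0


def summarize(outages):
    """Return duration change, cost change and 2013 cost per minute per category."""
    rows = {}
    for name, years in outages.items():
        minutes_2010, cost_2010 = years["2010"]
        minutes_2013, cost_2013 = years["2013"]
        rows[name] = {
            "duration_change_pct": pct_change(minutes_2010, minutes_2013),
            "cost_change_pct": pct_change(cost_2010, cost_2013),
            "cost_per_minute_2013": cost_2013 / minutes_2013,
        }
    return rows


if __name__ == "__main__":
    for name, row in summarize(OUTAGES).items():
        print(
            f"{name}: duration {row['duration_change_pct']:+.1f}%, "
            f"cost {row['cost_change_pct']:+.1f}%, "
            f"${row['cost_per_minute_2013']:,.0f} per minute in 2013"
        )
```

Run as a script, this prints one line per outage category. In every category the duration change comes out negative and the cost change positive, which is the pattern the headline describes.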

The Ponemon study looked at 67 data centers with a minimum size of 2,500 square feet. It examined data center outage costs due to direct, indirect and lost opportunity factors.
Such factors include “damage to mission-critical data, the impact of downtime on organizational productivity, damage to equipment, legal and regulatory repercussions, and lost confidence and trust among key stakeholders,” according to Data Center Knowledge.
Certain industries were especially susceptible to downtime outages, the study noted. The four industries with the largest reported increases were: hospitality (129 percent increase); the public sector (116 percent increase); transportation (108 percent increase); and media organizations (104 percent increase).
“The most positive aspect of the report,” ZDNet noted, “is that in the three years between studies, datacenters, on the whole, have become more reliable, with both the number of incidents and their duration going down. But with the nature of the cloud-based business model, both public and private, the reliability of your data center servicers and the supporting infrastructure will become exponentially more important to the business bottom line.”


The Lowdown on Data Center Downtime: Frequency, Root Causes and Costs

In 2013, Emerson Network Power again partnered with the Ponemon Institute to update its Study of Data Center Outages. The two-part study found that although the frequency and duration of data center downtime events have slightly decreased, unplanned outages remain a costly line item for organizations.
The first part of the study, which surveyed more than 450 U.S.-based data center professionals and focused on the root causes and frequency of downtime, found that organizations are more aware of data center downtime and its potential consequences, and are increasingly taking action to prevent outages.
The second part of the study includes an analysis of 67 U.S. data centers with a minimum size of 2,500 square feet, delving into the direct, indirect and opportunity costs associated with data center outages.

Downtime costs are increasing

The Cost of Downtime study puts the cost of an unplanned data center outage at slightly more than $7,900 per minute, a 41 percent increase from $5,600 per minute in 2010. Total data center outages averaged a recovery time of 119 minutes, equating to about $901,500 in total costs.
Partial outages, or those limited to certain racks, averaged 56 minutes in length and costs were approximately $350,400.
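For a back-of-the-envelope estimate, the per-minute figures above can be plugged into a short calculation. The Python sketch below is illustrative only: the function names are hypothetical, and it assumes cost scales linearly with duration, which the study does not claim.

```python
# Per-minute costs in USD as quoted in the text above.
COST_PER_MINUTE_2010 = 5_600
COST_PER_MINUTE_2013 = 7_900


def pct_increase(old, new):
    """Percentage increase from old to new."""
    return (new - old) / old * 100.0


def estimate_outage_cost(minutes, cost_per_minute=COST_PER_MINUTE_2013):
    """Rough outage cost, ASSUMING cost grows linearly with duration."""
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    return minutes * cost_per_minute


if __name__ == "__main__":
    increase = pct_increase(COST_PER_MINUTE_2010, COST_PER_MINUTE_2013)
    print(f"Per-minute cost increase since 2010: {increase:.1f}%")
    print(f"Estimated cost of a 30-minute outage: ${estimate_outage_cost(30):,}")
```

Treat the result as a rough guide. For a 119-minute outage the linear estimate is $940,100, somewhat above the $901,500 average the study reports for total outages.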

Outages are less frequent

An overwhelming majority (91 percent) of 2013 Causes of Downtime survey respondents reported having experienced an unplanned data center outage in the past 24 months. This is a slight decrease from the 95 percent of respondents in the 2010 study who reported unplanned outages.
Regarding the frequency of outages, respondents experienced an average of two complete data center outages during the past two years. Partial outages, or those limited to certain racks, occurred six times in the same timeframe. The average number of device-level outages, or those limited to individual servers, was the highest at 11. The frequencies of complete and partial outages have declined slightly from the 2010 findings (complete: 2.5, partial: 7, device level: 10).

Root causes of downtime

The Cost of Downtime study reported the following total costs of unplanned outages by root cause:

  • IT equipment failure ($959,000)
  • Cyber crime ($882,000)
  • UPS system failure ($478,000)
  • Water, heat or CRAC failure ($517,000)
  • Generator failure ($501,000)
  • Weather incursion ($436,000)
  • Accidental/human error ($380,000)

Common causes of outages

Eighty-three percent of survey respondents in the Causes of Downtime study said they knew the root cause of the unplanned outage. The most frequently cited root causes of outages include:

  • UPS battery failure (55 percent)
  • Accidental EPO/human error (48 percent)
  • UPS capacity exceeded (46 percent)
  • Cyber attack (34 percent)
  • IT equipment failure (33 percent)
  • Water incursion (32 percent)
  • Weather related (30 percent)
  • Heat related/CRAC failure (29 percent)
  • UPS equipment failure (27 percent)
  • PDU/circuit breaker failure (26 percent)

Fifty-two percent believe all or most of the unplanned outages could have been prevented.

Minimizing cost and defying downtime

While eliminating downtime altogether is a difficult undertaking, the studies did identify common attitudes and behaviors for reducing costs:

  • Consider data center availability the highest priority above all others, including cost minimization and improving energy efficiency.
  • Utilize all best practices in data center design and redundancy to maximize availability.
  • Dedicate ample resources to get the data center back up and running in case of an unplanned outage.
  • Have complete support from senior management on efforts to prevent and manage unplanned outages.
  • Regularly test generators and switchgear to ensure emergency power in case a utility outage does occur.
  • Regularly test or monitor UPS batteries.
  • Implement data center infrastructure management (DCIM).