Failure Is Not an Option: Lessons Learned from Disasters
Data Center World: Data center managers can draw both warnings and inspiration from studying disasters outside of their industry
October 21, 2014
ORLANDO, Fla. - As a passenger boards an airplane, she notices a crack in the fuselage near the door, but doesn't mention it to the flight crew.
Sound improbable? It happened in April 1988 prior to the takeoff of Aloha Airlines flight 243. Minutes later, a section of the top of the airplane tore off at a height of 24,000 feet. A flight attendant was blown out of the plane and killed, and dozens were injured. Miraculously, the flight crew was able to land the plane.
Why wouldn't someone report an obvious threat to safety? Adrian Porter posed the question to a room full of data center professionals Monday at the Data Center World conference. Porter, the senior manager for data management at 1-800-CONTACTS, spoke about real-world disasters and the lessons they hold for disaster recovery.
The consensus: The woman who spotted the crack in the fuselage assumed the airline must already know about it, and therefore it must be okay to fly.
That's not so hard to imagine, Porter said, when you consider the number of warnings and alerts that IT professionals receive. Most don't rise to the level of the Aloha incident. But there are plenty of examples of warning signs going unheeded - like the fact that the Aloha jet, designed for 34,000 takeoffs, had already logged 89,000 flights, leading to metal fatigue.
"Silence is not golden," said Porter. "If you see something, don't trust that someone else is taking care of it."
Overcommitment is dangerous
On other occasions, stakeholders become so emotionally and financially committed to a goal that they find it difficult to heed warning signs. As an example, Porter cited the 1996 expeditions to Mount Everest, in which a sudden blizzard at high elevation led to the deaths of eight climbers.
The climbers included teams from two expedition companies whose clients paid $65,000 for the Everest experience, and spent two months getting acclimated to the altitude. When a break in the weather offered an opportunity and the summit loomed, it became hard to turn back, even after the 2 p.m. deadline by which teams should begin their descent to find safety before nightfall.
"It's human nature to push forward," said Porter. "But sometimes the safest thing is to roll back. Stick to the script. Rollbacks are better than catastrophes."
Sometimes disasters provide examples of grace under pressure. Porter cited the example of Apollo 13, which was crippled by an explosion that forced three astronauts to power down their command module and use the attached lunar lander as a "lifeboat" to stay alive. The story, which is well known from the film starring Tom Hanks, ends with their safe return to Earth - but only after teams of engineers at Mission Control improvised procedures to preserve critical functions of the damaged spacecraft.
Porter quoted NASA Flight Director Gene Kranz, who counseled his engineering team to focus and "work the problem" until a solution was found. "Failure," Kranz noted, "is not an option."