Google Reimburses Cloud Clients After Massive Google Compute Engine Outage

Google is reimbursing Google Compute Engine users up to 25 percent of their monthly charges after an outage that impacted instances across all regions on Monday.

Nicole Henderson, Contributor

April 14, 2016

Attendees at the Google Cloud Platform Next 2016 conference in San Francisco in March view a 360-degree virtual tour of a Google data center (Photo: Yevgeniy Sverdlik)

By Talkin' Cloud

The outage lasted 18 minutes and did not affect Google App Engine, Google Cloud Storage, or other Google Cloud Platform products. Eighteen minutes may not sound like much, but for a cloud service it is a significant disruption. And because the outage hit every region at once, clients could not fail over to another region to mitigate its impact.

According to a lengthy and apologetic postmortem published Wednesday on the Google Cloud Platform status page, the issue began when Google engineers removed an unused GCE IP block from the network configuration and instructed the company's systems to propagate the new configuration across the network.

“By itself, this sort of change was harmless and had been performed previously without incident. However, on this occasion our network configuration management software detected an inconsistency in the newly supplied configuration,” Google VP of engineering Benjamin Treynor Sloss said.

“In attempting to resolve this inconsistency the network management software is designed to ‘fail safe’ and revert to its current configuration rather than proceeding with the new configuration. However, in this instance a previously-unseen software bug was triggered, and instead of retaining the previous known good configuration, the management software instead removed all GCE IP blocks from the new configuration and began to push this new, incomplete configuration to the network.”
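The "fail safe" behavior described in the postmortem can be illustrated with a minimal sketch. This is not Google's software: the `NetworkConfig` type, the `is_consistent` check, the `choose_config` function, and the placeholder IP blocks are all hypothetical names invented for illustration. The sketch shows only the intended behavior, in which a configuration that fails validation is discarded and the current one is kept.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NetworkConfig:
    """Hypothetical stand-in for the set of IP blocks a network announces."""
    ip_blocks: frozenset


def is_consistent(candidate: NetworkConfig, required_blocks: frozenset) -> bool:
    """Illustrative check: every block still in use must remain announced."""
    return required_blocks <= candidate.ip_blocks


def choose_config(current: NetworkConfig,
                  candidate: NetworkConfig,
                  required_blocks: frozenset) -> NetworkConfig:
    """Fail safe: push the candidate only if it validates, else keep current."""
    if is_consistent(candidate, required_blocks):
        return candidate
    return current


# Placeholder blocks from the documentation address ranges (assumed values).
in_use = frozenset({"192.0.2.0/24", "198.51.100.0/24"})
current = NetworkConfig(in_use | {"203.0.113.0/24"})

# Removing only the unused block passes the check and is accepted.
harmless_change = NetworkConfig(in_use)
assert choose_config(current, harmless_change, in_use) is harmless_change

# A configuration with every block removed fails the check and is rejected.
empty_config = NetworkConfig(frozenset())
assert choose_config(current, empty_config, in_use) is current
```

In the incident Google describes, the bug sat in this fallback path itself: rather than retaining the known good configuration, the software pushed one with all GCE IP blocks removed.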

Google’s postmortem covers the specifics and the timeline of the outage in more detail.

According to Google, its engineering teams will be working over the next several weeks on a “broad array of prevention, detection and mitigation systems intended to add additional defense.”

“It is our intent to enumerate all the lessons we can learn from this event, and then to implement all of the changes which appear useful,” Treynor Sloss said, noting that “there are already 14 distinct engineering changes planned spanning prevention, detection and mitigation.”

Original article appeared at http://talkincloud.com/cloud-computing/google-reimburses-cloud-clients-after-massive-google-compute-engine-outage

About the Author

Nicole Henderson

Contributor, IT Pro Today

Nicole Henderson covers daily cloud news and features online for ITPro Today. Prior to ITPro Today, she was editor at Talkin' Cloud (now Channel Futures) and the WHIR. She has a bachelor of journalism from Ryerson University in Toronto.
