Why Google Wants to Rethink Data Center Storage

Taking a closer look at the fundamental redesign of storage Google is proposing, from the system level down to the physical disk itself

Bill Kleyman, CEO and Co-Founder

May 2, 2016

People stand in the lobby of Google’s Washington, DC, headquarters in January 2015. (Photo by Mark Wilson/Getty Images)

Growth forecasts for data center storage capacity show no signs of slowdown. Cisco expects that by 2019, 55 percent of internet users (2 billion) will use personal cloud storage -- up from 42 percent in 2014. By 2019, a single user will generate 1.6 gigabytes of consumer cloud storage traffic per month -- up from 992 megabytes per month in 2014. Finally, data created by devices that make up the Internet of Things, which Cisco calls the "Internet of Everything," will reach 507.5 zettabytes per year by 2019 -- up from 134.5 ZB per year in 2014.

Needless to say, that's a lot of data, which will require a lot of storage, and Google is proposing a fundamental change to the way engineers think about and design data center storage systems, a rethink that reaches all the way down to the way the physical disks themselves are designed.

Cloud Needs Different Disks

At the 2016 USENIX conference on File and Storage Technologies (FAST 2016), Eric Brewer, Google's VP of infrastructure, said the company wanted to work with industry and academia to develop new lines of disks that are a better fit for data centers supporting cloud-based storage services. He argued that the rise of cloud-based storage means that most (spinning) hard disks will be deployed primarily as part of large storage services housed in data centers. Such services are already the fastest-growing market for disks and will be the majority market in the near future.

Read more: Intel: World Will Switch to Scale Data Centers by 2025

He used Google's subsidiary YouTube as an example. In a recent paper on Disks for Data Centers, Google researchers pointed out that users upload over 400 hours of video every minute, which at one gigabyte per hour requires adding 1 petabyte (that's 1 million gigabytes) of data center storage capacity every day.
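The arithmetic behind that figure is easy to check. Below is a quick back-of-the-envelope calculation in Python using the numbers above; the redundancy factor is an illustrative assumption, since replication and transcoding overhead aren't broken out in the figures cited.

```python
# Back-of-the-envelope check of the YouTube figure above. Upload rate and
# GB-per-hour come from the article; the redundancy factor is an assumption
# standing in for replication and multiple transcoded formats.
upload_hours_per_minute = 400   # hours of video uploaded every minute
gb_per_hour = 1                 # ~1 GB of storage per hour of video

raw_gb_per_day = upload_hours_per_minute * 60 * 24 * gb_per_hour
print(f"Raw uploads: {raw_gb_per_day / 1e3:,.0f} TB/day")        # ~576 TB/day

redundancy_factor = 2           # assumed: replicas plus extra encodings
print(f"With redundancy: {raw_gb_per_day * redundancy_factor / 1e6:.1f} PB/day")
```

Raw uploads alone work out to roughly 0.6 PB per day; once copies and alternate encodings are stored, the added capacity clears the 1-PB-per-day mark Google cites.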

This is a tough reality to face for an industry that's so dependent on this one fundamental technology. The current generation of disks, often called “nearline enterprise” disks, is not optimized for this new use case; it is designed around the needs of traditional servers. Google believes it is time to develop a new kind of disk designed specifically for large-scale data centers and cloud services.

Google isn't the only one looking for new answers to the problem of relentless growth in demand for storage capacity. Researchers from Microsoft and the University of Washington recently published a paper that advocates for further exploration of using DNA strands to encode and store data and proposes major improvements to such encoding systems that have been explored thus far.

Let's take a step back and look at storage from Google’s perspective. First of all, the company says you should stop looking at individual disks (and even arrays) as standalone technologies. Rather, it’s time to focus on the “collection.”

The key changes Google proposes fall in three broad categories:

  1. The “collection view,” in which we focus on aggregate properties of a large collection of disks

  2. A focus on tail latency derived from the use of storage for live services

  3. Variations in security requirements that stem from storing others’ data

Taking the Collection View

The collection view implies higher-level maintenance of bits, including background check-summing to detect latent errors, data rebalancing for more even use of disks (including new disks), as well as data replication and reconstruction. Modern disks do variations of these internally, which is partially redundant, and a single disk by itself cannot always do them as well. At the same time, the disk contains extensive knowledge about the low-level details, which generally favors new APIs that enable better cooperation between the disk and higher-level systems.
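To make the collection view concrete, here is a minimal sketch of what background check-summing across replicas might look like. The block size, the use of SHA-256, and the replica layout are illustrative assumptions, not details of Google's storage stack.

```python
# Minimal sketch of collection-level background check-summing ("scrubbing").
# Block size, hash choice, and replica layout are illustrative assumptions.
import hashlib
from pathlib import Path

BLOCK_SIZE = 4 * 1024 * 1024  # read files in 4 MiB chunks (assumed)

def checksum_file(path: Path) -> str:
    """Stream a file from disk and return its SHA-256 digest."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        while chunk := f.read(BLOCK_SIZE):
            digest.update(chunk)
    return digest.hexdigest()

def scrub(replicas: dict[str, Path]) -> list[str]:
    """Compare checksums across the replicas of one object.

    Returns the disks whose copy disagrees with the majority and therefore
    needs reconstruction from the healthy replicas.
    """
    digests = {disk: checksum_file(path) for disk, path in replicas.items()}
    values = list(digests.values())
    reference = max(set(values), key=values.count)  # majority vote
    return [disk for disk, d in digests.items() if d != reference]
```

The point of doing this at the collection level rather than inside each drive is that the system can compare replicas across disks and trigger reconstruction, something a single disk cannot do on its own.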

Another aspect of the collection view is that Google optimizes for the overall balance of IOPS and capacity, using a carefully chosen mix of drives that changes over time. New disks are selected so that their marginal IOPS and added capacity bring the collection closer to its overall goals. Workload changes, such as better use of SSDs or RAM, can shift the aggregate targets.
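That selection rule can be sketched in a few lines: of the available models, buy the drive whose marginal IOPS and capacity move the aggregate closest to the collection's target ratio. The drive specs and targets below are hypothetical.

```python
# Sketch of the drive-mix rule described above: pick the next drive so that
# its marginal IOPS and capacity move the aggregate IOPS-per-TB closest to
# the collection's target. All specs and targets here are hypothetical.
from dataclasses import dataclass

@dataclass
class Drive:
    name: str
    capacity_tb: float
    iops: float

CANDIDATES = [
    Drive("high-capacity", capacity_tb=10.0, iops=120),
    Drive("performance", capacity_tb=4.0, iops=200),
]

def pick_next_drive(current_tb: float, current_iops: float,
                    target_iops_per_tb: float) -> Drive:
    """Return the candidate that brings the aggregate closest to the target."""
    def distance(d: Drive) -> float:
        new_ratio = (current_iops + d.iops) / (current_tb + d.capacity_tb)
        return abs(new_ratio - target_iops_per_tb)
    return min(CANDIDATES, key=distance)

# A collection at 50,000 TB and 800,000 IOPS (16 IOPS/TB) aiming for
# 20 IOPS/TB should favor the higher-IOPS model.
print(pick_next_drive(50_000, 800_000, 20).name)  # -> "performance"
```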

Why Not SSDs?

But why are we talking so much about spinning disks rather than SSDs, which are much faster and whose cost has been coming down?

Arguably, SSDs deliver better IOPS and may very well be the future of storage technologies. But according to Google, the cost per GB remains too high. More importantly, growth rates in capacity per dollar between disks and SSDs are relatively close (at least for SSDs that have sufficient numbers of program-erase cycles to use in data centers), so cost will not change enough over the coming decade. Google does make extensive use of SSDs, but it uses them primarily for high-performance workloads and caching, which helps disk storage by shifting seeks to SSDs.
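The economics behind that split can be illustrated with rough numbers. The per-gigabyte prices and IOPS figures below are ballpark assumptions for 2016-era hardware, not figures quoted by Google; they only show why disks still win on cost per gigabyte while SSDs win on cost per I/O.

```python
# Rough illustration of the disk-versus-SSD cost argument. Prices and IOPS
# are ballpark 2016-era assumptions, not figures from Google's paper.
devices = {
    "nearline HDD": {"usd_per_gb": 0.04, "iops": 150},
    "datacenter SSD": {"usd_per_gb": 0.30, "iops": 50_000},
}

capacity_gb = 1_000_000  # price out 1 PB of bulk storage
for name, d in devices.items():
    cost_per_pb = d["usd_per_gb"] * capacity_gb
    cost_per_iops = d["usd_per_gb"] * 1_000 / d["iops"]  # per 1 TB of device
    print(f"{name}: ${cost_per_pb:,.0f} per PB, ${cost_per_iops:.3f} per IOPS")
```

On these assumed numbers, the SSD costs several times more per gigabyte but is far cheaper per I/O, which is why the two get paired: SSDs absorb the seeks, disks hold the bytes.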

Redesign the Disk

Now things get even more interesting. Google is essentially calling on the industry to come to the round table to create a new standard for disk design.

As the company points out, the current 3.5” HDD geometry inherited its size from the PC floppy disk. An alternative form factor should yield a better total cost of ownership. Changing the form factor is a long-term process that requires a broad discussion, but Google believes it should be considered. Although it could spec its own form factor (with high volume), the underlying issues extend beyond Google’s designs, and developing new solutions together with the industry will better serve everybody, especially once a standard is achieved. And that’s one of the key points here: standardization.

There is a range of possible secondary optimizations as well, some of which may be significant. These include system-level optimizations for thermals, vibration, automated and robotic handling, helium, and weight.

What’s Next for "Legacy" Data Center Storage?

Yes, cloud-based storage continues to grow at amazing speed. Yes, we’re seeing even more adoption of new end-point technologies, IoT, and virtualization. All of these are creating more demand around storage and data optimization.

But before you get flustered and start looking at the storage alternatives of the future, you have to understand how big an undertaking Google is proposing. It is suggesting a redefinition of the modern, standardized disk architecture, an architecture that has been around for decades.

In 1956, IBM shipped the first hard drive in the RAMAC 305 system. It held 5MB of data at $10,000 a megabyte. The system was as big as two refrigerators and used 50 24-inch platters. In 1980, Seagate released the first 5.25-inch hard disk. Then, in 1983, Rodime released the first 3.5-inch hard drive; the RO352 included two platters and stored 10MB.

In its paper, Google discusses physical changes, such as taller drives and grouping of disks, as well as a range of shorter-term, firmware-only changes. Its goals include higher capacity, more I/O operations per second, and a better overall total cost of ownership. But even with Google's size and market sway, how feasible is this?

We’re talking about creating a new storage standard for every business, data center, and ecosystem that relies on disk-based storage. Google believes this will be the new era of disks for data center storage.

It seems like a huge lift. But maybe it’s really time to take a step back and look over a technology that’s decades old. Maybe it’s time to develop a storage environment capable of meeting the demands of cloud-scale ecosystems. Either way, this is no easy task and will require the support of the industry to make adoption a reality.


About the Author

Bill Kleyman

CEO and Co-Founder, Apolo

Bill Kleyman has more than 15 years of experience in enterprise technology. He also enjoys writing, blogging, and educating colleagues about tech. His published and referenced work can be found on Data Center Knowledge, AFCOM, ITPro Today, InformationWeek, Network Computing, TechTarget, Dark Reading, Forbes, CBS Interactive, Slashdot, and more.
