Storage Disaggregation in the Data Center
With the availability of fast interconnect technologies, disaggregation of storage from the compute servers considerably reduces the total cost of investment in the data center, improves the efficiency of storage utilization, adds resiliency to the storage stack, and allows for pay-as-you-grow planning for the future of the data center, writes Gilad Shainer of Mellanox Technologies.
October 18, 2013
Gilad Shainer is Vice President of Marketing at Mellanox Technologies, a leading supplier of high-performance, end-to-end interconnect solutions for data center servers and storage systems.
The massive explosion of data has been well-documented. Estimates show that the amount of available digital data worldwide grew from 200 exabytes in 2006 to nearly 2 zettabytes in 2011 (a 10X increase) [1], and predict that it will exceed 8 zettabytes by 2015 [2]. Nearly 70 percent of companies with over 500 employees claim to manage over 100 terabytes of data, and nearly 40 percent manage over a petabyte [3].
Even these astounding numbers pale in comparison to the tremendous amounts of data that some of the world’s leading corporations are producing and managing, thanks to the advent of cloud computing, Web 2.0, high performance computing and Big Data. The quantities of data are so staggering that traditional data centers are becoming obsolete unless huge investments are made to upgrade them into flexible, scalable solutions.
The Importance of Data Analysis
Beyond the issue of scalability, various other challenges arise for these companies, including organizing, backing up and recovering such massive quantities of data. But perhaps the most important of these challenges is how to analyze and correlate the data to improve business decisions and increase profits. Data analysis is vital to a company’s efforts to model customer behavior, improve production, inform sales and marketing decisions and combat negative impressions and fraudulent activities.
Sorting through the data to identify patterns and trends so as to act on that information is crucial to the bottom line. Those who are successful in analyzing data are likely to be ahead of the curve in their decision-making.
The Evolution of the Data Center
As the quantities of data have surged and as companies have attempted to accommodate this vital need to analyze data in such large quantities, data centers have been forced to evolve to address the ever-changing requirements. In addition to constantly adjusting the data center storage capacity to handle the vast amount of data, the data center architecture has had to adapt to demands for faster and more robust data analysis.
To understand this evolution, it is useful to trace the history of the data center from its most basic roots to today’s immense facilities, and to note the technological rationale for changes along the way.
The earliest version of a data center was nothing more than a mainframe computer that contained a CPU, a memory cache and storage – all in one container. The concept of the network had yet to develop, so all the functions of the data center were contained in one central location.
Once the network was introduced, it became commonplace to separate the storage components from the compute components of the network. This had the advantage of allowing increased, dedicated storage that could be better utilized than if it were to be bundled with the CPU.
Figure 1: Original Data Center Architecture
However, the proliferation of data that has been building for the past decade and the corresponding demand for data analysis changed the makeup of the typical data center once again. The existing interconnect technologies were far too slow to process such large quantities of data and return relevant analytical results in real time, or even reasonably quickly. Most requests for data analysis took weeks to fulfill, and by then it was too late to capitalize on the information.
To address the poor interconnect performance, data center solutions began offering storage bundled into the compute servers. By reducing the distance between the compute and the storage to near-zero, companies gained the ability to access data immediately, enabling much faster analysis and enhancing their business decision-making abilities.
Figure 2: Storage Aggregated with Compute
However, penalties were paid for the move to an aggregated data center. The new servers offered less flexibility, higher cost and more wasted storage space than their disaggregated predecessors. When solid-state drive (SSD) storage became the storage technology of choice, offering even faster performance between compute and storage, maintaining the aggregated data center became even more expensive.
Figure 3: Solid-State Drive Storage Aggregated with Compute
Fast Interconnect Enables Disaggregation
Thanks to fast interconnect technologies such as InfiniBand, RDMA and RoCE (RDMA over Converged Ethernet), it has become possible to send and receive data at speeds as high as 56Gb/s, with 100Gb/s soon to come. Moreover, latency is extremely low: less than 1 microsecond, even at such high interconnect speeds.
The improvement in interconnect has allowed the disaggregated data center to once again become a reality. It is now possible to move storage away from the compute with no penalty to performance. Data analysis is still possible in near real time because the interconnect between the storage and the compute is fast enough to support such demands.
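To see why the performance penalty is negligible, consider a rough back-of-envelope comparison. The short Python sketch below estimates how much a 56Gb/s fabric with sub-microsecond latency adds to a typical flash read; the SSD read latency used here is an illustrative assumption (on the order of 100 microseconds for a random read), not a figure from this article.

```python
# Back-of-envelope check of the "zero penalty" idea: how much time does a
# sub-microsecond fabric add to a flash read served from a remote storage node?

LINK_GBPS = 56           # interconnect speed cited above (56Gb/s)
FABRIC_LATENCY_US = 1.0  # end-to-end fabric latency, per the article (< 1 microsecond)
FLASH_READ_US = 100.0    # assumed SSD random-read latency, illustrative only
BLOCK_BYTES = 4096       # common storage I/O block size

def remote_read_overhead_us(block_bytes: int) -> float:
    """Time added by the network hop: wire serialization plus fabric latency."""
    serialization_us = (block_bytes * 8) / (LINK_GBPS * 1e9) * 1e6
    return serialization_us + FABRIC_LATENCY_US

overhead = remote_read_overhead_us(BLOCK_BYTES)
total = FLASH_READ_US + overhead
print(f"Network overhead per 4 KB read: {overhead:.2f} us")
print(f"Remote vs. local flash read:    {total:.2f} us vs. {FLASH_READ_US:.2f} us "
      f"({overhead / FLASH_READ_US:.1%} added)")
```

Under these assumptions, the network hop adds only a couple of microseconds per 4 KB read, a small fraction of the time the flash media itself needs, which is why remote storage can feel local.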
Benefits of Disaggregation
The advantages of disaggregating the storage are significant. By separating storage from the compute, IT managers now have the flexibility to upgrade, replace, or add individual resources instead of entire systems. This also allows IT managers to plan better for future growth, adding storage only when necessary, which then provides the benefit of better utilization of available storage space and budget control.
Furthermore, the type of storage that is used can now be tailored to the data therein. While it makes sense to use SSD storage for data that must be instantaneously at your fingertips, there is also plenty of data in storage that is infrequently accessed, for which SSD storage is unnecessarily expensive and underutilized. For such backend data, which requires lots of capacity but less speed, it makes sense to return to the slower but far less expensive SAS or SATA storage.
Figure 4: High Speed Interconnect Enables Storage Disaggregation with Zero Penalty to Performance
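The tiering approach described above can be expressed as a simple placement policy: keep frequently accessed data on SSD and demote cold, capacity-heavy data to less expensive SAS or SATA drives. The Python sketch below illustrates the idea; the tier names, threshold, and example objects are hypothetical, for illustration only.

```python
# A minimal sketch of access-frequency-based tiering: hot data on flash,
# cold bulk data on cheaper spinning disk. All names and thresholds are
# illustrative assumptions, not part of any particular product.

from dataclasses import dataclass

SSD_TIER = "ssd"
HDD_TIER = "sas_sata"
HOT_ACCESS_THRESHOLD = 10  # accesses per day considered "hot" (illustrative)

@dataclass
class StorageObject:
    name: str
    size_gb: float
    accesses_per_day: float

def choose_tier(obj: StorageObject) -> str:
    """Place hot objects on flash; send infrequently accessed data to capacity disks."""
    return SSD_TIER if obj.accesses_per_day >= HOT_ACCESS_THRESHOLD else HDD_TIER

objects = [
    StorageObject("orders-db-index", size_gb=50, accesses_per_day=5000),
    StorageObject("2009-backup-archive", size_gb=2000, accesses_per_day=0.1),
]
for obj in objects:
    print(f"{obj.name}: {choose_tier(obj)}")
```

Because the storage sits behind a fast fabric rather than inside each server, a policy like this can move data between tiers without touching the compute nodes at all.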
With the availability of fast interconnect technologies, disaggregation of storage from the compute servers considerably reduces the total cost of investment in the data center, improves the efficiency of storage utilization, adds resiliency to the storage stack, and allows for pay-as-you-grow planning for the future of the data center.
Author's Note: I would like to thank Brian Klaff and Michael Kagan for their contributions throughout the development of this article.
Works Cited
[1] IDC, “IDC Worldwide Big Data Technology and Services 2012–2015 Forecast,” #233485, March 2012.
[2] CenturyLink, “Big Data: Defining the Digital Deluge.”
[3] Channel Insider, “Data Explosion Makes Backup, Recovery a Challenge.”