Rebooting Deduplication in Your Next-Generation Data Center
As data centers become more dynamic and difficult to manage, and as data workflows become less linear, companies face a new reality, writes Casey Burns of Quantum: their deduplication strategies need a reboot.
October 6, 2014
Casey Burns, product marketing manager at Quantum, has extensive experience in the storage industry with a professional focus on data deduplication, virtualization, and cloud.
The data center is evolving as IT departments grapple with virtualization, cloud, and the shift from traditional data protection to emerging policy-based approaches to storage. These trends are converging to make a complex mess of data management, despite their promise to streamline operations, scale IT resources and lead businesses to smarter, data-driven decision making.
As data centers have become more dynamic and difficult to manage, and as data workflows have become less linear, companies are facing a new reality: their deduplication strategies need a reboot. That includes rethinking what to expect from a purpose-built backup appliance.
Trends impacting deduplication and the new data center
A number of key business trends are driving the need to reboot deduplication, but three stand out. First, businesses are increasingly tied to a variety of cloud architectures and need to back up to these new environments effectively. Second, IT departments must now cope with even greater volumes of unstructured data—and the associated metadata—and need to scale backup resources accordingly, while separating data that is suitable for deduplication from data that is not. Third, effectively prioritizing historical data for agile storage and retrieval has become increasingly critical to daily operations.
Together, these trends are leading organizations to depend on purpose-built storage to handle their complex, industry-specific business challenges, including the workflows that govern the movement of their data. Some may take deduplication for granted, but it is what makes capabilities like disaster recovery viable for even modest implementations. The value of deduplication will continue to grow in next-generation data centers, and the solutions offering the lowest OPEX will have the greatest chance of success.
Deduplication workflow considerations
There are a number of considerations in determining how deduplication should fit into an organization's modern data center and workflows, and there is no silver-bullet technology to rein in data center complexity. The type of data, its content, and the frequency of access required all need to be evaluated to find the best deduplication solution.
Virtual machines (VMs), for example, force backup applications to operate within more dynamic, virtualized workflows that many of them are ill-equipped to handle. This data type must be managed differently from traditional data.
Handling unstructured data growth is also becoming a vital part of any data protection strategy, whether it’s archiving video content for future re-monetization, offloading static data from primary storage, or building a private cloud infrastructure. Due to the scale and access requirements of storing this data, traditional backup simply won’t work.
Tiered storage technology, paired with deduplication, can help organizations align the value of their data with appropriate storage costs by applying the right technology at the right point in time. Taking this a step further, many organizations responding to data center complexity are now turning to a proactive data management model based on tiered storage, with backup tiers and active archive tiers built around smart data movement that fits their unique workflows, as the sketch below illustrates.
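To make the tiering model concrete, here is a minimal Python sketch of an age-based placement policy. The tier names, the age thresholds, and the choice to key off last-access time are illustrative assumptions, not any particular product's behavior; a production policy engine would also weigh content type and retrieval SLAs, and would actually move or stub files rather than just reporting a placement.

```python
import os
import time

# Hypothetical thresholds in days since last access; tune to your workflow.
POLICY = [
    (30, "primary"),   # accessed in the last month: stay on fast primary disk
    (180, "backup"),   # cooling off: move to the deduplicated backup tier
]                      # anything older falls through to the active archive

def choose_tier(path: str) -> str:
    """Pick a tier from days since last access (illustrative policy only)."""
    age_days = (time.time() - os.stat(path).st_atime) / 86400
    for max_age, tier in POLICY:
        if age_days <= max_age:
            return tier
    return "archive"

def sweep(root: str) -> None:
    """Dry run: walk a tree and report where each file would land."""
    for dirpath, _, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            print(f"{choose_tier(path):8s} {path}")

if __name__ == "__main__":
    sweep(".")
```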
Five key qualities to look for in deduplication
With these considerations in mind, here are the five qualities to look for in a modern deduplication solution.
Purpose-Built Backup Appliances: Next-generation data centers have a level of complexity that demands deduplication appliances purpose-built for the task. Designed to work with a full range of backup applications, they are typically easy to install, offer the highest performance, and serve as a disk target for backup servers. Gartner predicts that by 2018, 50 percent of applications with high change rates will be backed up directly to deduplication target appliances, bypassing the backup server, up from 10 percent today.*
Variable-Length vs. Fixed-Block Deduplication: Deduplication is not a pedestrian exercise. Software solutions have typically adopted a fixed-block approach because it is the least compute-intensive, but it generally doesn't provide the maximum amount of data reduction. Variable-length deduplication is resource-intensive, but it minimizes disk consumption as data grows and delivers the most efficient data reduction available, providing maximum disk storage density. Organizations that are a better fit for variable-length deduplication than for a fixed-block approach include companies experiencing fast data growth, remote offices, and virtualized environments. Variable-length deduplication can also cut network traffic, which is key for replication and disaster recovery. (A short sketch after this list shows why the two approaches diverge.)
Scalability: Pay-as-you-grow scalability provides simple, predictable and easy-to-install storage capacity. This allows users to increase capacity by simply entering a license key, with no other on-site installation needed. Physical and virtual appliances are now available that scale from 1TB to over 500TB using a capacity-on-demand approach, allowing customers to add capacity with minimal or no downtime. The benefit of this approach is that users can avoid overprovisioning their storage and purchasing more capacity than they need.
Monitoring and Reporting: Deduplication is hard to do well without the right management tools. Proactive monitoring and reporting of deduplication functions are often overlooked, but they enable precise business decision making and help speed resolution time. Ideally, a data center manager should be able to monitor backups from a mobile device, and advanced reporting capabilities can free IT staff to focus on more strategic projects rather than managing backups. (A simple monitoring sketch also follows this list.)
Security: Organizations are increasingly scrutinizing the security of their data at every step of the workflow in an effort to eliminate vulnerabilities before they are exploited. Hardware-based, military-grade 256-bit AES encryption of data at rest and in motion provides security without sacrificing performance; be aware that software-based encryption approaches can often incur a performance penalty. (The final sketch below illustrates the software path.)
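On the variable-length point, here is a minimal Python sketch of why content-defined chunking deduplicates better than fixed blocks when data shifts. The rolling hash is a toy stand-in for a production Rabin fingerprint, and the window, mask, and chunk-size bounds are arbitrary illustrative choices; the point is only that after a few bytes are inserted at the front of a repeated stream, fixed boundaries all shift and stop matching, while content-defined boundaries realign.

```python
import hashlib
import random

def fixed_chunks(data: bytes, size: int = 4096):
    """Fixed-block chunking: cheap, but boundaries shift after any insertion."""
    return [data[i:i + size] for i in range(0, len(data), size)]

def variable_chunks(data: bytes, window: int = 48, mask: int = 0x0FFF,
                    min_size: int = 1024, max_size: int = 16384):
    """Content-defined chunking with a toy rolling hash (illustrative only)."""
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        h = (h + byte) & 0xFFFFFFFF
        if i - start >= window:
            h = (h - data[i - window]) & 0xFFFFFFFF  # slide the window
        size = i - start + 1
        # Cut where the content says to, within min/max chunk bounds.
        if (size >= min_size and (h & mask) == 0) or size >= max_size:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])
    return chunks

def dedup_ratio(chunks):
    """Logical bytes divided by unique (stored) bytes."""
    unique = {hashlib.sha256(c).digest(): len(c) for c in chunks}
    return sum(len(c) for c in chunks) / sum(unique.values())

random.seed(0)
base = bytes(random.randrange(256) for _ in range(1 << 16))
stream = base + b"NEWDATA" + base   # same data, re-sent with a 7-byte insert

print(f"fixed:    {dedup_ratio(fixed_chunks(stream)):.2f}:1")
print(f"variable: {dedup_ratio(variable_chunks(stream)):.2f}:1")
```

Running this, the fixed-block pass finds essentially no duplicates in the shifted copy, while the variable-length pass realigns and stores roughly half the bytes.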
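On the monitoring point, the kind of proactive check that pays off can be as simple as the sketch below. The stats dictionary and the alert thresholds are hypothetical stand-ins for whatever reporting interface a given appliance actually exposes; the arithmetic is the real content.

```python
# Hypothetical stats snapshot; a real appliance would expose similar
# figures through its own REST API or SNMP, not this dictionary.
stats = {
    "ingested_bytes": 48_000_000_000_000,   # logical data protected
    "stored_bytes":    3_200_000_000_000,   # physical disk consumed
    "capacity_bytes": 10_000_000_000_000,   # licensed capacity
}

dedup_ratio = stats["ingested_bytes"] / stats["stored_bytes"]
used_pct = 100 * stats["stored_bytes"] / stats["capacity_bytes"]

print(f"dedup ratio: {dedup_ratio:.1f}:1, capacity used: {used_pct:.0f}%")

# Illustrative alert thresholds; tune to your own environment.
if dedup_ratio < 5.0:
    print("ALERT: low dedup ratio -- check for encrypted/compressed sources")
if used_pct > 80:
    print("ALERT: capacity above 80% -- plan a license-key capacity upgrade")
```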
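Finally, to show what the software-based path looks like (and where its CPU cost is incurred), here is a minimal sketch of encrypting a backup block with 256-bit AES-GCM using Python's third-party cryptography package. The key handling is deliberately simplified and the data is illustrative; a hardware-based appliance performs this same transformation without taxing the host CPU.

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM  # pip install cryptography

key = AESGCM.generate_key(bit_length=256)   # 256-bit key, as discussed above
aesgcm = AESGCM(key)

backup_block = b"deduplicated block headed to disk or a replication target"
nonce = os.urandom(12)                      # unique per encryption, never reused
ciphertext = aesgcm.encrypt(nonce, backup_block, None)

# Decrypt on restore; GCM also authenticates, so tampering raises an error.
assert aesgcm.decrypt(nonce, ciphertext, None) == backup_block
```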
Ready to reboot?
The trends are clear: as data center volume and complexity increase, it's critical to have not just deduplication, but smart deduplication that fits within an evolving data center and its workflows.
*Gartner, "Magic Quadrant for Deduplication Backup Target Appliances," July 31, 2014.