How to Make Big Data Agile Without Compromising Privacy?
There are a few critical processes that companies can employ to deliver more relevant applications faster while still adhering to the privacy processes required in a world of big data sets.
December 21, 2016
Nitin Donde is the Founder and CEO of Talena, Inc.
In just three years, more than 35 zettabytes of data are expected to be generated worldwide, roughly 44 times the volume created in 2009. Perhaps more surprising is that 80 percent of this data will be managed and protected by enterprises -- creating a natural dichotomy between the need for business agility and the need to ensure adequate consumer privacy. As today's digital world becomes an all-knowing one, how do we keep consumer data safe -- especially when any breakdown in privacy would be catastrophic for its keeper?
Why is this debate of even greater importance now? Because companies increasingly understand that data assets can and will drive significant market share or revenue gains. Witness IBM's recent acquisition of The Weather Company's digital assets and Morningstar's acquisition of PitchBook. Furthermore, these growing data assets rarely stay static in a single infrastructure environment. They are always on the move -- from a storage environment to an analytics cluster, or from an on-premises data center to the public cloud -- often in the name of greater business agility.
The combination of volume and velocity of these bigger data assets makes many privacy advocates nervous about the potential not just for data breaches (hello, Yahoo!) but also for data exposure to employees not authorized to view these assets. Yet there are steps that enterprises can take to overcome the agility-privacy divide and still emerge successfully in a world increasingly based on rapid processing of large data sets.
The common dictum that "no enterprise is an island" applies equally to big data infrastructure. Companies will use different big data systems for different business purposes -- collecting and storing data in Hadoop while running highly scalable analytics on HPE Vertica, for example. Since these data assets are often on the move, there is a need for transparency around the customer privacy contract. The bulk of privacy concerns center on the assurance that data is "secure" and won't be knowingly or unwittingly shared with any third party. But that assurance does not always address unintended internal access to these data sets.
Complicating matters in the United States is the fact that a broad array of laws and regulations applies to data privacy, unlike some countries that take a "one size fits all" approach. Depending on the data sets involved, a company could have many or few privacy regulations applied to its data.
There are three primary threats that can affect the continued safety and privacy of company data sets: a breach from the outside, insider theft, and unintentional internal exposure.
According to a PwC study, those companies that can successfully blend risk-agility (the idea of rapidly changing risk management infrastructure to adapt to changing market conditions) with risk-resiliency (the idea of withstanding business disruption with solid processes, tools and controls) are typically the "highest performers", showing high growth with appropriate privacy and risk controls. In the PwC study, this was just 36 percent of the survey respondents, indicating a significant opportunity for companies worldwide to reach the ideal privacy-growth mix.
There are a few critical processes that companies can employ to deliver more relevant applications faster while still adhering to the privacy processes required to work in this world of big data sets:
Ensure that your IT team understands the business context for these data assets so it can deploy the necessary products and implement the relevant processes to meet the company's privacy and security mandates while still supporting the business. There is no one-size-fits-all model, and each enterprise should have its own strategy.
Consider leveraging production data as part of your QA process to reduce application errors and hasten time to market. Application teams have always known that testing with production data (as opposed to leveraging synthetic data) leads to higher-quality applications and greater customer satisfaction.
Given the above, you need to understand how encryption and data masking fit into the overall application iteration process so that production data is not unwittingly seen by the engineering, QA, and other team members involved in your test data process (a minimal masking sketch follows this list).
Finally, ensure that you have appropriate user policies associated with not just the data stores themselves but the data management processes as well.
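To make the masking point concrete, here is a minimal sketch in Python of masking personally identifiable fields before a production extract is handed to QA. The field names, salt, and record layout are illustrative assumptions, not part of any particular product; a real deployment would apply the same idea inside whatever pipeline copies data from the production store to the test cluster.

```python
import hashlib

# Hypothetical schema: these field names are placeholders, not from the article.
PII_FIELDS = {"email", "ssn", "phone"}


def mask_value(value, salt="replace-with-a-secret-salt"):
    """Replace a sensitive value with a deterministic, irreversible token.

    Hashing (rather than substituting random values) preserves uniqueness and
    join behavior, so test results still resemble production behavior.
    """
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return digest[:16]


def mask_record(record):
    """Return a copy of a production record that is safe to hand to QA."""
    return {
        key: mask_value(str(value)) if key in PII_FIELDS else value
        for key, value in record.items()
    }


if __name__ == "__main__":
    prod_row = {"customer_id": 1042, "email": "jane@example.com",
                "ssn": "123-45-6789", "plan": "enterprise"}
    print(mask_record(prod_row))
```

The design choice worth noting is deterministic hashing: the same input always maps to the same token, so referential integrity across tables survives masking, while the original values remain unrecoverable to the engineering and QA teams.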
Conclusion
In a Capgemini survey, about 60 percent of companies indicated that the data they hold is a core component of their market value -- the recent spate of acquisitions is just another indicator of this data-centric trend. It is incumbent on all companies, as they acquire these data assets, to think in parallel about how they can remain agile yet provide the necessary safeguards so customers feel comfortable interacting and engaging with this new breed of company.
Opinions expressed in the article above do not necessarily reflect the opinions of Data Center Knowledge and Penton.