Insight and analysis on the data center space from industry thought leaders.
Overcoming Big Data’s Security Challenges with Strong Identity Management
As big data becomes more user-friendly, concerns arise around securing access to sensitive data sets and other areas of the network, writes Matthew McKenna of SSH Communications Security. These concerns must be addressed if organizations want to reap big data’s benefits without risking a data breach.
January 14, 2015
Matthew brings over 10 years of high technology sales, marketing and management experience to SSH Communications Security and is responsible for all revenue-generating operations.
Big data is no longer a pipe dream. Organizations across all industries are sifting actionable insights from network data that grows faster each day. Ninety percent of the world’s data has been produced in the last two years, and hidden in all that data are insights on user behavior and market trends that could not have been gained any other way. Even the White House has gotten in on the game, recently investing $200 million in big data research projects.
As big data becomes more user-friendly, concerns arise around securing access to sensitive data sets and other areas of the network. These concerns must be addressed if organizations want to reap big data’s benefits without risking a data breach.
Securing M2M Identities
To run big data analytics, large data sets are split up into more manageable portions and are then processed separately across a Hadoop cluster. They are then recombined to produce the desired analytics. The process is highly automated and involves a great deal of machine-to-machine (M2M) interaction across the cluster.
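The split, process and recombine flow described above can be illustrated with a small sketch. This is plain Python rather than Hadoop's own API: the function names, the sample records and the use of a thread pool to stand in for cluster nodes are all assumptions made for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split(records, chunk_size):
    """Split the data set into more manageable portions."""
    return [records[i:i + chunk_size] for i in range(0, len(records), chunk_size)]


def process(chunk):
    """Process one portion on its own: count events per user."""
    return Counter(user for user, _event in chunk)


def recombine(partials):
    """Merge the partial results into the final analytics."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def run_job(records, chunk_size=2, workers=4):
    chunks = split(records, chunk_size)
    # Worker threads stand in for the machines of a cluster (an assumption).
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(process, chunks))
    return recombine(partials)


# Hypothetical sample data: (user, event) pairs.
records = [
    ("alice", "login"),
    ("bob", "login"),
    ("alice", "purchase"),
    ("carol", "login"),
    ("alice", "logout"),
]
totals = run_job(records)
print(totals)
```

In a real cluster, every hand-off between the splitting, processing and recombining stages is a machine-to-machine interaction that has to be authorized, which is where the identity question arises.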
Hadoop infrastructure contains several levels of authorization: access to the Hadoop cluster, inter-cluster communications and cluster access to the data sources. Many of these authorizations are based on Secure Shell keys, used within Hadoop because they are considered secure and have good support for automated M2M communication.
It’s a critical priority to secure the identities that enable access into and across the big data environment. This creates a significant challenge for those seeking to use big data analytics platforms like Hadoop. Some of the issues are straightforward:
Who sets up the authorizations to run big data analytics?
What happens to these authorizations when the person who set them up leaves the organization?
Is the level of access provided by the authorizations based on “need to know” security principles?
Who has access to the authorizations?
How are these authorizations managed?
Big data is not the only technology dealing with such questions. They are becoming widespread across data centers as more and more business processes are automated. Over 80 percent of data center network communications are automated M2M transactions, with less than 20 percent associated with interactive user-to-machine (i.e., human) accounts. The emergence of big data as the next killer app raises the urgency of managing machine-based identities in a comprehensive way.
The Risk Curve of Inaction
High-profile data breaches involving the misuse of machine-based credentials underscore the real risk of ignoring M2M identities. While enterprises have made great progress in managing end-user identities, they have largely ignored the need to treat machine-based identities with the same level of care. The result is a widespread attack vector across the IT environment.
Bringing centralized identity and access management to (potentially) millions of machine-based identities means implementing change on a running system. Migrating an environment without disrupting the processes that depend on it is a complicated undertaking, so it’s small wonder enterprises have been hesitant to take it on.
Poor Key Management
The current state of key management is often abysmal. To manage the authentication keys used to secure M2M communications, many system administrators use spreadsheets or write homegrown scripts for controlling distribution, monitoring and taking inventory of deployed keys. This approach allows many keys to fall through the cracks. There might not be regular scanning in place either, allowing unauthorized back doors to be added without the organization knowing.
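As a rough illustration of what regular scanning could look like, the sketch below parses the text of an OpenSSH-style authorized_keys file and reports any key whose fingerprint is missing from an approved inventory. The parser is deliberately simplified (it does not handle quoted options that contain spaces), the list of key types is not exhaustive, and the key blobs, comments and inventory are placeholders invented for the example.

```python
import base64
import hashlib

# A few common public key types; this set is an assumption, not a full list.
KEY_TYPES = {
    "ssh-rsa",
    "ssh-ed25519",
    "ecdsa-sha2-nistp256",
    "ecdsa-sha2-nistp384",
    "ecdsa-sha2-nistp521",
}


def fingerprint(blob_b64):
    """SHA-256 fingerprint of a base64 key blob, in OpenSSH's display style."""
    digest = hashlib.sha256(base64.b64decode(blob_b64)).digest()
    return "SHA256:" + base64.b64encode(digest).decode().rstrip("=")


def parse_authorized_keys(text):
    """Return one entry per key line; skips blank lines and comments."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        tokens = line.split()
        # The key type may be preceded by options, so search for it.
        for i, token in enumerate(tokens[:-1]):
            if token in KEY_TYPES:
                entries.append({
                    "type": token,
                    "fingerprint": fingerprint(tokens[i + 1]),
                    "comment": " ".join(tokens[i + 2:]),
                })
                break
    return entries


def find_unapproved(scanned, inventory):
    """Keys found on a host that the central inventory does not know about."""
    return [entry for entry in scanned if entry["fingerprint"] not in inventory]


def fake_blob(label):
    """Placeholder key material for the demo; not a real public key."""
    return base64.b64encode(label.encode()).decode()


sample = "\n".join([
    "# nightly backup job",
    f"ssh-ed25519 {fake_blob('backup-key')} backup@node1",
    f'command="/usr/bin/report" ssh-rsa {fake_blob("unknown-key")} added-by-hand',
])

scanned = parse_authorized_keys(sample)
inventory = {fingerprint(fake_blob("backup-key"))}
unapproved = find_unapproved(scanned, inventory)
for entry in unapproved:
    print("not in inventory:", entry["type"], entry["fingerprint"], entry["comment"])
```

Run on a schedule across every host, a check of this kind would surface keys that a spreadsheet-based inventory never recorded.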
Lacking centralized control of keys undermines efforts to stay compliant. The financial industry, for example, is bound by regulations requiring strict control over who has access to sensitive data. The recently strengthened PCI standards demand that any entity that accepts payment cards (banks, retailers, restaurants and healthcare providers alike) do likewise. Since these industries are currently making swift and decisive moves to bolster their big data strategies to capitalize on the wave of user-driven data, they are increasingly vulnerable to compliance failures and regulatory sanctions.
Security Steps
Organizations must acknowledge and confront these risks. These steps are best practices to get them started off on the right foot:
IT staff rarely have visibility into where identities are stored, what information those identities are permitted to access and what business processes they’re supporting. Therefore, the first step is passive, non-invasive discovery.
The environment must be monitored to determine which identities are being actively used and which are not. Fortunately, in many enterprises, unused—and therefore unneeded—identities often comprise the vast majority. Once these unused identities are located and removed, the scope of the overall effort is significantly reduced.
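One way to picture this monitoring step is to compare the discovered inventory against the last time each identity was seen in use. The sketch below assumes such last-use data is available from logs; the 90-day observation window, the identity names and the dates are all hypothetical.

```python
from datetime import datetime, timedelta


def find_unused(inventory, last_seen, now, window=timedelta(days=90)):
    """Identities never observed, or not observed within the window."""
    cutoff = now - window
    unused = []
    for identity in sorted(inventory):
        seen = last_seen.get(identity)
        if seen is None or seen < cutoff:
            unused.append(identity)
    return unused


# Hypothetical discovery results and usage observations.
now = datetime(2015, 1, 14)
inventory = {"key-etl", "key-backup", "key-old-admin", "key-test"}
last_seen = {
    "key-etl": datetime(2015, 1, 13),
    "key-backup": datetime(2014, 12, 20),
    "key-old-admin": datetime(2013, 6, 1),
}

unused = find_unused(inventory, last_seen, now)
print(unused)
```

The right window length depends on the business processes involved: a job that runs once a quarter or once a year would look unused under a short window, so candidates for removal are worth confirming with their owners first.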
The next step is centralized control over adding, changing and removing machine identities. This enables policy-based governance over how the identities are used, ensures no more unmanaged identities can be added and provides verifiable proof of compliance.
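A minimal sketch of such a control point follows, assuming a single in-memory registry that every add and remove must pass through. The policy rules (an owner and purpose are required, and only certain privilege levels are allowed), the class names and the sample identities are illustrative assumptions, not a description of any particular product.

```python
from dataclasses import dataclass, field


class PolicyError(Exception):
    """Raised when a requested change violates policy."""


@dataclass(frozen=True)
class Identity:
    fingerprint: str
    owner: str
    purpose: str
    privilege: str  # for example "read", "write" or "admin"


@dataclass
class Registry:
    allowed_privileges: frozenset = frozenset({"read", "write"})
    identities: dict = field(default_factory=dict)
    audit_log: list = field(default_factory=list)

    def _check(self, identity):
        if not identity.owner or not identity.purpose:
            raise PolicyError("every identity needs an owner and a purpose")
        if identity.privilege not in self.allowed_privileges:
            raise PolicyError(f"privilege {identity.privilege!r} is not allowed")

    def add(self, identity, actor):
        # Policy is enforced before anything is stored.
        self._check(identity)
        self.identities[identity.fingerprint] = identity
        self.audit_log.append(("add", identity.fingerprint, actor))

    def remove(self, fingerprint, actor):
        del self.identities[fingerprint]
        self.audit_log.append(("remove", fingerprint, actor))


registry = Registry()
registry.add(Identity("SHA256:aaa", "data-eng", "nightly ETL", "read"), actor="jdoe")

try:
    registry.add(Identity("SHA256:bbb", "", "unknown", "admin"), actor="jdoe")
    rejected = False
except PolicyError:
    rejected = True

print(rejected, registry.audit_log)
```

Because every change is recorded with the actor who made it, the audit log doubles as the kind of evidence a compliance review would ask for.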
With visibility and control established, identities that are needed but violate policy can be updated without disrupting ongoing business processes. Under central management, the privilege level assigned to each such identity can be remediated.
A Secure Strategy
Big data is here to stay, along with fresh risks in data access control. M2M identity management is essential, but traditional manual IAM practices are inefficient and downright risky. Taking a complete inventory of all keys and following other best practices will save time and money while improving security and compliance. Because big data has increased access to sensitive information, organizations must take proactive measures to roll out a comprehensive and consistent identity and access management strategy.
Industry Perspectives is a content channel at Data Center Knowledge highlighting thought leadership in the data center arena. See our guidelines and submission process for information on participating. View previously published Industry Perspectives in our Knowledge Library.