Streaming Telemetry: Unleashing Big Data's Power in Network Management
Streaming telemetry is the first frontier in enabling Big Data-ready networks, followed by operational analytics which will consume this data to derive actions and recommendations.
June 13, 2018
Dr. Danish Rafique is Senior Innovation Manager at ADVA.
What differentiates a top-tier cloud service provider from a traditional network operator? In one word: data. It may sound counterintuitive at first, since it's the telecom operators who sit on much larger data lakes, so what gives? While applications drive cloud service providers' core business success, it's the ready access to, smart consumption of and intelligent processing of data from the underlying data center infrastructure, be it a physical or virtual resource, that sets them apart. With current business challenges and growing traffic requirements, the boundaries between the two network segments are starting to blur. On one hand, traditional operators are aiming to run service-centric, app-driven businesses on top of their platforms; on the other, cloud data center infrastructures are expanding into metro and core connectivity solutions.
With increasingly complex end-to-end network operations and services, augmented by various digitalization initiatives, legacy network data monitoring needs to evolve beyond proprietary and inflexible interfaces. It's rather interesting to note that, while technologies like software-defined networking (SDN) and network functions virtualization (NFV) have found their feet in the networking industry, the data monitoring ecosystem around them has stuck to the old ways of hardware-centric solutions, essentially limiting the use and purpose of dynamic and agile control and management. SNMP, for instance, the most widely used monitoring protocol, polls devices at fixed intervals and relies on traps to trigger events, typically supplemented by scraping data through proprietary CLIs. This approach no longer scales to today's abstracted and dynamic networks, as it lacks real-time visibility into network behavior and performance. In order to harness big networking data, network monitoring needs to step up.
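To make the fixed-interval blind spot concrete, here is a minimal polling sketch in Python. The snmp_get() helper is a hypothetical stand-in for a real SNMP GET call (e.g., via a library such as pysnmp), and the 30-second cadence is an illustrative assumption, not a value from the article.

```python
import random
import time

POLL_INTERVAL_S = 30  # assumed fixed polling cadence

def snmp_get(host: str, oid: str) -> int:
    """Hypothetical stand-in for a real SNMP GET; here it merely
    simulates a growing counter so the sketch runs on its own."""
    return int(time.time()) + random.randint(0, 5)

def poll_loop(host: str, oid: str, polls: int = 3) -> None:
    last = snmp_get(host, oid)
    for _ in range(polls):
        time.sleep(POLL_INTERVAL_S)      # blind spot: any event that rises
        current = snmp_get(host, oid)    # and clears inside this window is
        if current != last:              # never observed by the collector
            print(f"{oid} changed: {last} -> {current}")
        last = current

poll_loop("router1", "1.3.6.1.2.1.2.2.1.14.1")  # ifInErrors, ifIndex 1
```

Whatever happens between two sleeps is invisible, and shortening the interval only trades blindness for polling load; that trade-off is exactly what the push model below removes.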
Enter network streaming telemetry (NEST). Network telemetry is not a novel concept per se; it has been widely adopted for data center operations over the past decade. What's different is the scale, the model-driven monitoring, and the application area, i.e., diverse geographically distributed resource pools with dynamic control and management requirements. NEST moves beyond the traditional process of vendor-specific monitoring, analysis and dashboard widget display to real-time, open visibility without the blind spots. These blind spots include events that occur between polling intervals, fixed probes that cannot follow dynamic resources, and so on. Here's a look at the core features of the NEST stack:
Push: NEST allows devices to push data to higher layers based on subscription models. This ensures timely and highly granular access to network states, statistics and other equipment-related information (see the subscription sketch after this list).
Model-driven: Typical open API discussions focus on transport protocols. What's equally important is how the transported data is structured. NEST promotes YANG-based models for abstraction and simplification.
Big data enabler: NEST incorporates various encoding techniques (e.g., XML and Google Protocol Buffers (GPB)) to enable software communication through standard interfaces. This opens the data up to consumption by other tools, such as operational analytics platforms.
Coverage: NEST targets large network infrastructures with big data monitoring capabilities and requirements. A fundamental aspect of this goal is removing monitoring silos. NEST achieves this by synchronizing data from physical and virtual monitoring probes across the network.
Real-time: With its subscription-based data access and programmability, NEST enables real-time, SDN-driven and policy-based monitoring.
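To illustrate the push and model-driven principles above, here is a minimal subscription sketch using gNMI, one popular open transport for model-driven streaming telemetry; the article does not mandate a specific protocol, so the choice is an assumption. The sketch presumes Python stubs generated from the public gnmi.proto (gnmi_pb2, gnmi_pb2_grpc), an illustrative device address, and an OpenConfig-style interface counter path.

```python
import grpc  # pip install grpcio
# Stubs assumed to be generated from the public gnmi.proto, e.g.:
#   python -m grpc_tools.protoc -I. --python_out=. --grpc_python_out=. gnmi.proto
import gnmi_pb2
import gnmi_pb2_grpc

# YANG-modeled path: /interfaces/interface[name=eth0]/state/counters
path = gnmi_pb2.Path(elem=[
    gnmi_pb2.PathElem(name="interfaces"),
    gnmi_pb2.PathElem(name="interface", key={"name": "eth0"}),
    gnmi_pb2.PathElem(name="state"),
    gnmi_pb2.PathElem(name="counters"),
])

# STREAM subscription, sampled once per second (interval is in nanoseconds)
request = gnmi_pb2.SubscribeRequest(subscribe=gnmi_pb2.SubscriptionList(
    subscription=[gnmi_pb2.Subscription(
        path=path,
        mode=gnmi_pb2.SAMPLE,
        sample_interval=1_000_000_000,
    )],
    mode=gnmi_pb2.SubscriptionList.STREAM,
))

channel = grpc.insecure_channel("router1:57400")  # illustrative target
stub = gnmi_pb2_grpc.gNMIStub(channel)

# The device now pushes updates on its own; there is no polling loop.
for response in stub.Subscribe(iter([request])):
    for update in response.update.update:
        print(update.path, update.val)
```

The key contrast with the earlier polling sketch is that the device, not the collector, decides when data leaves the box, so nothing falls between samples and the cadence can be tuned per subscription.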
The question remains: how can a network operator leverage such an advanced monitoring framework, and why exactly? In short, what are the key use cases driving this technological innovation? Let's consider a couple of networking application scenarios to address this point:
Fault management is a fundamental aspect of network operations, involving fault detection, diagnosis and eventual action recommendations. A typical fault detection-to-resolution cycle takes days to weeks. Streaming telemetry would enable machine-learning-based real-time data analytics, improving these time scales by at least an order of magnitude (a minimal detection sketch follows these scenarios).
Capacity planning is typically carried out offline based on highly engineered design rules. The downside is the lack of real-time optimization opportunities, leading to resource underutilization. With a streaming view of network states and traffic behavior, real-time physical and virtual resource optimization would enable substantial throughput benefits, on top of OPEX savings.
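As a minimal sketch of the kind of real-time analysis these scenarios call for, the following flags anomalies in a stream of counter samples using a running z-score. The window size, threshold and sample values are illustrative assumptions; a production system would apply far richer machine-learning models to the telemetry feed.

```python
from collections import deque
from statistics import mean, stdev
from typing import Iterable

WINDOW = 60       # samples kept for the baseline (assumption)
THRESHOLD = 3.0   # flag samples more than 3 sigma from the mean

def detect_anomalies(samples: Iterable[float]) -> None:
    """Flag streamed counter samples that deviate sharply from the
    recent baseline -- a simple stand-in for richer ML analytics."""
    window: deque = deque(maxlen=WINDOW)
    for value in samples:
        if len(window) >= 2:
            mu, sigma = mean(window), stdev(window)
            if sigma > 0 and abs(value - mu) / sigma > THRESHOLD:
                print(f"anomaly: {value:.1f} (baseline {mu:.1f} +/- {sigma:.1f})")
        window.append(value)

# Usage: feed it per-interval error-counter deltas from a telemetry stream
detect_anomalies([0, 1, 0, 2, 1, 0, 1, 0, 0, 50, 1, 0])
```

Because the data arrives as a stream rather than a daily export, a spike like the 50 above is flagged within one sample interval instead of surfacing in the next offline report.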
Once the architectural aspects and applications are clear, the implementation discussions evolve in several directions: Which specific models should be used? Which encoding is best? Are there any standardized frameworks? While most of the industry supports YANG models as the basis for data representation, higher-layer interface choices may be driven by solution providers' strategies and underlying use cases. So long as they follow the industry-standard stack, they will be aligned with the vision of NEST-enabled intelligent control and management.
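To ground the encoding question, this short sketch renders the same YANG-modeled datum, an illustrative OpenConfig-style interface counter, in two textual encodings using only the Python standard library; GPB would carry the identical structure in compact binary form, at the cost of requiring a compiled schema.

```python
import json
import xml.etree.ElementTree as ET

# The same YANG-modeled datum in two common telemetry encodings
datum = {"interface": "eth0", "in-octets": 1234567}

# JSON encoding (widely used alongside XML and GPB)
print(json.dumps({"counters": datum}))

# XML encoding
root = ET.Element("counters")
for key, value in datum.items():
    ET.SubElement(root, key).text = str(value)
print(ET.tostring(root, encoding="unicode"))
```

The model fixes what the data means; the encoding only fixes how it travels, which is why the industry can converge on YANG while leaving the wire format to each solution provider.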
There is clearly an appetite among network operators, telecom and cloud alike, for real-time network insights driven by advanced machine learning tools. The trend is fundamentally driven by multi-domain and multi-layer intelligent orchestration of diverse networking infrastructure. Streaming telemetry is the first frontier in enabling such big-data-ready networks, followed by operational analytics, which will consume this data to derive actions and recommendations. The consequent programmatic changes must be verifiable via closed-loop interactions of telemetry, big data analytics, and the management and control plane (sketched below). Challenges remain, however, around data overload and security, and care must be taken to select a telemetry architecture that securely caters to the rapidly evolving networking stack, together with the right tools for data management.
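Finally, a minimal sketch of that closed loop; all four helpers are hypothetical placeholders for the telemetry feed, analytics engine, and management and control plane, with simulated values so the sketch is self-contained.

```python
import random
from typing import Optional

def collect() -> float:
    """Hypothetical telemetry feed: current link utilization (0..1)."""
    return random.random()

def analyze(utilization: float) -> Optional[str]:
    """Hypothetical analytics step: recommend an action over a threshold."""
    return "reroute" if utilization > 0.8 else None

def apply_change(action: str) -> None:
    """Hypothetical management/control-plane hook."""
    print(f"applying: {action}")

def verify() -> bool:
    """Re-read telemetry to confirm the change took effect (closes the loop)."""
    return collect() < 0.8

# One iteration: telemetry -> analytics -> control -> verification
action = analyze(collect())
if action:
    apply_change(action)
    print("verified" if verify() else "rollback needed")
```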