How AI will lead to self-healing mobile networks
Today we are routinely awed by the promise of machine learning (ML) and artificial intelligence (AI). Our phones speak to us and our favorite apps can ID our friends and family in our photographs. We didn’t get here overnight, of course. Enhancements to the network itself – deep, convolutional neural networks executing advanced computer science techniques – brought us to this point.
Now one of the primary beneficiaries of our super-connected world will be the very networks we have come to rely on for information, communication, commerce, and entertainment. Much has been written about the “networked society,” but on this transformative journey, the network itself is becoming a full-fledged, contributing member of that society.
AI and ML will propel networks through four stages of evolution, from today’s self-healing networks to learning networks to data-aware networks to self-driving networks.
Stage I: Self-healing networks – “I know what happened”
Today’s networks are in Stage I – a real-time feedback loop of network status monitoring and near real-time optimizations to fix problems or improve performance. The sensory systems and the network optimizations are based on human-made rules and heuristics using simple descriptive analytics. For instance, if signal A goes above threshold B for C seconds, initiate action X.
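To make the "signal A above threshold B for C seconds" pattern concrete, here is a minimal sketch of such a hand-written rule. The class name, the utilization signal, and the numbers are illustrative assumptions, not taken from any real network management product.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Tuple


@dataclass
class ThresholdRule:
    """Hypothetical Stage I rule: fire `action` when the signal stays
    above `threshold` for at least `hold_seconds`."""
    threshold: float              # "B" in the text
    hold_seconds: float           # "C" in the text
    action: Callable[[], None]    # "X" in the text

    def run(self, samples: Iterable[Tuple[float, float]]) -> int:
        """Consume (timestamp_seconds, value) pairs in time order.

        Returns how many times the action fired. The action fires at most
        once per continuous breach and re-arms when the signal recovers.
        """
        breach_start = None
        fired = False
        fire_count = 0
        for timestamp, value in samples:
            if value > self.threshold:
                if breach_start is None:
                    breach_start = timestamp
                if not fired and timestamp - breach_start >= self.hold_seconds:
                    self.action()
                    fired = True
                    fire_count += 1
            else:
                # Signal recovered: reset the timer and re-arm the rule.
                breach_start = None
                fired = False
        return fire_count


# Illustrative data: cell utilization sampled once per second.
log = []
rule = ThresholdRule(threshold=0.8, hold_seconds=3,
                     action=lambda: log.append("rebalance"))
samples = [(0, 0.5), (1, 0.9), (2, 0.95), (3, 0.92), (4, 0.91), (5, 0.4), (6, 0.9)]
fired = rule.run(samples)
```

Every number in the rule is fixed by a person ahead of time, and nothing in it changes when traffic patterns do.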
These rules are typically easy to interpret but are inferior to modern, data-driven alternatives because they are hard-coded, cannot adapt to changing environments, and lack the complexity to deal effectively with a wide range of possible situations. In fact, these rules are limited by the inability of the human mind, even an experienced and intelligent mind, to find all the meaningful correlations affecting network KPIs among a massive data set of influencing factors. They also don’t allow the humans responsible for network performance to anticipate trouble, making “real-time” the limiting factor to an optimally performing network.
Stage II: Learning networks – “What will happen?”
Timing is everything. Stage II networks will continuously find patterns in past network data and use them to predict future behavior. ML can be directed to analyze factors thought to be impactful, like time/day, network events, or one-time or recurring external events or factors (e.g. an election, a natural disaster, or a trend on YouTube).
The value in the data lies in probabilistic correlations between past network performance and manual solutions that provide future optimizations. ML can capture as many correlations as model complexity allows, with data scientists and domain experts working together to best separate signal from noise, calibrating and testing ML models before they are put into production. ML models can reveal an exhaustive distribution of network KPIs and a dizzying array of external influencing factors, and then expose the subtlest of correlative relationships for the sake of predicting future outcomes.
These predictions give human overseers advance warning, and with it time to redistribute network resources and perform other optimizations, leading to enhanced performance at lower cost. For example, a network ‘autopilot’ could detect the slightest predicted deviations from the optimal path and issue warnings to human operators long before actual problems emerge. Continuously collecting data and comparing predictions against reality will enhance accuracy, leading to better next-gen models.
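As a rough illustration of the predict, warn, and compare loop, the sketch below fits a straight line to past load, estimates how many hours remain before a capacity limit is crossed, and scores the model against what actually happened. The load figures, the capacity, and the function names are assumptions made up for this example; a production model would be far richer than a single line.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(model, x):
    slope, intercept = model
    return slope * x + intercept


def hours_until_breach(model, capacity, now, horizon):
    """Return the first hour offset at which predicted load exceeds
    capacity, or None if no breach is predicted within the horizon."""
    for offset in range(1, horizon + 1):
        if predict(model, now + offset) > capacity:
            return offset
    return None


def mean_absolute_error(model, xs, ys):
    """Compare predictions against reality once the real values arrive."""
    return sum(abs(predict(model, x) - y) for x, y in zip(xs, ys)) / len(xs)


# Hypothetical hourly load (percent of capacity) for hours 0..5.
hours = [0, 1, 2, 3, 4, 5]
load = [40, 44, 48, 52, 56, 60]
model = fit_line(hours, load)

warning = hours_until_breach(model, capacity=75, now=5, horizon=12)
if warning is not None:
    print(f"Predicted capacity breach in {warning} hours")

# Later, actual measurements for hours 6..8 come in and the model is scored.
error = mean_absolute_error(model, [6, 7, 8], [65, 67, 73])
```

The error measured against real outcomes is what would feed the next overnight retraining run.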
ML methods of note for Stage II include linear and non-linear supervised methods, tree-based ensembles, neural networks, and batch learning (e.g., retrain overnight). In Stage II, predictive assistance means more time for human operators to effect change, and the result is a breakthrough in network performance. Machines make predictions, and humans find solutions, with time to spare.
Stage III: Data-aware networks – “What should I do?”
The student becomes the master. By Stage III, AI algorithms review past performance and, independent of human direction, identify previously undiscovered correlative factors that affect future performance. They do so by looking beyond network data and initial guidance into external data sets such as generated and simulated data.
Machines use knowledge obtained from supervised methods and apply that knowledge to unsupervised methods, revealing undiscovered correlative factors without human intervention or guidance.
A Stage III network provides predictions of multiple possible futures and creates forecasts allowing management to predict potential business outcomes based on their own theoretical actions. For example, the network could let human managers select from a set of possible future outcomes (highest-possible performance during the Super Bowl, or lowest-possible power usage during holiday hours). Thus begins the era of strategic network optimization, with the network not only predicting a single future, but offering multiverse futures to its human colleagues. ML methods for Stage III include deep learning, simulation techniques, and other advanced computer science techniques like bandits, advanced statistics, model governance, and automatic model selection.
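One way to picture the "choose your future" idea is a table of forecast scenarios that managers rank by the outcome they care about. The sketch below assumes the forecasts already exist; the scenario names and all figures are invented for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Scenario:
    """One hypothetical forecast: a candidate action and its predicted outcome."""
    name: str
    predicted_throughput_mbps: float
    predicted_power_kw: float


def best_for(scenarios: List[Scenario], objective: str) -> Scenario:
    """Pick the scenario that best serves the chosen business objective."""
    if objective == "max_performance":
        return max(scenarios, key=lambda s: s.predicted_throughput_mbps)
    if objective == "min_power":
        return min(scenarios, key=lambda s: s.predicted_power_kw)
    raise ValueError(f"unknown objective: {objective}")


# Made-up forecasts for three candidate configurations.
forecasts = [
    Scenario("all carriers on", 950.0, 42.0),
    Scenario("balanced", 700.0, 30.0),
    Scenario("sleep mode on idle cells", 400.0, 18.0),
]

big_game = best_for(forecasts, "max_performance")   # e.g. a major sporting event
holiday = best_for(forecasts, "min_power")          # e.g. low-traffic holiday hours
```

The selection step is trivial; the hard part, which this sketch takes as given, is producing trustworthy forecasts for each candidate action.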
While highly capable, a Stage III network is still not technically “intelligent.” That grand jump towards the Singularity occurs in Stage IV.
Stage IV: Self-driving networks – “Just do it.”
I reason, therefore I am. A Stage IV network can (1) independently identify and prioritize factors of interest that impact network performance, (2) accurately predict multiverse outcomes in time for optimally executed human-effected remedies, and, most importantly, (3) distinguish between those factors that are causal vs. correlative to gain deeper insights and drive better decisions.
The distinction between causal and correlative is itself based on probabilistic analysis as seen in research. The ability of AI to establish causality is the ability to understand the root causes of network performance as opposed to the correlative signs of those causes. The ability to identify causal factors will lead to more accurate predictions and an even better-performing network. At this stage, the network gains the ability to reason about cause and effect – and the truly intelligent network is born.
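A small simulation shows why the distinction matters. In this toy setup, which is an assumption for illustration and not a model of any real network, traffic load drives both the alarm rate and latency. The two move together in observed data, yet setting the alarm rate independently of load leaves latency unchanged, which reveals that alarms are a sign of the problem and not its cause.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def alarm_latency_correlation(n, intervene, seed=0):
    """Simulate n samples and return corr(alarm rate, latency).

    Hypothetical causal structure: load -> alarms and load -> latency.
    With intervene=True, alarms are set at random, cutting the
    load -> alarms link the way a controlled experiment would.
    """
    rng = random.Random(seed)
    alarms, latency = [], []
    for _ in range(n):
        load = rng.gauss(0, 1)                       # hidden common cause
        if intervene:
            alarm = rng.gauss(0, 1)                  # set independently of load
        else:
            alarm = load + rng.gauss(0, 0.5)         # alarms follow load
        alarms.append(alarm)
        latency.append(load + rng.gauss(0, 0.5))     # latency follows load only
    return pearson(alarms, latency)


observed = alarm_latency_correlation(5000, intervene=False)   # strongly positive
intervened = alarm_latency_correlation(5000, intervene=True)  # close to zero
```

A purely correlative model would happily use alarms to predict latency, but only a causal view indicates that suppressing alarms would do nothing to reduce it.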
A Stage IV network can autonomously choose a course of action to maximize operational efficiency in the face of external influences. It can improve security against new incoming threats and more generally operate to maximize a given set of KPIs. The system is adaptive to real-time changes and continuously learns and improves in a data-driven context. ML methods of note for Stage IV include deep learning, reinforcement learning, online learning, dynamic systems, and other advanced computer science techniques.
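For a flavor of how a system can choose actions on its own and keep learning as it goes, here is a minimal sketch of an epsilon-greedy bandit, one of the simplest members of the reinforcement learning family, with value estimates updated online after every step. The three "configurations" and their success probabilities are invented; a real self-driving network would face a far larger and changing action space.

```python
import random


def epsilon_greedy(reward_fns, steps, epsilon, seed=0):
    """Run an epsilon-greedy bandit.

    reward_fns: one callable per action; each takes a random.Random and
                returns a numeric reward (a stand-in for a KPI).
    Returns (estimates, counts): the learned average reward per action
    and how often each action was chosen.
    """
    rng = random.Random(seed)
    n_actions = len(reward_fns)
    counts = [0] * n_actions
    estimates = [0.0] * n_actions
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)   # explore: try any action
        else:
            # Exploit: pick the action with the best estimate so far.
            action = max(range(n_actions), key=lambda i: estimates[i])
        reward = reward_fns[action](rng)
        counts[action] += 1
        # Online update: incremental running mean, no stored history needed.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates, counts


def bernoulli(p):
    """Hypothetical KPI: reward 1.0 with probability p, otherwise 0.0."""
    return lambda rng: 1.0 if rng.random() < p else 0.0


# Three made-up configurations with unknown (to the learner) success rates.
configs = [bernoulli(0.2), bernoulli(0.5), bernoulli(0.8)]
estimates, counts = epsilon_greedy(configs, steps=2000, epsilon=0.1)
preferred = max(range(len(configs)), key=lambda i: estimates[i])
```

The small amount of continued exploration is what lets the learner notice if a previously poor configuration becomes the better choice.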
Network, heal thyself
The notion of applying remedies locally before globally is apropos in the case of AI and ML. While the world will no doubt benefit greatly from the democratization and mobilization of its ever-expanding mountain of data, it is the network and the networked society that stand to benefit the most, soonest, from our journey towards the truly intelligent machine.
Diomedes Kastanis is VP, Chief Innovation Office, at Ericsson, supporting advancement of the company’s technology vision and innovation.