For my master’s thesis I evaluated whether AI and metaheuristic routing algorithms implemented in Software Defined Networks can improve the overall performance of computer networks. Curious whether OSPF and BGP are getting shelved soon? Find out here!
The internet might be a daily commodity for you and me, but for a large amount of people every year, it’s a completely new experience. About 4.3 Billion people are connected to the internet in 2019, with about 360 Million getting newly introduced to it every year.
While that in itself is great news, it raises new challenges on how to manage the network traffic generated. The growth of the internet is also reflected by the number of data centres being built worldwide.
Looking at the data, I wanted to evaluate whether there is a way to better utilize existing networks and servers by making them more efficient. Maybe we could use more intelligent routing algorithms to improve the performance of our networks?
Software Defined Networks
Software Defined Networks (SDN) is a new paradigm on how to do networking. In traditional computer networks, your network is defined by the hardware used and the cabling that connects your machines. Routers and switches are configured individually and conduct their decisions on what to do with network packets by themselves. It’s like a distributed web of little machines, each one thinking for itself.
The main feature of Software Defined Networks is something called the SDN controller, a piece of software that directs the network devices in the system. For example, an SDN controller can install forwarding rules into routers, gather statistics about the traffic and update hardware configurations on the fly. So instead of having dozens or even hundreds of routers thinking for themselves, one central software is thinking and managing them all.
Not only is this great for managing the complexity of the network, it also efficiently allows us to program behaviour directly into our network. This fine degree of control over the behaviour of computer networks is still new and gives us the ability to do more complex operations than ever before.
Nowadays, routing is pretty simple—you just take the shortest path. Standard algorithms like OSPF and BGP all work with the same underlying principles. Shortest Path Routing is a simple policy that works great in large networks like the internet.
Since only a tiny part of the network is under your direct control, it’s impossible to tell which path is the best, so you might as well take the shortest one. While this is a good choice for the general internet, what if the whole network is completely controlled by your organization?
The underlying hypothesis behind this research project is the following: “if we had more information about our network, we could do more informed routing decisions“. Using the functionality of Software Defined Networks, we can monitor the full state of the network, such as current bandwidth rates and latencies on each link. This additional information and the possibility to program complex behaviour into networks opened a new field of research: experimental routing algorithms.
Reinforcement Learning Router
The Reinforcement Learning Router makes its decisions via experience values. In my implementation of the algorithm, the network state gets reduced to a few core values: a matrix of low, medium or high latency and bandwidth for each network link.
On a new routing request, the routing algorithm will try to either explore new possibilities and pick a random route through the network or exploit known values and pick the route with the highest expected performance. Routes are chosen from a set of candidate routes generated for each routing calculation. This set contains reasonable routes, so only routes that are not overly complicated or inefficient. After the routing algorithm makes a decision, a reward function rates the quality of the decision based on the overall network performance and updates the known experience values.
Ant Colony Optimization Router
The Ant Colony Optimization Router works by interacting directly with the network. When a new routing request arrives at this algorithm, it will send artificial ant probes through the network to reach the destination. This works by sending multiple generations of ants through the network that communicate with future generations via pheromones—just like in ecology. The ant probes will either randomly walk through the network to find a new path between the start and the destination of the route or follow an already established path.
After each generation the path that took the least amount of time strengthens its pheromone trail, increasing the likelihood that future ant generations will follow it. This ensures that established “good“ paths are strengthened, while still trying to explore other, possibly better, alternatives. After all generations are finished, the route with the strongest pheromones chosen wins!
After implementing these algorithms and a Shortest Path Routing algorithm within an SDN controller, I conducted a benchmarking study in simulated networks. I used Mininet to emulate Data Centre Networks and Wide Area Networks and simulated realistic network traffic by replaying pcap packet capture files. The experimental algorithms and a Shortest Path Routing algorithm then run through a series of experiments on different network topologies and traffic intensities. While the SDN controller regularly recalculates routes and tries to adapt to the traffic situation in the network, it also uses the built-in mechanisms of Software Defined Networks to raise statistics about the network.
As it turns out, Shortest Path Routing is not yet dethroned as the leading routing algorithm.
Evaluating the results of my benchmark, Shortest Path Routing consistently has the lowest latencies and achieves the highest average end-to-end bandwidth
However, it’s not a flawless victory for Shortest Path Routing. Ant Colony Optimization achieves about 10% more bandwidth on very high traffic intensities and Shortest Path Routing has severe issues in handling traffic spikes, as it can’t react dynamically to increasing traffic.
However, I sadly can not recommend any of the experimental algorithms. Ant Colony Optimization suffers from extremely high calculation times for routes. As soon as the network starts to get congested, the ant probes can not reach their destination in a timely manner anymore. The few metrics in which it excels compared to Shortest Path Routing is, in my opinion, not worth the tradeoff.
The Reinforcement Learning Router was my big hope for this experiment, however, it just didn’t perform well enough on any of the chosen metrics.
Another factor is simply the ease of implementation. Shortest Path Routing is implemented in just a few lines, while the alternative algorithms require complex logic that needs to be maintained. In my opinion, this is a case where the simplest option is still the best.
While my results indicate that Shortest Path is still the most performant algorithm in my experiments, it also shows that some room for improvements exists. I think it’s only a matter of time until a more intelligent routing algorithm can fully utilize the state of the network to conduct better routing decisions, but for now let’s take a step back and let the tried and tested algorithms do their work.