Machine Learning of When to ‘Love your Neighbour’ in Communication Networks
May 9, 2019
Peer-to-peer networks have many advantages over routed communication networks. By gossiping messages between its neighbours, it is possible for a node to pass a message to any other node in only a handful of steps. These communications occur without needing any centralised coordination, which makes them resilient to attack since the multiple paths between the different participants allow communications to be maintained even if some nodes malfunction or are corrupted. The flexibility and robustness of peer-to-peer networks are one of the key technologies that led to the development of the internet and are also integral to blockchain protocols.
A potential weakness of decentralised peer-to-peer networks is that their flexibility can lead to fragile or inefficient network topologies, particularly if they are subjected to attack by an adversary who can manipulate connections. This is particularly relevant to blockchain protocols where blocking or delaying messages could be highly profitable for the attacker. To resolve these issues, at Fetch.AI we have developed a networking protocol that uses machine learning algorithms to optimise the connections between blockchain nodes.
This is a difficult problem as the entire network is dynamic and delays between nodes may vary greatly. The algorithm must function in a decentralised way and respond to a changing environment quickly. It must also serve the dual purposes of optimising network traffic during normal operation and preventing attacks such as partitioning (cf. Fig. 1) or an eclipse attack.
In this post, we study ways of increasing the network resistance by equipping nodes with a module that allows them to keep track of the quality of their connections. By evaluating the performance of their neighbours and by building a Bayesian trust model of the network, each node is able to:
- prioritise connections to nodes that are providing them with high-quality data (correct blocks, unseen transactions) and the best connection (high availability, low latency), thereby improving the overall speed of message propagation in the network.
- rapidly drop the connections to a suspected node if malicious behaviour is detected. This leads to malicious or spamming nodes being quickly isolated, eliminating their ability to attack or influence the network.
A lesson from chess
In the simplest form, the problem can be framed as a ranking task: how to choose an optimal set of neighbours without the need to test all possible configurations. The problem can be illustrated as follows. Consider, that you know three people, Alice, Bob and Cecil. You know the outcome of past interaction between “Alice and Bob” and between “Bob and Cecil”. How, based on that history, can we predict the outcome of an interaction between Bob and Cecil?
This is similar to a problem that arises frequently in competitive sports such as chess where we can observe the results of pairwise matches between players, and we wish to use these to estimate an ordering or ranking of players that reflect their chess-playing ability at the current time. To answer this question, in 1959, Arpad Elo developed a ranking system that modelled probability of a given outcome (win, lose or draw) in a given game as a function of a player’s rating r_i and performance p_i. The rating is a continuously valued variable that represents the player’s current ability, while the performance is modelled as a random number drawn from a normal distribution centred around the player’s rating value, namely
with some fixed variance. It then assumes that a player i wins against player j if his performance is higher, namely, p_i > p_j, where the probability of such an outcome is given by
where y is the outcome of the game (+1 if player i won, 0 for a draw and -1 if player j won). The parameter regulates how large is the update (or how important the recent experience is compared to the previously accumulated knowledge). Note also, that the positive and negative updates are equal, thus the sum of the players’ ranting is constant.
By choosing a reference value of, for example, 1000 for a new player, this method that can be used to estimate the ability of players and the likely outcome of a match between two players — even if they have never played against each other before.
Full Bayesian update
The ELO system (and its extensions) have been widely adopted in various games. However, it has several features that make it unsuitable for the network trust problem. The ELO system converges relatively slowly, as it assumes the same variance for every player. Additionally, in order to calculate a rating for different players, a global view of the system is needed, which is information that is not accessible to participants of the peer-to-peer network. This led us to introduce the following modifications:
- We perform a full Gaussian update, that modifies both the mean and variance of the distribution. This enables the algorithm to converges quicker. Intuitively, the algorithm works in such a way, that an unexpected outcome results in a larger update to the mean value. This is consistent with our common-sense: if a friend was nice to you, you don’t need to change your opinion of him or her, so a positive interaction would lead to a relatively small update. However, if the same person cheated you, you have a reason to reevaluate your prior opinion about him or her. In this case, the update to your believes how “nice your friend is” will be larger.
- In order to adapt our algorithm to the decentralised setting, each participant creates a rating of their peers and compares their interactions with a set of archetypes (“a peer that always has new data”, “a peer that never has new data”, “a peer that always responds”, “a peer that never responds to a query”, etc.). These archetypes help to establish the nodes’ expectations of their peers’ behaviour. A peer that often has new, previously unseen messages becomes closer to the “good” archetype, while a peer that rarely has information that is new or interesting to share has a decreased rating.
- Each new peer is initially assumed to have a relatively broad prior rating distribution centred close to the upper end of the scale. This can be interpreted as a node having a positive opinion of newcomers but with a high degree of uncertainty. After each interaction with a peer, nodes update their belief in the peer’s honesty (or more generally, its ‘performance’).
- If the performance of a neighbour decrease, with a certain probability, a node can replace that connection with a new node.
Finally, since the performance of different nodes may change over time (for example a previously honest node might have been bribed and started to act maliciously), a time-dependent drift is applied to each variance. This serves two purposes; first, the uncertainty in the rating of nodes that have not recently been peers is gradually increased. Second, the drift prevents the uncertainty being reduced below a certain minimal level.
We tested our approach in a simplified environment. Figure 5 presents beliefs: the solid line denotes the mean value and the shaded region represents the three-sigma uncertainty of the performance of two nodes. During the first 100 interactions, the first node (blue) shared interesting (unseen) data, while the second node (red) returned only invalid (or to put it another way, useless) data. During the second 100 interactions, the roles of the red and the blue node were swapped. Note, that the evolution of the beliefs is not symmetric. A node that was believed to be honest (blue) loses his reputation after just a few invalid messages. While the node that was believed to be bad (red) slowly starts to improve its reputation in the second half of the simulation, but cannot reach a level from which it started (the starting point was arbitrarily chosen to be 100).
Comparison with different archetypes allows us to use the same Bayesian model and obtain qualitatively different responses to different situations. While broadcasting invalid messages reduces both the rank of the node and its future ability to improve its rating, a lack-of-response should perhaps be treated more forgivingly. In Figure 6, a belief about two other nodes is presented. The first node (green) provided new (unseen) data for 50 rounds, then was offline for 50 rounds, and then returned to provide new data for the final 100 rounds. The second node (yellow) followed a similar pattern, but instead of new data, it shared something less valuable, for example, valid transactions that had already been seen.
In the next test, we simulated 11 nodes that have a different probability of sharing some useful data, from 0%, 10%, 20%, … to 90% and 100% as indicated by the labels on the vertices. Each node can query only two nodes (two incoming connections) but can share data with many nodes (unlimited outcoming connections). After each round, nodes can update their belief about incoming peers. They can also switch one connection if they believe that another node can achieve better performance.
The simulation shows that nodes quickly learn to avoid peers that provide low-quality data. This prevents, for example, a spamming node from impacting the network. As soon as nodes discover that the data is of low quality, the spamming nodes will be isolated and their messages will not be propagated.
A careful reader might ask if such a mechanism can lead to centralisation — where only a handful of nodes serve as hubs. This would be unlikely to occur in a real system, as is shown in the following simulation, as transactions and blocks originate from different (random) nodes within the network. These nodes will constantly change which of their peers is providing them with interesting (“unseen”) data. Additionally, the beliefs about the node’s rank are subjective, so depending on the connections, the same node could be considered to have a high level of performance by one of its peers — and low performance by another.
To illustrate this point we created a highly partitioned network, cf. Fig. 8.
The different partitions are denoted by contrasting colours. We run a simulation, in which a random node in the system produces a block, that is subsequently propagated through the network. Each node can maintain only two incoming connections, but an unlimited number of outgoing links. By observing which node is supplying interesting (previously unseen) blocks, nodes can build private beliefs about a peer’s performance. Periodically, nodes can choose to replace one connection if they believe it will improve their access to new data. We measured the average block propagation time and the overall cost of maintaining the internet connections (where the cost of connections between different regions was 10 times the cost of connection within a region of one colour). The results of the simulations are presented below, in Fig. 9.
This resulted in the network having a far better topology with less partitioning, and this was achieved by switching only a few connections (as you can see comparing the figures 9 and 10, the majority of links remained the same).
We believe that such a self-organising network will improve the block and messaging propagation time and will make the network more resistant to partitioning. We continue to research applications of our ranking model and we are excited to see it incorporated into our smart ledger design.