written 5.7 years ago by |
In order to find out between edges, we need to calculate shortest paths from going through each of the edges.
Girvan - Newman Algorithm visits each node X once and computes the number of shortest paths from X to each of the other nodes that go through each of the edges.
The algorithm begins by performing a breadth first search [BFS] of the graph, starting at the node X.
The edges that go between node at the same level can never be a part of a shortest path from X.
Edges DAG edge will be part of at-least one shortest path from root X.
Consider the graph as follows:
Following figure shows the BFS representation starting at node E. solid edges are DAG edges and dashed edges connect nodes at the same level.
The second step of GN algorithm is to label each node by the number of shortest paths that reach it from the root.
Start by labeling the root 1, then, from the top down, label each node Y by the sum of the labels of its parents.
The labeling of nodes is shown in above figure.
The final step is to calculate for each edge e the sum over nodes Y of the fraction of shown paths from the root X to Y than go through e.
Each node other than root given a credit of 1.
This credit may be divided among nodes and edges above.
The rules for calculation are as follows:
Each leaf in DAG gets a credit.
Each non leaf node gets a credit equals to 1 plus the sum of the credits of the DAG edges from that node to the level
A DAG edge e entering node Z from the level above is given a share of the credit of Z. proportional to the fraction of shortest paths from the root to Z that go through e.
After performing the credit calculate with each node as the root, we sum the credits for each edge.
As each shortest path will have been discovered twice, we must divide the result/credit for each edge by 2.
Following graph shows the calculation for DAG starting from node E.
To complete the betweeness calculation, we have to repeat this calculation for every node as the root and sum the contributions.
After calculations, following graph shows final betweenness values:
We can cluster by taking the in order to increasing betweenness and add them to the graph at a time.
We can remove edge with highest value to cluster the graph.
In the example graph we remove edge BD to get two communities as follows: