Having previously tried to use force-directed layout algorithms on large networks, I was very intrigued by
Stefan Hachul and Michael Junger’s article Large Graph-Layout Algorithms at Work: An Experimental Study. In my experience, trying to generate a layout for a large graph results in little more than a hairball and the sense that one really ought to focus on just a small subgraph.
With the recent development of increasingly sophisticated layout algorithms, Hachul and Junger compare the performance of several classical and more recent algorithms. Using a collection graphs – some relatively easy to layout and some more challenging – the authors compare the runtime and aesthetic output.
All the algorithms strive for the same aesthetic properties: uniformity of edge length, few edge crossings, non-overlapping nodes and edges, and the display of symmetries – which makes aesthetic comparison measurable.
Most of the algorithms performed well on the easier layouts. The only one which didn’t was their benchmark Grid-Variant Algorithm (GVA), a spring-layout which divides the drawing area into a grid and only calculates the repulsive forces acting between nodes that are placed relatively near to each other.
For the harder graphs, they found that the Fast Multipole Multilevel Method (FM3) often produced the best layout, though it is slower than High-Dimensional Embedding (HDE) and the Algebraic Multigrid Method (ACE), which can both produce satisfactory results. Ultimately, Hachul and Junger recommend as practical advice: “first use HDE followed by ACE, since they are the fastest methods…if the drawings are not satisfactory or one supposes that important details of the graph’s structure are hidden, use FM3.”
What’s interesting about this finding is that HDE and ACE both rely solely on linear algebra rather than the physical analogies of force-directed layouts. FM3, on the other hand – notably developed by Hachul and Junger – is force-directed.
In ACE, the algorithm minimizes the quadratic form of the Laplacian (xTLx), finding the eigenvectors of L that are associated with the two smallest eigenvalues. Using an algebraic multigrid algorithm to calculate the eigenvectors makes the algorithm among the fastest tested for smaller graphs.
By far the fastest algorithm was HDE, which takes a really interesting, two-step approach. First approximating a high-dimensional k-clustering solution and then projecting those clusters into 2D space by calculating the eigenvectors of the covariance matrix from the clusters. The original paper describing the algorithm is here.
Finally, the slower but more aesthetically reliable FM3 algorithm improves upon classic force-direct approaches by relying on an important assumption: in large graphs, you don’t necessarily have to see everything. In this algorithm, “subgraphs with a small diameter (called solar systems) are collapsed” resulting in a final visualization which captures the structure of the large network with the visual ease of a smaller network.