Representing the Structure of Data

To be perfectly honest, I had never thought much about graph layout algorithms. You hit a button in Gephi or call a networkx function, some magic happens, and you get a layout. If you don’t like the layout generated, you hit the button again or call a different function.

In one of my classes last year, we generated our own layouts using eigenvectors of the Laplacian. This gave me a better sense of what happens when you use a layout algorithm, but I still tended to think of it as a step which takes place at the end of an assignment; a presentation element which can make your research accessible and look splashy.

In my visualization class yesterday, we had a guest lecture by Daniel Weidele, PhD student at University of Konstanz and researcher at IBM Watson. He covered fundamentals of select network layout algorithms but also spoke more broadly about the importance of layout. A network layout is more than a visualization of a collection of data, it is the final stage of a pipeline which attempts to represent some phenomena. The whole modeling process abstracts a phenomena into a concept, and the represents that concept as a network layout.

When you’re developing a network model for a phenomenon, you ask questions like “who is your audience? What are the questions we hope to answer?” Daniel pointed out that you should ask similar questions when evaluating a graph layout; the question isn’t just “does this look good?” You should ask: “Is this helpful? What does it tell me?”

If there are specific questions you are asking your model, you can use a graph layout to get at the answers. You may, for example, ask: “Can I predict partitioning?”

This is what makes modern algorithms such as stress optimization so powerful – it’s not just that they produce pretty pictures, or even that they layouts appropriately disambiguate nodes, but they actually represent the structure of the data in a meaningful way.

In his work with IBM Watson, Weidele indicated that a fundamental piece of their algorithm design process is building algorithms based on human perception. For a test layout, try to understand what a human likes about it, try to understand what a human can infer from it – and then try to understand the properties and metrics which made that human’s interpretation possible.


Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.