defining and taxonomizing alluvial diagrams - murmuring in the background

Background

I first encountered alluvial diagrams, so-called, in a widely-shared paper by Martin Rosvall and Carl T. Bergstrom. They were investigating the shifting boundaries between distinct scientific fields (“modules”), as reconstructed from sequential years of journal citation data (“states”), and proposed a specialized flow diagram “to highlight the significant changes, fusions, and fissions that the modules undergo between each pair of successive states”. At the time, i was unfamiliar with Sankey diagrams and only passingly familiar with flow diagrams—mostly by way of my computational biologist colleagues and their literature-derived signal transduction networks—but i could imagine myriad uses for this type of visualization and eventually went searching for an implementation in R.

This led me to the alluvial package, which Michał Bojanowski was actively developing. As i got more comfortable with and excited about the then-ascendant tidyverse, i took it upon myself to put together a ggplot2 extension, which has since become fairly widely used. I later learned that previous developers had released similar extensions, specifically “parallel sets plots” in Thomas Lin Pedersen’s ggforce and in Heike Hofmann and Marie Vendettouli’s ggparallel.¹ With continual exposure to these diverse implementations, i began to notice some subtle but important differences between others’ design principles and my own. It also became clear that there was no consensus distinction between alluvial diagrams/plots and the more well-established genres of Sankey diagrams and parallel sets plots; indeed, there was no consensus on whether a distinction existed! Judging by the ggalluvial issues page and an occasional query on Stack Overflow, the lack of generally accepted terms surrounding these sorts of diagrams remains a source of confusion.² My goal in this post is to propose some vocabulary and a taxonomy to ~~alluviate~~ alleviate this confusion, or at least to serve as a point of reference for others to propose improvements!

A proposed taxonomy for width-encoded diagrams

Since the terms “diagram” and “plot” (along with “chart”) are sometimes used interchangeably and sometimes fiercely contested, i’ll adopt a convention here and invite suggestions to improve it: Diagrams visualize information, charts are diagrams whose information is stored as data, and plots are charts that are uniquely determined from data by a fixed set of plotting rules.³ In these terms, ggalluvial and the other R packages discussed here unambiguously produce plots.

Here is how i think these various diagrams (flow, Sankey, parallel sets, and alluvial) are related:

Flow diagrams encode directed flows. Flow diagrams may use ribbons, arrows, or other graphical elements to represent flows—that is, directed processes. For example, directed network diagrams of resource transmission between nodes are flow diagrams. Note that flows are not necessarily between nodes: Many Sankey diagrams include incoming or outgoing arrows representing flows from or to elements outside the diagram.
Sankey diagrams are flow diagrams with flow weights encoded as ribbon widths. Flow diagrams may be unweighted: Signal transduction networks, for example, usually are. While Sankey diagrams are extremely flexible, their defining characteristic is that weighted flows are represented by ribbons whose widths indicate the weights or “volumes” of flows through them.
Parallel coordinates plots depict cases in a data set by their coordinates along several continuous dimensions, i.e. their values at several continuous variables, arrayed along a discrete axis. Parallel sets plots are analogous to parallel coordinates plots with discrete-valued classificatory dimensions in place of continuous-valued coordinate dimensions. This requires that the plotting rules both determine the order of the classes along each dimension and preserve their relative sizes. In practice, this means that ribbon widths indicate either the absolute weights of the cases or their proportions of the total weight.
Alluvial plots are parallel sets plots in which classes are ordered consistently across dimensions and stacked without gaps at each dimension. This yields a plot with a meaningful continuous axis perpendicular to the discrete axis: The height of a stack at any class is the cumulative weight of the preceding classes, and the stacked sets at different dimensions can be directly compared as stacked bar plots. (Alluvial plots also tend to use splines rather than segments to delineate ribbons, though i think this is far less important than the rules that position the sets and ribbon boundaries.)
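To make the two defining rules concrete, here is a minimal sketch in plain Python (not ggalluvial's actual code) of how strata could be positioned when classes share one order across dimensions and are stacked without gaps. The function name `stack_strata`, the input shape, and the toy "before"/"after" data are all assumptions made for illustration.

```python
from collections import defaultdict


def stack_strata(cases, class_order):
    """Return {dimension: [(class, ymin, ymax), ...]} for gap-free strata.

    cases: list of (weight, {dimension: class}) pairs (hypothetical format).
    class_order: a single list of classes, shared by every dimension.
    """
    # Total weight of each class within each dimension.
    totals = defaultdict(lambda: defaultdict(float))
    for weight, classes in cases:
        for dim, cls in classes.items():
            totals[dim][cls] += weight

    layout = {}
    for dim, by_class in totals.items():
        unknown = set(by_class) - set(class_order)
        if unknown:
            raise ValueError(f"classes missing from class_order: {sorted(unknown)}")
        y = 0.0
        strata = []
        # Same order at every dimension; each stratum starts where the
        # previous one ended, so ymax is a cumulative weight.
        for cls in class_order:
            if cls in by_class:
                strata.append((cls, y, y + by_class[cls]))
                y += by_class[cls]
        layout[dim] = strata
    return layout


cases = [
    (3, {"before": "low", "after": "low"}),
    (2, {"before": "low", "after": "high"}),
    (1, {"before": "high", "after": "low"}),
    (4, {"before": "high", "after": "high"}),
]
layout = stack_strata(cases, ["low", "high"])
```

Because every stack starts at zero and has no gaps, the upper edge of each stratum can be read off a shared perpendicular axis, just as in a stacked bar plot.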

Examples

I’m not rendering any plots in this post, so i’ll illustrate these distinctions by pointing to several examples of each, with an emphasis on examples that are “mislabeled” according to my taxonomy:

Flow diagrams that are not Sankey diagrams:
- Typical Signal Schedule and Traffic Flow Diagram, Wikimedia Commons
- VA Business Line Finance and Accounting, Wikimedia Commons
- Process flow diagram, Alonso et al (2017)
Sankey diagrams:
- The Thermal Efficiency of Steam Engines, Wikimedia Commons
- Sankey Diagram of US Consumer Expenditure in 2012, Wikimedia Commons
- Earth heat balance Sankey diagram, Wikimedia Commons
- Sankey diagram, Alonso et al (2017)
Parallel sets plots that are not alluvial plots:
- Titanic Survivors, Jason Davies
- Stacked area alluvial diagram, Xenographics (note that the vertical axis applies only to the area plot)
- Sankey Diagram - Income Statement, Wikimedia Commons
- Mapping change in science, PLoS ONE
- Sankey diagram on global green virtual water flows, Serrano, Guan, Duarte, and Paavola (2016)
Alluvial plots:
- Superstore’s Super Sankey, The Information Lab
- How Scotland’s political geography changed, seat by seat, The Guardian
- Alluvial diagram for mapping changes in the Global network, Lu and Brelsford (2014)

How my proposal stacks up

Without exhaustively surveying the Internet and technical literature, it’s worthwhile to benchmark my distinctions against those made by some popular chart catalogues, which are much more representative of usage patterns. Xenographics lists several at the end of their discussion of graphonyms, and of these, three make clear distinctions between some of the types described above:

The DataViz Project describes Sankey diagrams as i did above, and distinguishes alluvial plots from parallel sets plots only by the orientation of the axes (horizontal versus vertical) and the shapes of the connecting ribbons.
The Data Visualization Catalogue distinguishes parallel sets plots from Sankey diagrams in that they do not use arrows and that they bin the flows (ribbons) at regular intervals. (They don’t have an entry on alluvial plots.)
The Visualization Universe has entries for Sankey diagrams and alluvial plots, but their descriptions are excerpts (or, at least, subsets) of those of the DataViz Project.

Importantly, in my taxonomy, alluvial plots are not Sankey diagrams, nor even flow diagrams. This seems appropriate on reflection, since even the original alluvial diagrams did not represent the transmission of material or information between nodes but changes in classification over time.⁴ This is not to say that alluvial plots cannot represent flow data (several popular examples do), but that the plot elements are not specific to flow data and that, as a result, the directedness of flow data may not be conveyed well in an alluvial plot.⁵ I seem to be in agreement with the popular catalogues here.⁶

A more subtle pattern of usage reflected in my proposal is that parallel sets plots may resize flows from dimension to dimension to reflect changes in proportion that do not necessarily correspond to changes in amount. This must be done carefully in such plots, but it would be anathema to a Sankey diagram. I’ll also endorse Elijah Meeks’ point, from this Medium post, that Sankey diagrams can include cycles, whereas parallel sets and alluvial plots would not be able to encode such patterns.
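A small numeric sketch may help show how a change in proportion can appear without a change in amount. It assumes a hypothetical table of class weights at two time points whose totals differ; the function name `stratum_heights` and the data are made up for illustration.

```python
def stratum_heights(counts, proportional=False):
    """Heights of each class at each dimension.

    counts: {dimension: {class: weight}} (hypothetical format).
    proportional=True rescales every dimension to a total height of 1.
    """
    heights = {}
    for dim, by_class in counts.items():
        scale = sum(by_class.values()) if proportional else 1
        heights[dim] = {cls: weight / scale for cls, weight in by_class.items()}
    return heights


# Class A keeps the same amount, but the total shrinks from 100 to 80.
counts = {"t1": {"A": 40, "B": 60}, "t2": {"A": 40, "B": 40}}
absolute = stratum_heights(counts)
relative = stratum_heights(counts, proportional=True)
```

In the absolute encoding, class A has the same height at both time points. In the proportional encoding it grows from 0.4 to 0.5 of the stack, even though no weight was added to it.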

I’m also making a distinction between alluvial and parallel sets plots that the catalogues (and, in my experience, other developers) don’t make. Indeed, i haven’t seen a discussion anywhere else of whether a parallel sets or alluvial plot puts repeated categories in the same order, or whether the parallel sets are stacked so that the perpendicular axis measures their cumulative size, or whether either of these features is relevant to the choice of which type of plot is better-suited to a given purpose. Having spent hours on the problem of whether different orderings might benefit a plot, and having been prompted several times to implement gaps between the sets, i’ve come to take the strong position that this distinction matters and that the terminology should reflect it. While good definitions describe usage rather than prescribe it, i think it’s worth making an argument for this particular technical distinction while the terminology has not yet, ahem, sedimented.

A prescription for distinguishing alluvial and parallel sets plots

To reiterate:

Alluvial plots are parallel sets plots in which classes are ordered consistently across dimensions and stacked without gaps at each dimension.
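One way to read "ordered consistently" is that a single class order exists that every dimension's stacking agrees with. Under that assumption (which is one reading of the definition, not a rule taken from any package), the check can be sketched with Python's standard `graphlib` module (Python 3.9 or later). The function name `shared_order` and the example orderings are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter


def shared_order(orders):
    """Return one class order compatible with every dimension, or None.

    orders: {dimension: [classes in the order they are stacked]}.
    """
    sorter = TopologicalSorter()
    for stacked in orders.values():
        for cls in stacked:
            sorter.add(cls)
        # Each class must come after the one stacked just before it.
        for lower, upper in zip(stacked, stacked[1:]):
            sorter.add(upper, lower)
    try:
        return list(sorter.static_order())
    except CycleError:
        # Two dimensions (directly or indirectly) disagree on the order.
        return None


fixed = {"2019": ["low", "mid", "high"], "2020": ["low", "high"]}
by_size = {"2019": ["low", "high"], "2020": ["high", "low"]}

consistent = shared_order(fixed)
conflicting = shared_order(by_size)
```

Here `fixed` admits a shared order, while `by_size` (as might result from sorting each dimension's sets by size) does not, so under this reading it would yield a parallel sets plot that is not alluvial.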

Caveat

Neither Rosvall and Bergstrom, who popularized alluvial diagrams, nor Bojanowski, on whose package i based ggalluvial, included a cumulative weight axis perpendicular to the dimensions axis. In fact, what originally prompted me to omit the gaps between strata in ggalluvial was that i didn’t know how to get rid of the vertical axis! Like every great idea i believe i’ve had, i arrived at this one via gradient descent.

That’s still not to say i was first: Hofmann and Vendettouli wrote ggparallel to stack the sets in each dimension and retain a vertical axis in their plots. There may well be other such implementations, but i’m most familiar with the R ecosystem.

Utility

First, i claim that the features that distinguish alluvial from parallel sets plots—consistent ordering of sets and a cumulative weight axis—have practical importance. In particular, they have importance beyond the ability to visually distinguish the sets and compare their weights along each dimension, as can be done from any parallel sets plot. The use cases i’ve surveyed reveal three distinct settings in which alluvial plots are superior to other parallel sets plots:⁷

Repeated categorical measures data: Several users have used alluvial plots to represent data consisting of partitions of cases into the same (or overlapping) classification schemes at different times, in particular before and after some intervention or other significant event. See scholarly examples in Baudron, Ndoli, Habarurema, and Silva (2019), Kissinger and Reznik (2019), Muenchow, Schäfer, and Krüger (2019), Chong et al (2019), and Schlotter et al (2019), and tweeted examples by KenSteif, frau_dr_barber, ericpgreen, and 5amStats (starboard image). For these diagrams to communicate the data efficiently, it is essential that the classes be consistently ordered. Some implementations of parallel sets plots default to this behavior, as showcased by Matt Harris; but others may automatically sort the sets by size, and still others allow arbitrary orderings—which may be useful for interactive exploration, as with Meeks’ implementation, but inappropriate for static renderings.

Multipartite network data: A handful of users have used alluvial plots to represent multipartite graphs, which necessarily satisfy the constraint that the total degree of the nodes in each part is the same. In particular, genomic analyses produce one-to-one connections among stages in transcription processes (lncRNA, miRNA, and mRNA), which have been encoded into alluvial plots by Zheng et al (2018), Long et al (2019a), and Long et al (2019b). See Vazquez Bernat et al (2019) for a similar usage, and MyriamCTraub and BenMoretti on Twitter. A similar principle is at work in Watanabe Smith’s illustration of ranked-choice voting, especially with respect to those occasions when a voter lost influence by only ranking a few of the options (which segues into the next setting). While these users excluded the cumulative weight axis, through the fixed heights of the stacked sets the plots communicate that the total degree at each stage is the same.

Censored data: Finally, something i think alluvial plots do much better than general parallel sets plots is accentuate censoredness in data. Check out scholarly articles by Seekatz et al (2018) and Sjoding, Gong, Hass, and Iwashyna (2019) and David Neuzerling’s reflections on the job hunt, which use alluvial plots to depict changes in subjects’ status across several time points or stages, with a specific set (or blank space where it would be) for subjects who became unavailable later in the study. The cumulative weight axis and gridlines allow the reader to immediately discern the reduction in sample size at each step. (Though they use network data, Lu and Brelsford (2014) make similar use of this property to visualize non-connections between sets together with connections.)
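The censoring case can be sketched with toy data: subjects who drop out are tallied into an explicit stratum, so every stack keeps the same total height and the shrinking observed sample stays visible. The record format, the "unavailable" label, and the function name `strata_with_censoring` are assumptions for illustration, not taken from the cited studies.

```python
from collections import Counter


def strata_with_censoring(status_by_subject, times, label="unavailable"):
    """Count subjects per status at each time point.

    status_by_subject: {subject: {time: status}}; a time missing from a
    subject's record is treated as censored and counted under `label`.
    """
    counts = {}
    for t in times:
        tally = Counter(
            record.get(t, label) for record in status_by_subject.values()
        )
        counts[t] = dict(tally)
    return counts


records = {
    "s1": {"week0": "ill", "week4": "well", "week8": "well"},
    "s2": {"week0": "ill", "week4": "ill"},
    "s3": {"week0": "ill", "week4": "well", "week8": "well"},
    "s4": {"week0": "ill"},
}
counts = strata_with_censoring(records, ["week0", "week4", "week8"])

# Observed sample size at each time point, excluding the censored stratum.
observed = {
    t: sum(n for status, n in by_status.items() if status != "unavailable")
    for t, by_status in counts.items()
}
```

With these records every stack sums to four subjects, while the observed sample falls from four to three to two, which is the reduction a cumulative weight axis lets the reader see at a glance.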

That’s the substance of my argument for the alluvial–parallel sets distinction, but i’ll finish with a bit of flourish.

Connotativity

The special features of alluvial plots are connoted by their peculiar terminology: I’ve decided to call the rectangles representing the parallel sets “strata” to suggest that they are more stable than the crisscrossing alluvia, and indeed this stability (with respect to their order in the plot) is what users expect when many of the categorical dimensions classify subjects into the same categories. The term also suggests, as does “alluvia”, that the various sets into which the subjects are partitioned at each (usually horizontal) position along the dimension axis are themselves (vertically) positioned in accordance with gravity. That is, they have “settled” one atop another with no defiantly empty space in between.

Thus i deposit my case.

Coda

I am by no means an expert in data visualization! While i feel strongly that my taxonomy makes the best of the present scrambling of terms, i am quite open to countersuggestions and especially to use cases that undercut it. If you come across them, please do send them my way.

¹ I only just discovered Yawei Ge and Hofmann’s ggpcp package for general parallel coordinate plots, under active development!
² That’s not to say that useful distinctions haven’t been made somewhere, e.g. the technical literature on data visualization, but i feel confident in claiming that they have had limited effects on practice!
³ In ggplot2, these rules are the stat, geom, coord, and scale layers. One of the great contributions of ggplot2, in my view, was to make them explicit to lay users like myself.
⁴ It’s unfortunate that i named the ribbons between adjacent axes flows in ggalluvial, but i maintain that it was preferable to calling them fans.
⁵ A similar misfit is a simplicial complex represented by a network diagram: The type of diagram is designed for a different type of data (pairwise-relational with possible directedness and multiplicity) and fails to convey essential data elements (higher-dimensional simplices).
⁶ A contrary example is RAWGraphs, which describes alluvial plots as “a specific kind of Sankey diagrams”.
⁷ I found most of these examples by searching for “ggalluvial” in Google Scholar or Twitter.