Why graph
Network visualisation is no trivial matter; indeed, it is very important for at least two reasons.
First, visualisation is a crucial part of the process of data analysis. As a first step, network visualisation – or graphing – offers us a way to vet our data for anything strange that might be going on, both revealing and informing our assumptions and intuitions. The following image shows the famous Anscombe’s quartet: datasets whose summary statistics are nearly identical, and whose differences are only revealed when they are graphed.
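As a minimal sketch of this point, the following uses the `anscombe` data frame that ships with base R. This dataset is an assumption of the example rather than something the tutorial itself loads, and `quartet_stats` is only an illustrative name. It computes the same summary statistics for each of the four x/y pairs and then draws the four scatterplots.

```r
# Summary statistics for the four x/y pairs in R's built-in `anscombe` data
quartet_stats <- sapply(1:4, function(i) {
  x <- anscombe[[paste0("x", i)]]
  y <- anscombe[[paste0("y", i)]]
  c(mean_x = mean(x), mean_y = mean(y),
    var_x = var(x), var_y = var(y),
    cor_xy = cor(x, y))
})
colnames(quartet_stats) <- paste("Set", 1:4)
round(quartet_stats, 2)

# Plot each pair in a 2 x 2 grid to compare the shapes of the four datasets
op <- par(mfrow = c(2, 2))
for (i in 1:4) {
  plot(anscombe[[paste0("x", i)]], anscombe[[paste0("y", i)]],
       xlab = paste0("x", i), ylab = paste0("y", i), pch = 19)
}
par(op)  # restore the previous plotting parameters
```

The rounded table should look almost the same across the four columns, whereas the four panels should not.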

As Tufte (1983: 9) said:
“At their best, graphics are instruments for reasoning about quantitative information. Often the most effective way to describe, explore, and summarize a set of numbers – even a very large set – is to look at pictures of those numbers”
All of this is crucial with networks. Drawing network graphs is key to exploring and understanding both the global structure of a network and smaller-scale structures such as nodal positions or communities within it.
Second, visualisation is a crucial part of communicating to others the lessons that we have learned through investigation. As Brandes et al. (1999) argue, visualisation involves thinking about the substance of what you are trying to communicate, how to design it so that it is ergonomic and (ideally) aesthetic, and which algorithm is most appropriate to lay out the graph informatively. The aim is to offer a concise and precise delivery of insights.
There may be some dead-ends and time-sinks involved in visualising your data, but it is worth taking the time to explore your data and experiment with ways to make what you have learned over a longer period of time evident to others in a shorter period of time.
Different graphics approaches in R
To understand graph and network metric visualisation with
{autograph}, it is useful to review the different
approaches taken already in R. There are several main packages for
plotting in R, as well as several for plotting networks in R. Plotting
in R is typically based around two main approaches:
- the ‘base’ approach in R by default, and
- the ‘grid’ approach made popular by the famous and very flexible
{ggplot2} package.[1]
In the case of base R graphics, plots are essentially written straight to the plotting device. This means that they are not easily modified after the fact. You would need to replot the whole thing to change something. Moreover, while there is an admirably clean aesthetic to base R graphics, it can be difficult to modify or extend them to your needs.
In the case of grid graphics, plots are built up in layers, and thus
can be modified after the fact. That is, you can initialise a plot using
ggplot2::ggplot(), specifying the data and mapping
variables to various aesthetic features, and then add layers to it using
+ to add further points and lines, but also titles,
legends, etc.
The following figure illustrates the difference between these two approaches.
# Base R: the plot is drawn straight to the plotting device
plot(mtcars$hp, mtcars$mpg,
     main = "Base R: MPG vs Horsepower",
     xlab = "Horsepower",
     ylab = "Miles per Gallon",
     pch = 19,
     col = "blue")

# ggplot2: the plot is built up in layers joined with `+`
library(ggplot2)
ggplot(mtcars, aes(x = hp, y = mpg)) +
  geom_point(color = "blue") +
  labs(title = "ggplot2: MPG vs Horsepower",
       x = "Horsepower",
       y = "Miles per Gallon")
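To make the point about modifying a plot after the fact concrete, here is a small sketch using only {ggplot2} and the built-in `mtcars` data. The object names `p` and `p2` are illustrative. The plot is stored as an object first and only extended with a further layer and labels afterwards.

```r
library(ggplot2)

# Build the plot once and keep it as an object
p <- ggplot(mtcars, aes(x = hp, y = mpg)) +
  geom_point(color = "blue")

# Later, add a further layer and labels to the stored object
p2 <- p +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE, color = "grey40") +
  labs(title = "ggplot2: MPG vs Horsepower",
       x = "Horsepower",
       y = "Miles per Gallon")

p2  # printing the object is what draws it
```

With base R graphics, the equivalent change would mean calling `plot()` again and redrawing everything.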
[1] Perhaps of interest, gg stands for the Grammar of Graphics (https://doi.org/10.1007/0-387-28695-0).
Different graphing approaches in R
Approaches to plotting graphs or networks in R can be similarly divided:
- two classic packages, {igraph} and {sna}, both build upon the ‘base’ R graphics engine,
- newer packages {ggnetwork} and {ggraph} build upon a ‘grid’ approach.[2]
In this tutorial, we’re going to use the fict_lotr
dataset from the {manynet} package. Let’s see how this
would be plotted using {igraph} and {ggraph},
adding a title to each to facilitate comparison, but otherwise relying
on default behaviour.
plot(as_igraph(fict_lotr),
     main = "igraph: fict_lotr")
ggraph::ggraph(as_tidygraph(fict_lotr)) +
  ggraph::geom_edge_link() +
  ggraph::geom_node_point() +
  ggplot2::ggtitle("ggraph: fict_lotr")
We can see here that {igraph} plots the network in a
fairly basic way, straight to the plotting device (window). By default,
it uses a force-directed layout (see Layouts below),[3] colors the nodes orange, and prints
node labels if they have them. However, the layout is not optimised for
the size of the plotting window, the node labels are regularly
overlapping, and the orange color with black borders is not particularly
appealing or helpful for label legibility. It only works with ‘igraph’
objects.
In contrast, {ggraph} offers the trademark flexibility
of the grammar of graphics approach. However, it requires the user to
build up a plot from the ground up, which can be daunting for new users
and fiddly even for experienced ones. Four lines are required to get
even a basic plot, with an additional line required if a grey background
is not desired. No labels or other information are added by default, and
would also require additional lines. It works with ‘tidygraph’ objects,
which are an additional layer on top of ‘igraph’ objects.
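As a hedged sketch of what those additional lines might look like, the following adds node labels and drops the grey background. It assumes that the node labels of `fict_lotr` end up in a column called `name` once the network is coerced with `as_tidygraph()`, and it uses `theme_void()` from {ggplot2} as one of several ways to remove the background.

```r
library(manynet)  # assumed to provide fict_lotr and as_tidygraph()
library(ggraph)   # also attaches ggplot2

p <- ggraph(as_tidygraph(fict_lotr)) +
  geom_edge_link(edge_colour = "grey60") +
  geom_node_point() +
  # assumes the node labels are stored in a `name` column
  geom_node_text(aes(label = name), repel = TRUE, size = 3) +
  theme_void() +  # removes the grey panel background and axes
  ggtitle("ggraph: fict_lotr, with labels")
p
```

That is six layers for a labelled graph on a plain background.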
[2] Others include: ‘Networkly’ for creating 2-D and 3-D interactive networks that can be rendered with plotly and can be easily integrated into shiny apps or markdown documents; ‘visNetwork’ interacts with javascript (vis.js) to make interactive networks (http://datastorm-open.github.io/visNetwork/); and ‘networkD3’ interacts with javascript (D3) to make interactive networks (https://www.r-bloggers.com/2016/10/network-visualization-part-6-d3-and-r-networkd3/).
[3] Which incidentally returns a different layout each time it is run.
Who draws first
{autograph} builds upon these packages, but takes a
somewhat different approach. Because it builds upon the ‘grid’ approach
of {ggplot2} and {ggraph}, it lends itself to
the additional layering and flexibility of those packages. Because it
depends on the coercion routines available in {manynet}, it
can be used with network-related objects from most common network
analysis packages. However, unlike those packages it offers concise and
easy-to-use functions for drawing graphs that offer sensible defaults
for most common use cases, using the information that is available in
the network object.
The first thing you will want to do when you import or create a new
network dataset is draw it. Let’s say that we’re still interested in the
fict_lotr dataset from the {manynet} package.
Compared to the {igraph} and {ggraph} examples
before, we can see that autograph::graphr() offers a much
more concise way to draw the network.
graphr(fict_lotr)
The package also offers methods for plotting statistics related to
networks (e.g. degree distributions) and models of them
(e.g. goodness-of-fit plots). But we’ll get to that later, and most of
these are demonstrated in other vignettes, tutorials, or packages.
{autograph} also offers consistent theming across graphs
and plots, so that you do not need to keep specifying the same options
over and over again.
In the following sections, we’re going to go through a number of different ways of taking control of the graphing process.
Illustrating graphs
Once we have an initial graph of our network, we can start to explore features of the network and its structure in more detail. There are a number of different dimensions network researchers can play with to illustrate different aspects of the network. On her excellent and helpful website, Katya Ognyanova outlines some of these dimensions:
| Nodes | | Ties | |
|---|---|---|---|
| Position | layout= | Arrows | (e.g. capped, head shape) |
| Labels | labels=, node_group= | Type | (e.g. solid, dashed) |
| Shape | node_shape= | Shape | (e.g. straight, bent) |
| Size | node_size= | Size | edge_size= |
| Color | node_color=, node_colour= | Color | edge_color=, edge_colour= |
Currently, only those options with named parameters in the table above
can be customised in {autograph}. Tie arrows and shapes are used to indicate directionality and
reciprocity, where present in the data. Let’s go through some of these
options in more detail.
Shaping nodes
One of the first things we might be interested in doing is
understanding better the distribution of some categorical variable. Our
fict_lotr dataset contains a variable called
Race, so let’s try and change the shape of the nodes by
this variable. Following the syntax shown in the table above, we just
need to reference the variable name in the node_shape
argument.
fict_lotr
graphr(fict_lotr, node_shape = "Race")
We can see that there are six different races present.[4] Unfortunately, this is a few too many categories to be effectively distinguished by shape.
[4] Though the keen-eyed and well-read among you will have noticed that there are some racial assignments that are debatable.
Colouring nodes
Let’s try instead colouring the nodes by this “Race” variable. It is very similar to the shape example above. Can you try and complete the code yourself?
graphr(fict_lotr, node_color = "Race")
That’s much easier to read. Since the same colours seem to be clustered together, with the humans and hobbits each clustered together in the centre of the graph, and the elves clustered towards the left, we might infer that there is some ‘homophily’ going on here – a topic for another tutorial.
An alternative to coloring the nodes is to use the ‘node_group’ argument to highlight groups in a network. This puts a shaded area around nodes of the same group. For rather spatially clustered distributions, this can be a very effective way to show groupings, but can be sensitive to the layout used. If nodes of the same group are not close together, the shaded areas can overlap and make the graph harder to read.
graphr(fict_lotr, node_group = "Race")
Note that node_color and node_group can be
used together, either to highlight different groups or to emphasise
group assignment where there is the kind of interpenetration or overlap
described above as a challenge.
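A minimal sketch of combining the two arguments follows. It assumes that {manynet} supplies `fict_lotr` and {autograph} supplies `graphr()`, as elsewhere in this tutorial, and it passes the same variable to both arguments to emphasise group assignment.

```r
library(manynet)    # assumed to provide the fict_lotr dataset
library(autograph)  # assumed to provide graphr()

# Colour the nodes by Race and also shade an area around each Race
p <- graphr(fict_lotr, node_color = "Race", node_group = "Race")
p
```

Passing a different variable to `node_group` would instead highlight a second grouping on top of the colours.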
Sizing nodes
What about if we’re interested in a continuous variable instead of a
categorical variable? While the fict_lotr dataset does not
contain any continuous nodal variables, we can create one rather easily
from the network itself. Let’s use each node’s degree, which is the
number of ties incident/connecting to the node.
fict_lotr %>%
  mutate(Degree = node_deg(fict_lotr)) %>%
  graphr(node_size = "Degree")
Tying up loose ends
All this works similarly with ties/edges. Just replace
node_ with edge_ in the arguments above, and
you can control edges’ size and color. Have a try yourself by adding
some additional variables to the data. Try adding in a binary variable
to each tie called ‘is_tri’ that indicates whether the tie is a part of
a triangle or not. If you add a continuous variable to each tie called
‘weight’, and a categorical variable to the ties called ‘type’, then
graphr() will even try to use this information
automatically.
fict_lotr %>%
  mutate_ties(weight = tie_closeness(fict_lotr),
              is_tri = tie_is_triangular(fict_lotr)) %>%
  graphr(edge_color = "is_tri")
Theming
Setting a theme
Perhaps you are preparing a presentation, representing your
institution, department, or research centre at home or abroad. In this
case, you may wish to theme the whole network with institutional colors
and fonts. Indeed, you may even want to set a theme that is then reused
across all your graphs and plots. {autograph} offers a
number of themes that can be set using the stocnet_theme()
function.
stocnet_theme("default")
graphr(fict_lotr, node_color = "Race")
stocnet_theme("iheid")
graphr(fict_lotr, node_color = "Race")
More institutional scales and themes are available, and more can be implemented upon pull request.
Who’s hue?
By default, graphr() will use a color palette that
offers fairly good contrast and better accessibility. However, a
different hue might offer a better aesthetic or identifiability for some
nodes. Because the graphr() function is based on the
grammar of graphics, it’s easy to extend or alter aesthetic aspects.
Here let’s try and change the colors assigned to the different races in
the fict_lotr dataset.
graphr(fict_lotr,
node_color = "Race")
graphr(fict_lotr,
node_color = "Race") +
ggplot2::scale_colour_hue()
Grayscale
Other times color may not be desired. Some publications require
grayscale images. To use a grayscale color palette, replace
_hue from above with _grey (note the ‘e’
spelling):
graphr(fict_lotr,
node_color = "Race") +
ggplot2::scale_colour_grey()
As you can see, grayscale is more effective for continuous variables, or for discrete variables with fewer categories than the six used here.
Manual override
Or we may want to choose particular colors for each category. This is
pretty straightforward to do with
ggplot2::scale_colour_manual(). Some common color names are
available, but otherwise hex color codes can be used for more specific
colors. Unspecified categories are coloured (dark) grey.
graphr(fict_lotr,
       node_color = "Race") +
  ggplot2::scale_colour_manual(
    values = c("Dwarf" = "red",
               "Hobbit" = "orange",
               "Maiar" = "#DEC20B",
               "Human" = "lightblue",
               "Elf" = "lightgreen",
               "Ent" = "darkgreen")) +
  labs(color = "Color")
Titles, labels, and legends
When it comes to communicating insights from network graphs to others, it is important to add in the contextual information that will help them understand what they are looking at. In this section, we will learn how to add titles, labels, and legends to graphs.
Labels
With our fict_lotr example above, because the network is
itself labelled, graphr() automatically adds in the node
labels because they are available. If you do not want these labels, you
can remove them from the network before passing it on to
graphr(), or you can use the argument
labels = FALSE.
graphr(fict_lotr, labels = FALSE)
Without the labels, the structure of the network is clearer and easier to interpret, though we lose the information about which node is which character.
Titles
{autograph} works well with both {ggplot2}
and {ggraph} functions that can be appended to create more
tailored visualisations. Let’s try this by adding a title to a plot.
Append (with a +) labs(title = ) to add a
title to a plot, say “My visualisation”, and then add also a subtitle (an
argument to that function), say “I did this”.
graphr(fict_lotr) +
labs(title = "My visualisation",
subtitle = "I did this")
Note that you can also use ggtitle() to do the same
thing, but if you just remember labs() you can also use it
to add labels for x and y axes, and legends (see
below).
Legends
While {autograph} attempts to provide legends where
necessary, in some cases the legends offer insufficient detail, such as
in the following figure, or are absent.
fict_lotr %>%
mutate(maxbet = node_is_max(node_betweenness(fict_lotr))) %>%
graphr(node_color = "maxbet")
{autograph} supports the {ggplot2} way of
adding legends after the main plot has been constructed, using
guides() to add in the legends, and labs() for
giving those legends particular titles. Note that we can use
"\n" within the legend title to make the title span
multiple lines.
fict_lotr %>%
  mutate(maxbet = node_is_max(node_betweenness(fict_lotr))) %>%
  graphr(node_color = "maxbet") +
  guides(color = "legend") +
  labs(color = "Maximum\nBetweenness")
To change the position of the legend, add the theme()
function from {ggplot2}. The legend can be positioned at
the top, bottom, left, or right, or removed using “none”.
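For instance, a sketch of moving the legend below the graph could look as follows. It assumes the same `fict_lotr` data and `graphr()` call used earlier in this tutorial; `legend.position` is a standard {ggplot2} theme element.

```r
library(manynet)    # assumed to provide the fict_lotr dataset
library(autograph)  # assumed to provide graphr()
library(ggplot2)

# Place the colour legend underneath the graph
p <- graphr(fict_lotr, node_color = "Race") +
  theme(legend.position = "bottom")
p
```

Swapping `"bottom"` for `"none"` removes the legend altogether.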
Layouts
The aim of graph layouts is to position nodes in a (usually) two-dimensional space so as to optimise some analytic or aesthetic criterion. There is a lot to which one could potentially pay attention. Quality measures might include:
- minimising the crossing number of edges/ties in the graph (planar graphs require no crossings)
- minimising the slope number of distinct edge slopes in the graph (where vertices are represented as points on a Euclidean plane)
- minimising the bend number in all edges in the graph (every graph has a right angle crossing (RAC) drawing with three bends per edge)
- minimising the total edge length
- minimising the maximum edge length
- minimising the edge length variance
- maximising the angular resolution or sharpest angle of edges meeting at a common vertex
- minimising the bounding box of the plot
- evening the aspect ratio of the plot
- displaying symmetry groups (subgraph automorphisms)
Graph layouts available in the {igraph},
{ggraph}, {graphlayouts}, and
{autograph} packages can be used in graphr().
These can be specified using the layout argument. In the
following sections, we review some of the most common types of
layouts.
Force-directed layouts
Force-directed layouts update some initial placement of vertices through the operation of some system of metaphorically-physical forces. These might include attractive and repulsive forces.
(graphr(ison_southern_women, layout = "kk") + ggtitle("Kamada-Kawai") |
graphr(ison_southern_women, layout = "fr") + ggtitle("Fruchterman-Reingold") |
graphr(ison_southern_women, layout = "stress") + ggtitle("Stress Minimisation"))
The Kamada-Kawai (KK) method inserts a spring between all pairs of vertices that is the length of the graph distance between them. This means that edges with a large weight will be longer. KK offers a good layout for lattice-like networks, because it will try to space the network out evenly.
The Fruchterman-Reingold (FR) method uses an attractive force between directly connected vertices, and a repulsive force between all vertex pairs. The attractive force is proportional to the edge’s weight, thus edges with a large weight will be shorter. FR offers a good baseline for most types of networks.
The Stress Minimisation (stress) method is related to the KK
algorithm, but offers better runtime, quality, and stability and so is
generally preferred. Indeed, {manynet} uses it as the
default for most networks. It has the advantage of returning the same
layout each time it is run on the same network.
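The difference in reproducibility can be checked directly on the layout coordinates. The sketch below calls the underlying layout functions in {igraph} and {graphlayouts} rather than going through `graphr()`, and uses the Zachary karate club graph bundled with {igraph} purely as an example. Both choices are assumptions made for illustration.

```r
g <- igraph::make_graph("Zachary")

# Fruchterman-Reingold starts from random positions,
# so two unseeded runs give different coordinates
fr_1 <- igraph::layout_with_fr(g)
fr_2 <- igraph::layout_with_fr(g)
isTRUE(all.equal(fr_1, fr_2))

# Fixing the random seed before each run makes the FR layout reproducible
set.seed(123)
fr_3 <- igraph::layout_with_fr(g)
set.seed(123)
fr_4 <- igraph::layout_with_fr(g)
identical(fr_3, fr_4)

# Stress minimisation: compare two runs without setting any seed
st_1 <- graphlayouts::layout_with_stress(g)
st_2 <- graphlayouts::layout_with_stress(g)
isTRUE(all.equal(st_1, st_2))
```

If a force-directed layout such as `"fr"` is needed in a report, calling `set.seed()` beforehand is one way to keep the figure stable between runs.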
Other force-directed layouts available include:
- Simulated annealing (Davidson and Harel 1993): "dh"
- Graph embedder (Frick et al. 1995): "gem"
- Graphopt (Schmuhl): "graphopt"
- Distributed recursive graph layout (Martin et al. 2008): "drl"
Layered layouts
Layered layouts arrange nodes into horizontal (or vertical) layers, positioning them so that they reduce crossings. These layouts are best suited for directed acyclic graphs or similar.
graphr(ison_southern_women, layout = "bipartite") + ggtitle("Bipartite")
graphr(ison_southern_women, layout = "hierarchy") + ggtitle("Hierarchy")
graphr(ison_southern_women, layout = "railway") + ggtitle("Railway")
Note that "hierarchy" and "railway" use a
different algorithm to {igraph}’s "bipartite",
and generally perform better, especially where there are multiple
layers. Whereas "hierarchy" tries to position nodes to
minimise overlaps, "railway" sequences the nodes in each
layer to a grid so that nodes are matched as far as possible. If you
want to flip the horizontal and vertical, you could flip the
coordinates, or use something like the following layout.
graphr(ison_southern_women, layout = "alluvial") + ggtitle("Alluvial")
Other layered layouts include:
- Tree: "tree"
- Dominance layouts
Circular layouts
Circular layouts arrange nodes around (potentially concentric) circles, such that crossings are minimised and adjacent nodes are located close together. In some cases, location or layer can be specified by attribute or mode.
graphr(ison_southern_women, layout = "concentric") + ggtitle("Concentric")
Other such layouts include:
- circular: "circle"
- sphere: "sphere"
- star: "star"
- arc or linear layouts: "linear"
Spectral layouts
Spectral layouts arrange nodes according to the eigenvectors of the Laplacian matrix of a graph. These layouts tend to exaggerate clustering of like-nodes and the separation of less similar nodes in two-dimensional space.
graphr(ison_southern_women, layout = "eigen") + ggtitle("Eigenvector")
Somewhat similar are multidimensional scaling techniques, which visualise the similarity between nodes in terms of their proximity in a two-dimensional (or higher-dimensional) space.
graphr(ison_southern_women, layout = "mds") + ggtitle("Multidimensional Scaling")
Other such layouts include:
- Pivot multidimensional scaling: "pmds"
Grid layouts
Grid layouts arrange nodes based on some Cartesian coordinates. These can be useful for making sure all nodes’ labels are visible, but horizontal and vertical lines can overlap, making it difficult to distinguish whether some nodes are tied or not.
graphr(ison_southern_women, layout = "grid") + ggtitle("Grid")
Other grid layouts include:
- orthogonal layouts for e.g. printed circuit boards
- grid snapping for other layouts
Multiple graphs
Arrangements
{autograph} uses the {patchwork} package
for arranging graphs together, e.g. side-by-side or above one another.
The syntax is quite straightforward and is used throughout these
vignettes/tutorials. Basically, you just use + to put
graphs side-by-side, and / to put them above one another.
Parentheses can be used to group graphs together.
graphr(fict_lotr) + graphr(ison_algebra)
graphr(fict_lotr) / graphr(ison_algebra)
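As a sketch of how parentheses group graphs, the following puts two graphs side-by-side on a top row and a third graph underneath them. It assumes that {manynet} supplies the three datasets, all of which appear elsewhere in this tutorial, and it loads {patchwork} explicitly so that the operators are available. The name `combined` is illustrative.

```r
library(manynet)    # assumed to provide the datasets
library(autograph)  # assumed to provide graphr()
library(patchwork)  # provides the + and / operators for arranging plots

# Top row: two graphs side-by-side; bottom row: one graph spanning the width
combined <- (graphr(fict_lotr) + graphr(ison_algebra)) /
  graphr(ison_southern_women)
combined
```

In this sketch the parentheses only make the intended grouping explicit.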
Sets
graphr() is not the only graphing function included in
{autograph}. To graph sets of networks together,
graphs() makes sure that two or more networks are plotted
together. This might be a set of ego networks, subgraphs, or waves of a
longitudinal network.
graphs(to_subgraphs(fict_lotr, "Race"),
waves = c(1,2,3,4))
What is happening here is that to_subgraphs() is
creating a list of subgraphs and then graphs() is plotting
them together at once with the same set of aesthetic parameters.
Dynamics
grapht() is another alternative to
graphr(), this time rendering network changes as a gif.
While the grapht() function is not as flexible as
graphr(), it is very useful for visualising changes in
networks over time.
fict_lotr %>%
  mutate_ties(year = sample(1:12, 66, replace = TRUE)) %>%
  to_waves(attribute = "year", cumulative = TRUE) %>%
  grapht()
More functionality will be added to this function in future releases.
Further flexibility
For more flexibility with visualisations, {autograph}
users are encouraged to use the excellent {ggraph} package.
{ggraph} is built upon the venerable {ggplot2}
package and works with tbl_graph and igraph
objects. As with {ggplot2}, {ggraph} users are
expected to build a particular plot from the ground up, adding explicit
layers to visualise the nodes and edges.
library(ggraph)
ggraph(fict_greys, layout = "fr") +
  geom_edge_link(edge_colour = "dark grey",
                 arrow = arrow(angle = 45,
                               length = unit(2, "mm"),
                               type = "closed"),
                 end_cap = circle(3, "mm")) +
  geom_node_point(size = 2.5, shape = 19, colour = "blue") +
  geom_node_text(aes(label = name), family = "serif", size = 2.5) +
  scale_edge_width(range = c(0.3, 1.5)) +
  theme_graph() +
  theme(legend.position = "none")
As we can see in the code above, we can specify various aspects of the plot to tailor it to our network.
First, we can alter the layout of the network using
the layout = argument to create a clearer visualisation of
the ties between nodes. This is especially important for larger
networks, where nodes and ties are more easily obscured or
misrepresented. In {ggraph}, the default layout is the
“stress” layout. The “stress” layout is a safe choice because it is
deterministic and fits well with almost any graph, but it is also a good
idea to explore and try out other layouts on your data. More layouts can
be found in the {graphlayouts} and {igraph} R
packages. To use a layout from the {igraph} package, enter
only the last part of the layout algorithm name (e.g.
layout = "mds" for “layout_with_mds”).
Second, using geom_node_point() which draws the nodes as
geometric shapes (circles, squares, or triangles), we can specify the
presentation of nodes in the network in terms of their
shape (shape=, choose from 0 to 25), size
(size=), or colour (colour=). We can
also use aes() to match to node attributes. To add labels,
use geom_node_text() or geom_node_label()
(draws labels within a box). The font (family=), font size
(size=), and colour (colour=) of the labels
can be specified.
Third, we can also specify the presentation of edges
in the network. To draw edges, we use geom_edge_link0() or
geom_edge_link(). Using the latter function makes it
possible to draw a straight line with a gradient. The following features
can be tailored either globally or matched to specific edge attributes
using aes():
- colour: edge_colour=
- width: edge_width=
- linetype: edge_linetype=
- opacity: edge_alpha=
For directed graphs, arrows can be drawn using the
arrow= argument and the arrow() function (from {grid}, re-exported by
{ggplot2}). The angle, length, and arrowhead type can be specified there, and the padding
between the arrowhead and the node can be set with end_cap=.
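The following sketch shows node and edge features being matched to attributes with `aes()` rather than set globally. It uses the Zachary karate club graph bundled with {igraph} instead of the tutorial's data, and the attribute names `deg` and `weight` are hypothetical. The weights in particular are made up purely for illustration.

```r
library(ggraph)  # also attaches ggplot2

g <- igraph::make_graph("Zachary")
# Hypothetical attributes: node degree, and made-up tie weights of 1, 2 or 3
g <- igraph::set_vertex_attr(g, "deg", value = igraph::degree(g))
g <- igraph::set_edge_attr(g, "weight",
                           value = seq_len(igraph::ecount(g)) %% 3 + 1)

p <- ggraph(g, layout = "stress") +
  # edge width is mapped to the weight attribute; colour and opacity are global
  geom_edge_link0(aes(edge_width = weight),
                  edge_colour = "grey60", edge_alpha = 0.6) +
  # node size is mapped to degree; shape and colours are global
  geom_node_point(aes(size = deg), shape = 21,
                  fill = "steelblue", colour = "black") +
  scale_edge_width(range = c(0.3, 1.5)) +
  theme_void()
p
```

Anything placed inside `aes()` varies with the data and gets a legend, whereas anything outside it applies to every node or edge.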
For more see David Schoch’s excellent resources on this.
Plotting
While researchers will probably want to start with using
graphr() to visualise the network, {autograph}
also offers plot() methods for a number of different
network-related objects. These include measures of centrality,
cohesion, and clustering, as well as goodness-of-fit plots for network
models from packages such as {RSiena} and
{MoNAn}. Usefully, all these plots use the same theming
system as graphr(), so that you can set a theme once and
have it apply to all your graphs and plots. Let’s try this now with a
few examples.
stocnet_theme("default")
plot(node_degree(fict_lotr)) +
plot(node_closeness(fict_lotr))
stocnet_theme("oxf")
plot(node_degree(fict_lotr)) +
plot(node_closeness(fict_lotr))
This is a very simple example, but the same principle applies to all
plots in {autograph}. One can set a theme once and have it
apply to all plots. You can also always add additional
{ggplot2} layers to any plot to further customise it. For
example, it is straightforward to add titles and labels to these plots,
but it is also possible to add trend lines, confidence intervals, and so
on. The user is encouraged to explore the {ggplot2} package
for more details.
Exporting plots to PDF
We can print the plots we have made to PDF by point-and-click by selecting ‘Save as PDF…’ from under the ‘Export’ dropdown menu in the plots panel tab of RStudio.
If you want to do this programmatically, say because you want to record how you have saved it so that you can e.g. make some changes to the parameters at some point, this is also not too difficult.
After running the (gg-based) plot you want to save, use the command
ggsave("my_filename.pdf") to save your plot as a PDF to
your working directory. If you want to save it somewhere else, you will
need to specify the file path (or change the working directory, but that
might be more cumbersome). If you want to save it as a different
filetype, replace .pdf with e.g. .png or
.jpeg. See ?ggsave for more.
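A short sketch of the programmatic route follows, using a plain {ggplot2} plot of the built-in `mtcars` data as a stand-in for any gg-based plot. The temporary directory and the dimensions are illustrative choices, not recommendations from this tutorial.

```r
library(ggplot2)

# Any gg-based plot object can be saved in the same way
p <- ggplot(mtcars, aes(x = hp, y = mpg)) +
  geom_point()

# Build a full file path so the plot is not saved to the working directory
out_file <- file.path(tempdir(), "my_filename.pdf")

# Passing the plot explicitly avoids relying on the last plot displayed;
# width and height are in inches by default
ggsave(out_file, plot = p, width = 6, height = 4)

file.exists(out_file)
```

Naming the plot explicitly with `plot = p` helps when a script produces several figures, since `ggsave()` otherwise saves whichever plot was displayed last.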