
33 Alluvial charts
33.1 Description
An alluvial graph displays the changes in composition or flow over time or across multiple categories.
We can build alluvial charts in ggplot2 with the ggalluvial package.
See also: parallel sets
33.2 Set up
PACKAGES:
Install packages.
# pak::pak("corybrunson/ggalluvial")
library(ggalluvial)
# install.packages("palmerpenguins")
library(palmerpenguins)
library(ggplot2)
DATA:

Below we create a wide example of the penguins data (as peng_wide).
peng_wide <- penguins |> 
  tidyr::drop_na() |> 
  dplyr::count(year, island, sex, species) |> 
  dplyr::mutate(year = factor(year)) |> 
  dplyr::rename(freq = n)
dplyr::glimpse(peng_wide)
#> Rows: 30
#> Columns: 5
#> $ year    <fct> 2007, 2007, 2007, 2007, 2007, 20…
#> $ island  <fct> Biscoe, Biscoe, Biscoe, Biscoe, …
#> $ sex     <fct> female, female, male, male, fema…
#> $ species <fct> Adelie, Gentoo, Adelie, Gentoo, …
#> $ freq    <int> 5, 16, 5, 17, 9, 13, 10, 13, 8, …
33.3 Grammar
CODE:
- Create labels with ggtitle(), ylab(), and labs()
- Map year, island, and species to axis1, axis2, and axis3, and freq to y
- Add scale_x_discrete() with limits set to "Year", "Island", and "Species", and expand set to c(0.1, 0.07)
- Add geom_alluvium() with fill mapped to the sex variable, then geom_stratum()
- Add geom_text() with stat set to "stratum" and label set to after_stat(stratum) (inside aes())
labs_alluvial <- ggtitle(label = "Palmer Penguins", 
    subtitle = "Stratified by year, island and species")
labs_alluvial_y <- ylab("Frequency") 
labs_alluvial_fill <- labs(fill = "Sex")
ggp2_alluvial_w <- ggplot(data = peng_wide,
  aes(axis1 = year, axis2 = island,
      axis3 = species, y = freq)) +
  scale_x_discrete(
    limits = c("Year", "Island", "Species"),
    expand = c(0.1, 0.07)) +
  geom_alluvium(aes(fill = sex)) +
  geom_stratum() +
  geom_text(stat = "stratum", 
    aes(label = after_stat(stratum)),
      size = 3)
ggp2_alluvial_w + 
  theme(legend.position = "bottom") +
  labs_alluvial + 
  labs_alluvial_y + 
  labs_alluvial_fill
GRAPH:
The ggalluvial functions can handle wide or long data.
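As a rough illustration of what "wide" means here, the sketch below builds a small, hypothetical table (toy_wide is not part of the chapter's data) with one row per combination of axis values plus a count column, and checks it with is_alluvia_form() from ggalluvial. The values are made up for the example.

```r
library(ggalluvial)

# Hypothetical toy table in wide (alluvia) form: one row per combination
# of axis values, plus a count column. Values are invented for illustration.
toy_wide <- data.frame(
  year   = factor(c("2007", "2007", "2008", "2008")),
  island = factor(c("Biscoe", "Dream", "Biscoe", "Dream")),
  freq   = c(5L, 9L, 7L, 3L)
)

# Check that the first two columns can serve as axes of a wide layout;
# silent = TRUE suppresses the informational messages.
wide_ok <- is_alluvia_form(toy_wide, axes = 1:2, silent = TRUE)
wide_ok
```

The same kind of check could be run on peng_wide with axes = 1:4 before plotting.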

33.4 More info
The ggalluvial package can also help reshape data with the to_lodes_form() function.
33.4.1 to_lodes_form()
Below we create peng_lodes from the penguins dataset using to_lodes_form() from the ggalluvial package.
peng_lodes <- penguins |> 
  dplyr::select(Year = year, Island = island, 
         Species = species, Sex = sex) |> 
  tidyr::drop_na() |> 
  dplyr::count(Year, Island, Species, Sex) |> 
  dplyr::mutate(Year = factor(Year)) |> 
  dplyr::rename(Freqency = n) |> 
  ggalluvial::to_lodes_form(key = "Measure", axes = 1:3) 
dplyr::glimpse(peng_lodes)
#> Rows: 90
#> Columns: 5
#> $ Sex      <fct> female, male, female, male, fem…
#> $ Freqency <int> 5, 5, 16, 17, 9, 10, 13, 13, 8,…
#> $ alluvium <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, …
#> $ Measure  <fct> Year, Year, Year, Year, Year, Y…
#> $ stratum  <fct> 2007, 2007, 2007, 2007, 2007, 2…
- Create labels with ggtitle()
- Map Measure to x, Freqency (the count column, as spelled in peng_lodes) to y, stratum to stratum, alluvium to alluvium, and stratum to label
- Add geom_alluvium() and map Sex to fill
- Add geom_stratum() and set the width to 0.45
- Add geom_text() and set stat to "stratum"
labs_alluvial <- ggtitle(label = "Palmer Penguins", 
    subtitle = "Stratified by year, island and species")
ggp2_alluvial_lf <- ggplot(
    data = peng_lodes,
    aes(x = Measure,
        y = Freqency,
        stratum = stratum,
        alluvium = alluvium,
        label = stratum)) +
    ggalluvial::geom_alluvium(aes(fill = Sex)) +
    ggalluvial::geom_stratum(width = 0.45) +
    geom_text(stat = "stratum", size = 2.5)
ggp2_alluvial_lf +
    labs_alluvial +
    theme_ggp2g(base_size = 13)
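To see what the reshaping step does on a smaller scale, here is a minimal sketch using a hypothetical two-row table (toy_wide and toy_lodes are illustrative names, and the counts are made up). It mirrors the call above: each input row becomes one alluvium, and each alluvium gets one row per axis, which is the same pattern that turns the 30 rows of counts into the 90 rows of peng_lodes.

```r
library(ggalluvial)

# Hypothetical wide table: two axis columns and a count column
toy_wide <- data.frame(
  Year   = factor(c("2007", "2008")),
  Island = factor(c("Biscoe", "Dream")),
  n      = c(5L, 9L)
)

# Reshape to long (lodes) form: the axis names go into the key column
# ("Measure"), their values into "stratum", and the row id into "alluvium"
toy_lodes <- to_lodes_form(toy_wide, key = "Measure", axes = 1:2)

nrow(toy_lodes)   # one row per (alluvium, axis) pair
names(toy_lodes)

# Confirm the result is recognised as lodes form
lodes_ok <- is_lodes_form(toy_lodes, key = Measure, value = stratum,
                          id = alluvium, silent = TRUE)
lodes_ok
```

Columns that are not used as axes (here n) are carried along unchanged, which is why Sex and the count column survive in peng_lodes.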
33.4.2 geom_flow()
If you’d like to arrange a date or time variable along the x-axis, you can use ggalluvial::geom_flow() together with ggalluvial::geom_stratum().
- First create peng_alluvial, a subset of palmerpenguins::penguins_raw with all variables converted to factors.
peng_alluvial <- palmerpenguins::penguins_raw |> 
  janitor::clean_names() |> 
  dplyr::mutate(year = lubridate::year(date_egg),
         year = factor(year),
         individual_id = factor(individual_id),
         island = factor(island)) |> 
  dplyr::select(year, individual_id, island)
dplyr::glimpse(peng_alluvial)
#> Rows: 344
#> Columns: 3
#> $ year          <fct> 2007, 2007, 2007, 2007, 20…
#> $ individual_id <fct> N1A1, N1A2, N2A1, N2A2, N3…
#> $ island        <fct> Torgersen, Torgersen, Torg…
- Create labels with labs()
- Initiate the graph with data
- Map year to x, island to stratum, individual_id to alluvium, island to fill, and island to label
- Add scale_fill_brewer(), set type to "qual", and choose a palette
- Add geom_flow() with stat set to "alluvium", lode.guidance set to "frontback", and color set to "#A9A9A9"
# labels
labs_alluvial <- labs(
  title = "Penguin measurements across three years")
# add geom_flow() 
ggp2_alluvial_flow <- ggplot(data = peng_alluvial,
  mapping = aes(x = year, stratum = island, 
    alluvium = individual_id, 
    fill = island, label = island)) +
  scale_fill_brewer(type = "qual", palette = "Pastel2") +
  geom_flow(stat = "alluvium",
    lode.guidance = "frontback",
    color = "#A9A9A9")
ggp2_alluvial_flow
- Add ggalluvial::geom_stratum()
# add geom_stratum()
ggp2_alluvial_stratum <- ggp2_alluvial_flow +
  geom_stratum() 
ggp2_alluvial_stratum
33.4.3 legend.position
Move the legend to the bottom with theme(legend.position = "bottom").
ggp2_alluvial_stratum + 
  labs_alluvial + 
  theme(legend.position = "bottom")