Grouped violin plots

Graph info

Should I use this graph?


This graph requires:

✅ a categorical variable

✅ a numeric (continuous) variable

Description

A ‘violin plot’ is a variation of a density or ridgeline plot in which the density estimate is mirrored around a central axis, creating a two-sided, smoothed view of the distribution.
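To make the mirroring concrete, here is a minimal sketch that builds a single violin by hand. It estimates a density with base R's density() and reflects it around zero. The object names (bill, dens, violin_shape) are illustrative only, and in practice geom_violin() does this work for us.

```r
library(palmerpenguins)
library(ggplot2)

# bill lengths with missing values removed
bill <- penguins$bill_length_mm[!is.na(penguins$bill_length_mm)]

# kernel density estimate of the bill lengths
dens <- density(bill)

# mirror the density around zero to trace the violin outline:
# up the right-hand side, then back down the left-hand side
violin_shape <- data.frame(
  width = c(dens$y, -rev(dens$y)),
  bill_length_mm = c(dens$x, rev(dens$x))
)

# draw the outline as a filled polygon
ggplot(violin_shape, aes(x = width, y = bill_length_mm)) +
  geom_polygon(alpha = 2/3)
```

The left and right halves carry the same information, so the width of the violin at any bill length is proportional to the estimated density there.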

Getting set up

PACKAGES:

Install and load packages.

Code
install.packages("palmerpenguins")
library(palmerpenguins)
library(ggplot2)
library(dplyr) # provides filter() and glimpse()

DATA:

Artwork by @allison_horst

Remove rows with a missing island from penguins

Code
peng_violin <- filter(penguins, !is.na(island))
glimpse(peng_violin)
Rows: 344
Columns: 8
$ species           <fct> Adelie, Adelie, Adelie, Adelie, Adelie, Adelie, Adel…
$ island            <fct> Torgersen, Torgersen, Torgersen, Torgersen, Torgerse…
$ bill_length_mm    <dbl> 39.1, 39.5, 40.3, NA, 36.7, 39.3, 38.9, 39.2, 34.1, …
$ bill_depth_mm     <dbl> 18.7, 17.4, 18.0, NA, 19.3, 20.6, 17.8, 19.6, 18.1, …
$ flipper_length_mm <int> 181, 186, 195, NA, 193, 190, 181, 195, 193, 190, 186…
$ body_mass_g       <int> 3750, 3800, 3250, NA, 3450, 3650, 3625, 4675, 3475, …
$ sex               <fct> male, female, female, NA, female, male, female, male…
$ year              <int> 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007…
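The glimpse() output shows that all 344 rows remain after the filter, and that bill_length_mm still contains NA values. As a hedged sketch, we can count the missing values in the two plotted variables. The peng_violin_complete object is a hypothetical alternative that also drops rows with a missing bill length, since ggplot2 would otherwise drop those rows itself with a warning.

```r
library(palmerpenguins)
library(dplyr)

# count missing values in the two plotted variables
missing_counts <- summarise(
  penguins,
  missing_island = sum(is.na(island)),
  missing_bill_length = sum(is.na(bill_length_mm))
)
missing_counts

# hypothetical alternative: filter on both plotted variables
peng_violin_complete <- filter(
  penguins,
  !is.na(island),
  !is.na(bill_length_mm)
)
```

The rest of this section keeps using peng_violin as defined above.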

CODE:

Create labels with labs()

Initialize the graph with ggplot() and provide data

Map island to the x, bill_length_mm to the y, and island to fill

Set alpha to 2/3

Remove the legend with show.legend = FALSE

Code
# title, subtitle, and axis labels
labs_grp_violin <- labs(
  title = "Adult foraging penguins",
  subtitle = "Palmer Archipelago, Antarctica",
  x = "Island", fill = "Island",
  y = "Bill length (millimeters)")
# one violin per island, filled by island
ggp2_grp_violin <- ggplot(data = peng_violin,
       aes(x = island,
           y = bill_length_mm,
           fill = island)) +
  geom_violin(alpha = 2/3,
              show.legend = FALSE)
# add the labels to the graph
ggp2_grp_violin +
  labs_grp_violin

GRAPH:

Violin plots let us compare the ‘center’ and ‘spread’ of a continuous variable across categorical groups.
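To put numbers next to what the violins show, the sketch below computes one measure of center (the median) and one of spread (the interquartile range) for each island. The choice of these two statistics and the name bill_summary are assumptions made for illustration.

```r
library(palmerpenguins)
library(dplyr)

peng_violin <- filter(penguins, !is.na(island))

# median (center) and interquartile range (spread) of bill length per island
bill_summary <- summarise(
  group_by(peng_violin, island),
  median_bill = median(bill_length_mm, na.rm = TRUE),
  iqr_bill = IQR(bill_length_mm, na.rm = TRUE)
)
bill_summary
```

An island with a higher median should have a violin that sits higher on the y axis, and an island with a larger interquartile range should have a taller, more stretched-out violin.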


More info

Change the style and thickness of the violin outline with linetype and linewidth.

draw_quantiles:

We can include lines for the 25th, 50th, and 75th percentiles (the three quartiles) using the draw_quantiles argument.

Code
ggplot(data = peng_violin,
       aes(x = island,
           y = bill_length_mm,
           fill = island)) +
  geom_violin(
    # lines at the 25th, 50th, and 75th percentiles
    draw_quantiles = c(0.25, 0.5, 0.75),
    alpha = 1/2,
    linewidth = 0.5,
    show.legend = FALSE)
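Another way to show the quartiles is to overlay a narrow box plot on each violin. This is a sketch of that alternative rather than part of the original graph: the width of 0.1, the white fill, and the name ggp2_violin_box are arbitrary choices, and the extra filter on bill_length_mm is added here only to avoid missing-value warnings.

```r
library(palmerpenguins)
library(ggplot2)
library(dplyr)

# drop rows with a missing island or bill length
peng_violin <- filter(penguins, !is.na(island), !is.na(bill_length_mm))

ggp2_violin_box <- ggplot(data = peng_violin,
       aes(x = island,
           y = bill_length_mm,
           fill = island)) +
  geom_violin(alpha = 1/2,
              show.legend = FALSE) +
  # narrow box plot: box edges mark the 25th and 75th percentiles,
  # and the middle line marks the median
  geom_boxplot(width = 0.1,
               fill = "white",
               show.legend = FALSE)
ggp2_violin_box
```

The box plot summarizes the raw data directly, so it can serve as a cross-check on the lines drawn inside the violins.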

kernel:

The kernel argument lets us change the kernel used in the kernel density estimate that creates the violin shape. The possible kernels are "gaussian", "epanechnikov", "rectangular", "triangular", "biweight", "cosine", and "optcosine".

Code
ggplot(data = peng_violin,
       aes(x = island,
           y = bill_length_mm,
           fill = island)) +
  geom_violin(alpha = 1/2,
              linewidth = 0.5,
              kernel = "rectangular",
              show.legend = FALSE)
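To see what a kernel choice does before it is mirrored into a violin, the sketch below compares two kernels with base R's density(), which accepts the same kernel names. It assumes geom_violin() hands the kernel argument on to a density estimate of this kind. The names dens_gaussian, dens_rectangular, and kernel_curves are illustrative.

```r
library(palmerpenguins)
library(ggplot2)

# bill lengths with missing values removed
bill <- penguins$bill_length_mm[!is.na(penguins$bill_length_mm)]

# density estimates that differ only in the kernel
dens_gaussian <- density(bill, kernel = "gaussian")
dens_rectangular <- density(bill, kernel = "rectangular")

# stack both curves in one data frame for plotting
kernel_curves <- rbind(
  data.frame(kernel = "gaussian",
             bill_length_mm = dens_gaussian$x,
             density = dens_gaussian$y),
  data.frame(kernel = "rectangular",
             bill_length_mm = dens_rectangular$x,
             density = dens_rectangular$y)
)

# one line per kernel
ggplot(kernel_curves,
       aes(x = bill_length_mm, y = density, linetype = kernel)) +
  geom_line()
```

Both estimates use the same bandwidth here, so any difference between the two curves comes from the kernel alone.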

bw:

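We can directly adjust the shape of the violin with the bw argument, which is the standard deviation of the smoothing kernel. Smaller values follow the data more closely, and larger values give smoother violins. The trim argument trims the tails of the violins to the range of the data (trim = TRUE is the default).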

Code
# bw of 0.5
grp_violin_bw0p5 <- ggplot(data = peng_violin,
       aes(x = island,
           y = bill_length_mm,
           fill = island)) +
  geom_violin(bw = 0.5,
              alpha = 2/3,
              trim = TRUE,
              show.legend = FALSE)
grp_violin_bw0p5 +
  labs_grp_violin +
  labs(caption = "bw = 0.5")
# bw of 4.5
grp_violin_bw4p5 <- ggplot(data = peng_violin,
       aes(x = island,
           y = bill_length_mm,
           fill = island)) +
  geom_violin(bw = 4.5,
              alpha = 2/3,
              trim = TRUE,
              show.legend = FALSE)
grp_violin_bw4p5 +
  labs_grp_violin +
  labs(caption = "bw = 4.5")
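
To judge whether 0.5 and 4.5 are small or large for these data, it helps to compare them with a rule-of-thumb bandwidth. The sketch below computes base R's bw.nrd0() for each island, on the assumption that geom_violin() falls back to this "nrd0" rule when bw is not set. The names bill_by_island and default_bw are illustrative.

```r
library(palmerpenguins)

# bill lengths split by island, with missing values dropped
bill_by_island <- split(penguins$bill_length_mm, penguins$island)
bill_by_island <- lapply(bill_by_island, function(x) x[!is.na(x)])

# rule-of-thumb bandwidth ("nrd0") for each island
default_bw <- sapply(bill_by_island, bw.nrd0)
default_bw
```

A bw well below these values should give bumpier violins than the default, and a bw well above them should give smoother ones.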