Cleveland dot plot
Description
A Cleveland dot plot displays differences in a numerical variable across the levels of a categorical variable.
Typically, the graph contains two points per category, differentiated by color, that mark the numerical values on the x axis. A line connecting the two points represents the difference between the two groups being compared (the length of the line is the size of the difference).
Getting set up
PACKAGES:
Install packages.
Code
install.packages("palmerpenguins")
library(palmerpenguins)
library(ggplot2)
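The code later on this page also calls functions through the dplyr::, tidyr::, and forcats:: prefixes, so those packages need to be installed even though they are not attached here. A minimal sketch for checking this, assuming nothing beyond base R (the names needed and not_installed are illustrative, not part of the original page):

```r
# Packages referenced on this page, attached or called with pkg::
needed <- c("palmerpenguins", "ggplot2", "dplyr", "tidyr", "forcats")

# requireNamespace() returns TRUE when a package can be loaded
is_available <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
not_installed <- needed[!is_available]

# Report anything that still needs install.packages()
if (length(not_installed) > 0) {
  message("Not installed: ", paste(not_installed, collapse = ", "))
}
```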
DATA:
Remove missing values from sex and flipper_length_mm, then group on sex and island to calculate the median flipper length (med_flip_length_mm).
Code
peng_clev_dots <- palmerpenguins::penguins |>
  dplyr::filter(!is.na(sex) & !is.na(flipper_length_mm)) |>
  dplyr::group_by(sex, island) |>
  dplyr::summarise(
    med_flip_length_mm = median(flipper_length_mm)
  ) |>
  dplyr::ungroup()
dplyr::glimpse(peng_clev_dots)
Rows: 6
Columns: 3
$ sex <fct> female, female, female, male, male, male
$ island <fct> Biscoe, Dream, Torgersen, Biscoe, Dream, Torgersen
$ med_flip_length_mm <dbl> 210, 190, 189, 219, 196, 195
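The length of each connecting line in the plot corresponds to the gap between the male and female medians on an island. As a rough sketch, that gap can be computed directly from the six values printed above. The data frame below is typed in by hand from that output, and the names peng_medians, flip_gaps, and diff_mm are illustrative only:

```r
# Values copied from the glimpse() output above
peng_medians <- data.frame(
  sex = c("female", "female", "female", "male", "male", "male"),
  island = c("Biscoe", "Dream", "Torgersen", "Biscoe", "Dream", "Torgersen"),
  med_flip_length_mm = c(210, 190, 189, 219, 196, 195)
)

# Put the female and male medians side by side, one row per island
female <- subset(peng_medians, sex == "female")
male <- subset(peng_medians, sex == "male")
flip_gaps <- merge(female, male, by = "island",
                   suffixes = c("_female", "_male"))

# Male minus female median flipper length (mm): the length of each line
flip_gaps$diff_mm <- flip_gaps$med_flip_length_mm_male -
  flip_gaps$med_flip_length_mm_female
flip_gaps[, c("island", "diff_mm")]
# Biscoe: 9, Dream: 6, Torgersen: 6
```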
The grammar
CODE:
Create labels with labs()
Initialize the graph with ggplot() and provide data
Map med_flip_length_mm to the x axis and island to the y axis, wrapping island in forcats::fct_rev()
Add geom_line() and map island to the group aesthetic. Set the linewidth to 0.75
Add geom_point() and map sex to the color aesthetic. Set the size to 2.25
Code
labs_clev_dots <- labs(
  title = "Flipper Length Differences",
  subtitle = "Male and female penguins",
  x = "Median Flipper Length",
  y = "Island",
  color = "Sex")
# fct_rev() is namespaced because forcats is not attached above
ggp2_clev_dots <- ggplot(data = peng_clev_dots,
  mapping = aes(x = med_flip_length_mm,
    y = forcats::fct_rev(island))) +
  geom_line(aes(group = island),
    linewidth = 0.75) +
  geom_point(aes(color = sex),
    size = 2.25)
ggp2_clev_dots +
  labs_clev_dots
GRAPH:
More info
Cleveland dot plots are also helpful when comparing multiple differences on a common scale.
SCALE:
Remove missing values from sex, bill_length_mm, and bill_depth_mm, then group on sex and island to calculate the median bill length and median bill depth. These variables need ‘showtime-ready’ names because they’ll be used as the facet labels. After un-grouping the data, pivot the new columns into a long (tidy) format, with median_measure containing the name of the variable and median_value containing the numbers.
Finally, convert median_measure into a factor.
Code
peng_clev_dots2 <- palmerpenguins::penguins |>
  dplyr::filter(!is.na(sex) &
    !is.na(bill_length_mm) &
    !is.na(bill_depth_mm)) |>
  dplyr::group_by(sex, island) |>
  dplyr::summarise(
    `Median Bill Length` = median(bill_length_mm),
    `Median Bill Depth` = median(bill_depth_mm)) |>
  dplyr::ungroup() |>
  tidyr::pivot_longer(cols = starts_with("Med"),
    names_to = "median_measure",
    values_to = "median_value") |>
  dplyr::mutate(median_measure = factor(median_measure))
dplyr::glimpse(peng_clev_dots2)
Rows: 12
Columns: 4
$ sex <fct> female, female, female, female, female, female, male, m…
$ island <fct> Biscoe, Biscoe, Dream, Dream, Torgersen, Torgersen, Bis…
$ median_measure <fct> Median Bill Length, Median Bill Depth, Median Bill Leng…
$ median_value <dbl> 44.90, 14.50, 42.50, 17.80, 37.60, 17.45, 48.50, 16.00,…
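To make the reshaping step concrete, here is a small sketch that applies the same pivot to just the two female rows for Biscoe and Dream, with values typed in from the output above. The names wide and long are illustrative, and the tidyr package is assumed to be installed:

```r
# Two summarised rows, copied from the output above (wide format)
wide <- data.frame(
  sex = c("female", "female"),
  island = c("Biscoe", "Dream"),
  `Median Bill Length` = c(44.90, 42.50),
  `Median Bill Depth` = c(14.50, 17.80),
  check.names = FALSE  # keep the spaces in the column names
)

# Each "Median ..." column becomes one row per sex/island pair
long <- tidyr::pivot_longer(wide,
  cols = starts_with("Med"),
  names_to = "median_measure",
  values_to = "median_value")

# factor() sorts levels alphabetically, which sets the facet order
long$median_measure <- factor(long$median_measure)
levels(long$median_measure)
# "Median Bill Depth"  "Median Bill Length"
```

Because the levels are sorted alphabetically, the depth panel would come before the length panel when faceting on median_measure.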
scales:
Re-create labels
Initialize the graph with ggplot() and provide data
Map median_value to the x axis and island to the y axis, wrapping island in forcats::fct_rev()
Add geom_line() and map island to the group aesthetic. Set the linewidth to 0.55
Add geom_point() and map sex to the color aesthetic. Set the size to 2
Add facet_wrap() and facet by median_measure, setting shrink to TRUE and nrow to 2 (the scales argument is left at its default, so both panels share one x axis)
Move the legend with theme(legend.position = "top")
Code
labs_clev_dots2 <- labs(
  title = "Penguin Measurements Differences",
  subtitle = "Male and female penguins",
  x = "Median Bill Length/Depth (mm)",
  y = "Island",
  color = "Sex")
# fct_rev() is namespaced because forcats is not attached above
ggp2_clev_dots2 <- ggplot(data = peng_clev_dots2,
  mapping = aes(x = median_value,
    y = forcats::fct_rev(island))) +
  geom_line(aes(group = island),
    linewidth = 0.55) +
  geom_point(aes(color = sex),
    size = 2) +
  facet_wrap(. ~ median_measure,
    shrink = TRUE, nrow = 2) +
  theme(legend.position = "top")
ggp2_clev_dots2 +
  labs_clev_dots2
CAUTION when using scales = "free_x": The graph below shows that the median bill length and depth are larger for male penguins on all three islands, but the magnitude of the differences should be interpreted with caution. Each panel gets its own x axis, so the lengths of the lines can’t be compared directly across panels!
Code
labs_clev_dots2 <- labs(
  title = "Penguin Measurements Differences",
  subtitle = "Male and female penguins",
  x = "Median Bill Length/Depth (mm)",
  y = "Island",
  color = "Sex")
# fct_rev() is namespaced because forcats is not attached above
ggp2_clev_dots2_free_x <- ggplot(data = peng_clev_dots2,
  mapping = aes(x = median_value,
    y = forcats::fct_rev(island))) +
  geom_line(aes(group = island),
    linewidth = 0.55) +
  geom_point(aes(color = sex),
    size = 2) +
  facet_wrap(. ~ median_measure,
    shrink = TRUE, nrow = 2,
    scales = "free_x") +
  theme(legend.position = "top")
ggp2_clev_dots2_free_x +
  labs_clev_dots2
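When the panels have free x scales, one way to keep the magnitudes honest is to compute the male and female gaps as numbers rather than reading them off the line lengths. The sketch below is one possible approach using base R on the same penguins data. The names peng_complete, bill_medians, bill_gaps, length_gap_mm, and depth_gap_mm are illustrative and not part of the original code:

```r
library(palmerpenguins)

# Same filtering as above: drop rows missing sex or a bill measurement
peng_complete <- subset(penguins,
  !is.na(sex) & !is.na(bill_length_mm) & !is.na(bill_depth_mm))

# Median bill length and depth for each sex on each island
bill_medians <- aggregate(cbind(bill_length_mm, bill_depth_mm) ~ sex + island,
  data = peng_complete, FUN = median)

# Put the female and male medians side by side, one row per island
female <- subset(bill_medians, sex == "female")
male <- subset(bill_medians, sex == "male")
bill_gaps <- merge(female, male, by = "island",
  suffixes = c("_female", "_male"))

# Male minus female medians, in mm, comparable across measures
bill_gaps$length_gap_mm <- bill_gaps$bill_length_mm_male -
  bill_gaps$bill_length_mm_female
bill_gaps$depth_gap_mm <- bill_gaps$bill_depth_mm_male -
  bill_gaps$bill_depth_mm_female
bill_gaps[, c("island", "length_gap_mm", "depth_gap_mm")]
```

The two gap columns are in the same units, so they can be compared directly regardless of how each panel's axis was scaled.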