13 Cleveland dot plots
13.1 Description
Cleveland dot plots compare numeric values across categories using dots positioned along a common scale, and they are often a more compact alternative to bar graphs. The graph lists the categories on the side and shows the data with dots along a line.
Typically, each category on the y axis has two points marking numerical values on the x axis, differentiated by color. A line connecting the two points represents the difference between the two categorical levels (the length of the line is the size of the difference).
13.2 Set up
PACKAGES:
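Install and load packages. The code in this chapter also calls dplyr, tidyr, and forcats functions with the :: prefix, so those packages need to be installed as well.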
install.packages("palmerpenguins")
library(palmerpenguins)
library(ggplot2)
DATA:
Remove missing values from sex and flipper_length_mm, then group on sex and island to calculate the median flipper length (med_flip_length_mm).
peng_clev_dots <- palmerpenguins::penguins |>
  # keep rows where both sex and flipper length are known
  dplyr::filter(!is.na(sex) & !is.na(flipper_length_mm)) |>
  dplyr::group_by(sex, island) |>
  dplyr::summarise(
    med_flip_length_mm = median(flipper_length_mm)
  ) |>
  dplyr::ungroup()
#> `summarise()` has grouped output by 'sex'. You
#> can override using the `.groups` argument.
dplyr::glimpse(peng_clev_dots)
#> Rows: 6
#> Columns: 3
#> $ sex <fct> female, female, femal…
#> $ island <fct> Biscoe, Dream, Torger…
#> $ med_flip_length_mm <dbl> 210, 190, 189, 219, 1…
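The length of each line in a Cleveland dot plot is the gap between the two points in a category, so it can help to compute that gap directly. The following is a minimal sketch, not part of the original workflow: peng_diff and diff_mm are hypothetical names, and .groups = "drop" is used in place of a separate ungroup() call.

```r
# Recompute the medians, then spread sex into columns to get one row per island.
# peng_diff and diff_mm are illustrative names, not objects used elsewhere.
peng_diff <- palmerpenguins::penguins |>
  dplyr::filter(!is.na(sex) & !is.na(flipper_length_mm)) |>
  dplyr::group_by(sex, island) |>
  dplyr::summarise(
    med_flip_length_mm = median(flipper_length_mm),
    .groups = "drop" # returns an ungrouped result and silences the message
  ) |>
  tidyr::pivot_wider(names_from = sex,
    values_from = med_flip_length_mm) |>
  # the difference the connecting line represents
  dplyr::mutate(diff_mm = male - female)
peng_diff
```

Each row of peng_diff corresponds to one line in the graph, with diff_mm giving its length in millimeters.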
13.3 Grammar
CODE:
Create labels with labs()
Initialize the graph with ggplot() and provide data
Map med_flip_length_mm to the x axis, and island to the y axis, but wrap island in forcats::fct_rev().
Add geom_line(), and map island to the group aesthetic. Set the linewidth to 0.75
Add geom_point() and map sex to the color aesthetic. Set the size to 2.25
labs_clev_dots <- labs(
  title = "Flipper Length Differences",
  subtitle = "Male and female penguins",
  x = "Median Flipper Length",
  y = "Island",
  color = "Sex")
ggp2_clev_dots <- ggplot(data = peng_clev_dots,
       mapping = aes(x = med_flip_length_mm,
                     y = forcats::fct_rev(island))) +
    # one line per island, connecting the two sexes
    geom_line(aes(group = island),
        linewidth = 0.75) +
    geom_point(aes(color = sex),
        size = 2.25)
ggp2_clev_dots +
  labs_clev_dots
GRAPH:
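The reason for wrapping island in forcats::fct_rev() is that ggplot2 places the first factor level at the bottom of a discrete y axis, so reversing the levels puts the first island at the top. A small sketch of what fct_rev() does to the level order (island_fct is a hypothetical stand-in for the island column):

```r
# island_fct is an illustrative factor with the same three island names
island_fct <- factor(c("Biscoe", "Dream", "Torgersen"))

# default level order
levels(island_fct)

# reversed level order, as used on the y axis
levels(forcats::fct_rev(island_fct))
```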
13.4 More info
Cleveland dot plots are also helpful when comparing multiple differences on a common scale.
13.4.1 Common scale
SCALE:
Remove missing values from sex, bill_length_mm and bill_depth_mm, and group on sex and island to calculate the median bill length and median bill depth. These variables need to have ‘showtime-ready’ names because they’ll be used in our facets.
After un-grouping the data, pivot the new columns into a long (tidy) format with median_measure containing the name of the variable, and median_value containing the numbers.
Finally, convert median_measure into a factor.
peng_clev_dots2 <- palmerpenguins::penguins |>
  dplyr::filter(!is.na(sex) &
                  !is.na(bill_length_mm) &
                  !is.na(bill_depth_mm)) |>
  dplyr::group_by(sex, island) |>
  dplyr::summarise(
    `Median Bill Length` = median(bill_length_mm),
    `Median Bill Depth` = median(bill_depth_mm)) |>
  dplyr::ungroup() |>
  # stack the two median columns into name/value pairs
  tidyr::pivot_longer(cols = starts_with("Med"),
    names_to = "median_measure",
    values_to = "median_value") |>
  dplyr::mutate(median_measure = factor(median_measure))
#> `summarise()` has grouped output by 'sex'. You
#> can override using the `.groups` argument.
dplyr::glimpse(peng_clev_dots2)
#> Rows: 12
#> Columns: 4
#> $ sex <fct> female, female, female, f…
#> $ island <fct> Biscoe, Biscoe, Dream, Dr…
#> $ median_measure <fct> Median Bill Length, Media…
#> $ median_value <dbl> 44.90, 14.50, 42.50, 17.8…
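Converting median_measure into a factor matters because the factor levels determine the order of the facets. As a hedged sketch (peng_bill_long and measure_ordered are hypothetical names, and .groups = "drop" replaces the separate ungroup() step), the default alphabetical level order can be inspected and overridden like this:

```r
# Rebuild the long data without the final factor() conversion
peng_bill_long <- palmerpenguins::penguins |>
  dplyr::filter(!is.na(sex) & !is.na(bill_length_mm) &
                  !is.na(bill_depth_mm)) |>
  dplyr::group_by(sex, island) |>
  dplyr::summarise(
    `Median Bill Length` = median(bill_length_mm),
    `Median Bill Depth` = median(bill_depth_mm),
    .groups = "drop") |>
  tidyr::pivot_longer(cols = tidyr::starts_with("Med"),
    names_to = "median_measure",
    values_to = "median_value")

# factor() sorts levels alphabetically by default
levels(factor(peng_bill_long$median_measure))

# supplying levels explicitly would place bill length first
measure_ordered <- factor(peng_bill_long$median_measure,
  levels = c("Median Bill Length", "Median Bill Depth"))
levels(measure_ordered)
```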
13.4.2 Scales
scales:
Re-create labels
Initialize the graph with ggplot() and provide data
Map median_value to the x axis, and island to the y axis, but wrap island in forcats::fct_rev().
Add geom_line(), and map island to the group aesthetic. Set the linewidth to 0.55
Add geom_point() and map sex to the color aesthetic. Set the size to 2
Add facet_wrap() and facet by median_measure, setting shrink to TRUE and nrow to 2 (the second graph also sets scales to "free_x")
Move the legend with theme(legend.position = "top")
labs_clev_dots2 <- labs(
  title = "Penguin Measurements Differences",
  subtitle = "Male and female penguins",
  x = "Median Bill Length/Depth (mm)",
  y = "Island",
  color = "Sex")
ggp2_clev_dots2 <- ggplot(data = peng_clev_dots2,
       mapping = aes(x = median_value,
                     y = forcats::fct_rev(island))) +
    geom_line(aes(group = island),
        linewidth = 0.55) +
    geom_point(aes(color = sex),
        size = 2) +
    # both facets share the same x-axis scale
    facet_wrap(. ~ median_measure,
        shrink = TRUE, nrow = 2) +
    theme(legend.position = "top")
ggp2_clev_dots2 +
  labs_clev_dots2
CAUTION when using scales = "free_x": The graph below shows that the median bill length and depth are larger for male penguins on all three islands, but the magnitude of the differences should be interpreted with caution. Each facet has its own x-axis scale, so the lengths of the lines can’t be compared directly across facets!
labs_clev_dots2 <- labs(
  title = "Penguin Measurements Differences",
  subtitle = "Male and female penguins",
  x = "Median Bill Length/Depth (mm)",
  y = "Island",
  color = "Sex")
ggp2_clev_dots2_free_x <- ggplot(data = peng_clev_dots2,
       mapping = aes(x = median_value,
                     y = forcats::fct_rev(island))) +
    geom_line(aes(group = island),
        linewidth = 0.55) +
    geom_point(aes(color = sex),
        size = 2) +
    # each facet gets its own x-axis scale
    facet_wrap(. ~ median_measure,
        shrink = TRUE, nrow = 2,
        scales = "free_x") +
    theme(legend.position = "top")
ggp2_clev_dots2_free_x +
  labs_clev_dots2
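One way to see why line lengths are not comparable under scales = "free_x" is to look at the range of values each facet has to display. The sketch below is illustrative only (peng_bill_medians and facet_spans are hypothetical names): when the spans differ between measures, one millimeter occupies a different amount of horizontal space in each panel.

```r
# Rebuild the long data, then summarise the value range per facet
peng_bill_medians <- palmerpenguins::penguins |>
  dplyr::filter(!is.na(sex) & !is.na(bill_length_mm) &
                  !is.na(bill_depth_mm)) |>
  dplyr::group_by(sex, island) |>
  dplyr::summarise(
    `Median Bill Length` = median(bill_length_mm),
    `Median Bill Depth` = median(bill_depth_mm),
    .groups = "drop") |>
  tidyr::pivot_longer(cols = tidyr::starts_with("Med"),
    names_to = "median_measure",
    values_to = "median_value")

facet_spans <- peng_bill_medians |>
  dplyr::group_by(median_measure) |>
  dplyr::summarise(
    min_value = min(median_value),
    max_value = max(median_value),
    span_mm = max_value - min_value # data range shown in each panel
  )
facet_spans
```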