1 Bar graphs
1.1 Description
A bar graph compares values across categories using rectangular bars whose length or height encodes each value. Bars can be vertical or horizontal: in a vertical bar graph, the vertical axis shows the quantity being measured and the horizontal axis lists the categories (a horizontal bar graph swaps the two). When multiple variables are compared, bar graphs often include a legend explaining the colors or patterns used.
In ggplot2, bar graphs can be built using geom_bar() (see also: geom_col()).
1.2 Set up
PACKAGES:
Install and load the packages.
install.packages("palmerpenguins")
library(palmerpenguins)
library(ggplot2)
DATA:
Remove the rows with missing species values from the palmerpenguins::penguins data and store the result in penguins_bar.
penguins_bar <- palmerpenguins::penguins |>
  dplyr::filter(!is.na(species))
dplyr::glimpse(penguins_bar)
#> Rows: 344
#> Columns: 8
#> $ species <fct> Adelie, Adelie, Adelie…
#> $ island <fct> Torgersen, Torgersen, …
#> $ bill_length_mm <dbl> 39.1, 39.5, 40.3, NA, …
#> $ bill_depth_mm <dbl> 18.7, 17.4, 18.0, NA, …
#> $ flipper_length_mm <int> 181, 186, 195, NA, 193…
#> $ body_mass_g <int> 3750, 3800, 3250, NA, …
#> $ sex <fct> male, female, female, …
#> $ year <int> 2007, 2007, 2007, 2007…
1.3 Grammar
CODE:
Create labels with labs()
Initialize the graph with ggplot() and provide data
Map species to the x axis
Map species to the fill aesthetic inside the aes() of geom_bar()
Remove the legend with show.legend = FALSE
labs_bar <- labs(
  title = "Adult foraging penguins",
  subtitle = "Number of penguins per species",
  x = "Species", y = "Count",
  fill = "Species")
ggp2_bar <- ggplot(data = penguins_bar,
       aes(x = species)) +
  geom_bar(aes(fill = species),
    show.legend = FALSE)
ggp2_bar +
  labs_bar
GRAPH:
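The description notes that bars can also run horizontally. A minimal sketch, assuming ggplot2 3.3.0 or later (where geom_bar() infers the orientation from the mapped aesthetic): mapping species to y instead of x should flip the bars. The name ggp2_bar_horizontal is introduced here for illustration only.

```r
library(ggplot2)

# same data preparation as in the Set up section
penguins_bar <- palmerpenguins::penguins |>
  dplyr::filter(!is.na(species))

# mapping species to y (instead of x) draws horizontal bars
ggp2_bar_horizontal <- ggplot(data = penguins_bar,
       aes(y = species)) +
  geom_bar(aes(fill = species),
    show.legend = FALSE) +
  labs(x = "Count", y = "Species")
ggp2_bar_horizontal
```

The counts are unchanged; only the axis that carries the categories differs.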
1.4 More info
- The connection between statistical transformations and geoms is an important principle for building graphs (and mastering the grammar) with ggplot2
- Below we cover why geom_bar(stat = "count") produces the same result as stat_count(geom = "bar")
- In the words of the ggplot2 book, “every geom has a default stat, and every stat a default geom.”
- Bar graphs can also be created with geom_col()
1.4.1 stats and geoms
stat_count():
The default stat argument in geom_bar() is set to "count", which ‘counts the number of cases at each x position’, so it’s ideal for categorical variables (or factors).
The stat_count() function can also be used to create bar graphs using the geom argument.
The link between geom_<geom_name>(stat = "<stat_name>") and stat_<stat_name>(geom = "<geom_name>") is shown below:
ggp2_geom_bar <- ggplot(data = penguins_bar,
       aes(x = species)) +
  geom_bar(aes(fill = species),
    stat = "count") +
  labs(title = "geom_bar(stat = 'count')")
ggp2_geom_bar
ggp2_stat_count <- ggplot(data = penguins_bar,
       aes(x = species)) +
  stat_count(aes(fill = species),
    geom = "bar") +
  labs(title = "stat_count(geom = 'bar')")
ggp2_stat_count
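One way to check the equivalence without comparing the graphs by eye is to inspect the data each layer computes. The sketch below assumes layer_data() from ggplot2 is available and that the count column is what stat_count() returns; the object names (p_geom, p_stat, counts_geom, counts_stat) are illustrative and not part of the original example.

```r
library(ggplot2)

# same data preparation as in the Set up section
penguins_bar <- palmerpenguins::penguins |>
  dplyr::filter(!is.na(species))

# the same graph, specified from the geom side and from the stat side
p_geom <- ggplot(data = penguins_bar, aes(x = species)) +
  geom_bar(stat = "count")
p_stat <- ggplot(data = penguins_bar, aes(x = species)) +
  stat_count(geom = "bar")

# layer_data() returns the data frame a layer computes before drawing
counts_geom <- layer_data(p_geom, i = 1)
counts_stat <- layer_data(p_stat, i = 1)

# TRUE when both layers carry the same per-species counts
identical(counts_geom$count, counts_stat$count)
```

If the two specifications really are interchangeable, the per-species counts in both layers match.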
1.4.2 geom_col()
geom_col():
To create a bar graph with geom_col(), the count variable needs to be computed before being mapped into the graph y aesthetic.
penguins_bar |>
  # create column of counts
  dplyr::count(species, name = "count") |>
  # map into x and y
  ggplot(mapping = aes(x = species, y = count)) +
  geom_col(aes(fill = species),
    show.legend = FALSE) +
  labs_bar +
  labs(caption = "*geom_col()")
# compare to geom_bar()
ggp2_bar +
  labs_bar
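The comparison can also be made numerically. The following is a rough sketch, assuming layer_data() exposes each layer's bar heights in a y column; species_counts, p_col, p_bar, heights_col and heights_bar are hypothetical names used only for this check.

```r
library(ggplot2)

# same data preparation as in the Set up section
penguins_bar <- palmerpenguins::penguins |>
  dplyr::filter(!is.na(species))

# counts computed by hand, as geom_col() requires
species_counts <- penguins_bar |>
  dplyr::count(species, name = "count")

p_col <- ggplot(data = species_counts,
       aes(x = species, y = count)) +
  geom_col()
p_bar <- ggplot(data = penguins_bar,
       aes(x = species)) +
  geom_bar()

# bar heights drawn by each layer, ordered by x position
col_data <- layer_data(p_col)
bar_data <- layer_data(p_bar)
heights_col <- col_data$y[order(col_data$x)]
heights_bar <- bar_data$y[order(bar_data$x)]

# TRUE when the precomputed counts match the counts geom_bar() computes
all.equal(as.numeric(heights_col), as.numeric(heights_bar))
```

The as.numeric() calls are there because the precomputed counts are integers while the computed counts may be stored as doubles.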