25 Ensure

Published

2025-02-26

25.1 Help writing tests

I recommend using the ensure package to help write tests in your app-package. If you’re already using LLMs to help write code, ensure is a nice addition to your toolbox because of its focus on testing.

Launch app with the shinypak package:

launch('16.1_test-help')

“The ensurer is familiar with testthat 3e as well as tidy style, and incorporates context from the rest of your R package to write concise and relevant tests.” - ensure documentation

Follow the setup instructions to activate the addin (and I highly recommend creating a keyboard shortcut).

25.1.1 Help, not automation

The ensure package does a great job when writing basic unit tests, but as you’ll see, it doesn’t have a solid understanding of the specification or functional requirements of your application (only you know these). The test code created by ensure often needs some editing.

For example, the following test code was created for the scatter_plot() function in R/scatter_plot.R:

test_that("scatter_plot creates a ggplot object", {
  
  p <- scatter_plot(mtcars, "mpg", "hp", "cyl", 0.7, 3)
  
  expect_s3_class(p, "gg")
  expect_equal(ggplot2::ggplot_build(p)$layout$panel_ranges[[1]]$x.range,
    range(mtcars$mpg))
  expect_equal(ggplot2::ggplot_build(p)$layout$panel_ranges[[1]]$y.range,
    range(mtcars$hp))
})

test_that("scatter_plot handles different alpha and size values", {
  
  p1 <- scatter_plot(mtcars, "mpg", "hp", "cyl", 0.5, 2)
  p2 <- scatter_plot(mtcars, "mpg", "hp", "cyl", 1, 4)
  
  expect_true(p1$layers[[1]]$aes_params$alpha < p2$layers[[1]]$aes_params$alpha)
  expect_true(p1$layers[[1]]$aes_params$size < p2$layers[[1]]$aes_params$size)
})

Based on the descriptions, these would test two important functional requirements of our application. However, two of the expectations in the first test fail:

[ FAIL 2 | WARN 0 | SKIP 0 | PASS 3 ]

── Failure (test-scatter_plot.R:6:3): scatter_plot creates a ggplot object ─────
ggplot2::ggplot_build(p)$layout$panel_ranges[[1]]$x.range (`actual`) not equal to range(mtcars$mpg) (`expected`).

`actual` is NULL
`expected` is a double vector (10.4, 33.9)

── Failure (test-scatter_plot.R:8:3): scatter_plot creates a ggplot object ─────
ggplot2::ggplot_build(p)$layout$panel_ranges[[1]]$y.range (`actual`) not equal to range(mtcars$hp) (`expected`).

`actual` is NULL
`expected` is a double vector (52, 335)
[ FAIL 2 | WARN 0 | SKIP 0 | PASS 3 ]

This might lead us to believe our utility function is not behaving as expected. But when we dig into the method and values these tests are using, we discover it’s an issue with the test:

# build plot
p <- scatter_plot(mtcars, "mpg", "hp", "cyl", 0.7, 3)
# check x 
ggplot2::ggplot_build(p)$layout$panel_ranges[[1]]$x.range

NULL

# check y 
ggplot2::ggplot_build(p)$layout$panel_ranges[[1]]$y.range

NULL

The correct element is panel_params (not panel_ranges), but these expectations still won’t pass, because ggplot2 pads continuous position scales by default, so the panel range is slightly wider than the range of the data (which we can confirm with waldo::compare()):¹

waldo::compare(
  x = ggplot2::ggplot_build(p)$layout$panel_params[[1]]$x.range, 
  y = range(mtcars$mpg))

`old`:  9.2 35.1
`new`: 10.4 33.9

waldo::compare(
  x = ggplot2::ggplot_build(p)$layout$panel_params[[1]]$y.range, 
  y = range(mtcars$hp))

`old`: 37.9 349.1
`new`: 52.0 335.0
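
The gap between the two ranges is consistent with ggplot2’s default expansion for continuous position scales, which pads each end of the axis by 5% of the data range. Below is a minimal sketch of that arithmetic, assuming the default expansion is in effect. The plot is built directly with ggplot2::ggplot() here (not with scatter_plot()) so the example runs on its own:

```r
# range of the x variable in the data
x_rng <- range(mtcars$mpg)

# assumption: default continuous scales pad 5% of the range on each side
pad <- 0.05 * diff(x_rng)
expanded <- x_rng + c(-pad, pad)
expanded
# 9.225 35.075

# compare with the panel range reported by ggplot2
p <- ggplot2::ggplot(mtcars, ggplot2::aes(x = mpg, y = hp)) +
  ggplot2::geom_point()
built_rng <- ggplot2::ggplot_build(p)$layout$panel_params[[1]]$x.range
all.equal(built_rng, expanded)
```

If the padded range matches the panel range, comparing x.range to range(mtcars$mpg) can only pass when the expansion is turned off, which is a property of the scale and not of the data.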

We could keep searching the output of ggplot2::ggplot_build() for a value that matches the data range, but the description of this test is “scatter_plot creates a ggplot object”, so why not just use ggplot2::is.ggplot()?

test_that("scatter_plot creates a ggplot object", {
  p <- scatter_plot(mtcars, "mpg", "hp", "cyl", 0.7, 3)
  expect_true(ggplot2::is.ggplot(p))
})
test_that("scatter_plot handles different alpha and size values", {
  p1 <- scatter_plot(mtcars, "mpg", "hp", "cyl", 0.5, 2)
  p2 <- scatter_plot(mtcars, "mpg", "hp", "cyl", 1, 4)
  expect_true(p1$layers[[1]]$aes_params$alpha < p2$layers[[1]]$aes_params$alpha)
  expect_true(p1$layers[[1]]$aes_params$size < p2$layers[[1]]$aes_params$size)
})

[ FAIL 0 | WARN 0 | SKIP 0 | PASS 3 ]

I’ve provided this example because it illustrates an important limitation of using ensure (or any LLM) to help write code: don’t confuse volume with precision. LLMs are great at generating copious amounts of code, but have no way of checking if the code is functional or accurate. In this case, it arguably would’ve taken less time to do the research and write the correct test (instead of debugging the one created by ensure).

It’s also worth pointing out that even if the original test code had passed, the test_that() description wouldn’t match the expectations (these belong in a “scatter_plot x and y limits match true range of values” test).
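
One way such a test might look is sketched below. It checks the values ggplot2 actually plots (via ggplot2::layer_data()) and not the padded panel range, so strictly speaking it verifies the plotted data, not the axis limits. The scatter_plot() definition is a hypothetical stand-in included only so the example is self-contained; it is not the function from R/scatter_plot.R:

```r
library(testthat)

# hypothetical stand-in for the app-package's scatter_plot()
scatter_plot <- function(df, x_var, y_var, col_var, alpha_var, size_var) {
  ggplot2::ggplot(
    data = df,
    ggplot2::aes(
      x = .data[[x_var]],
      y = .data[[y_var]],
      color = .data[[col_var]]
    )
  ) +
    ggplot2::geom_point(alpha = alpha_var, size = size_var)
}

test_that("scatter_plot x and y limits match true range of values", {
  p <- scatter_plot(mtcars, "mpg", "hp", "cyl", 0.7, 3)
  # data computed for the first (point) layer
  plotted <- ggplot2::layer_data(p, 1)
  expect_equal(range(plotted$x), range(mtcars$mpg))
  expect_equal(range(plotted$y), range(mtcars$hp))
})
```

Keeping this in its own test_that() block means a failure here points at the data mapping, while the “creates a ggplot object” test stays a simple class check.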

¹ Read more about numeric position scales in the Position scales and axes chapter of ggplot2, 3e.