This vignette covers how to use pickler
functions with
testthat
’s
describe()
and it()
functions.
Make sure you have the latest version of testthat
installed.
if (!require(pak)) {
install.packages("pak",
repos = "http://cran.us.r-project.org"
)
}
pak::pkg_install("r-lib/testthat", upgrade = TRUE, ask = FALSE)
library(testthat)
Motivation
pickler
was motivated by the latest recommendation from
the testthat
authors on moving the test scope to
within the test (i.e., remove or reduce the need for code
outside the call to test_that()
).
To avoid code outside of
test_that()
:
Move file-scope logic to either narrower scope (just this test) or a broader scope (all files)
It’s ok to copy and paste: test code doesn’t have to be super dry. Obvious >> DRY - Package Development Masterclass, posit::conf(2023)
Adopting test fixtures,
helpers, and setup files will address the first item in the list
above, but pickler
is mostly focused on the second item:
making tests obvious and clear.
Behavior-driven development functions
The behavior-driven development (BDD) functions
are excellent additions to the suite of testing tools provided by
testthat
. They work like so:
describe(description = "verify that you implement the right things", code = {
it(description = "ensure you do the things right", code = {
expect_true(TRUE)
})
})
#> Test passed
The great thing about describe()
is that they can be
nested:
describe(description = "decribe the right way to implement thing", code = {
describe(description = "verify that you implemented the right thing", code = {
it(description = "ensure you do the thing right", code = {
expect_true(TRUE)
})
})
})
#> Test passed
Each it()
call is essentially identical to
test_that()
, but allows for a longer character string
description
.
What to test
Strive to test each behaviour in one and only one test. Then if that behaviour later changes you only need to update a single test - What to test, R Packages, 2ed
pickler
attempts to address the advice above by
providing a set of Gherkin-style
syntax functions that can be placed in the description
argument of decribe()
or it()
. The three
primary functions in pickler
are feature()
,
scenario()
, and background()
.
The following sections walk through an example with the
split_cols()
function. You can view the source code for
this function here,
but for now we’ll just look it’s arguments and the problem it
solves:
split_cols(
data,
col,
pattern = "[^[:alnum:]]+",
col_prefix = "col")
Given the input_data
data below:
input_data
#> id name
#> 1 1 First
#> 2 2 First, Last
#> 3 3 First, Middle, Last
split_cols()
will generate this:
#> id name col_1 col_2 col_3
#> 1 1 First First <NA> <NA>
#> 2 2 First, Last First Last <NA>
#> 3 3 First, Middle, Last First Middle Last
split_cols()
was written because tidyr::separate()
didn’t have an easy solution, and I wanted to reduce the number of
keystrokes/neurons/dependencies/etc to solve this problem.
tidyr::separate(data = input_data,
col = "name",
into = c("col_1", "col_2", "col_3"),
remove = FALSE,
extra = "merge",
fill = "right")
#> id name col_1 col_2 col_3
#> 1 1 First First <NA> <NA>
#> 2 2 First, Last First Last <NA> #> 3 3 First, Middle, Last First Middle Last
Features
Features are the units of functionality we’re trying to deliver. We
can adapt the language from testthat
’s BDD functions above
into a Gherkin-style
feature:
Feature: My function's feature
As a user of this function
I want to verify that I implemented the right thing
So that I can ensure I did the thing right
Features have the following keywords:
Feature: <title>
As a <person using the code>
I want <some action>
So that <desired behavior>
A feature should start with a mental model of the problem we’d like
to solve. The Gherkin-style
example for split_cols()
might look something like
this:
Feature: Split a single column into multiple columns using a pattern
As an analyst
I want to specify a column and a pattern
So that I can quickly generate multiple columns.
The goal of a feature is to clearly state the objective or
functionality we’re trying to achieve. The arguments in
feature()
match the Gherkin feature keywords:
feature(
title = ,
as_a = ,
i_want = ,
so_that =
)
This pattern is similar for all pickler
functions, but
I’ve converted the keywords to snake_case
so they align
with the tidyverse
naming conventions).
pickler
’s function arguments also serve as prompts to
fill out each keyword:
feature(
title = "Split a single column into multiple columns using a pattern",
as_a = "user of split_cols()",
i_want = "to specify a column and a pattern",
so_that = "I can quickly generate multiple columns."
)
You might be wondering why you’d bother passing this information into a function and not just include them in a language-agnostic fenced code block:
```
Feature: Split a single column into multiple columns using a pattern
As an analyst
I want to specify a column and a pattern
So that I can quickly generate multiple columns.
```
picker
’s functions are designed to work in R Markdown
and testthat
tests. For example, any code blocks
with a pickler
functions can set to eval=TRUE
,
echo=FALSE
, and comment=""
to render nicely
formatted Gherkin syntax when the file is knitted:
Feature: Split a single column into multiple columns using a pattern
As a user of split_cols()
I want to specify a column and a pattern
So that I can quickly generate multiple columns.
Scenarios
Examples are the cornerstone of behavior-driven development and testing. Captured as scenarios, these should illustrate a concrete example of how our code should behave. Scenarios are also great for communicating and documenting requirements, code, and tests with stakeholders and non-technical audiences:
“Examples play a primary role in BDD, simply because they’re an extremely effective way of communicating clear, precise, and unambiguous requirements. Specifications written in natural language are, as it turns out, a terribly poor way of communicating requirements, because there’s so much space for ambiguity, assumptions, and misunderstandings. Examples are a great way to overcome these limitations and clarify the requirements. Examples are also a great way to explore and expand your knowledge.” - BDD in Action, John Ferguson Smart
Consider the scenario()
below for
split_cols()
:
scenario(
title = "Split column with a default pattern",
given = "a dataframe with a specified column",
when = "I split the column using the default pattern",
then = "the column should be split into multiple columns"
)
We could also add function arguments (in square brackets), but this is optional.
scenario(
title = "Split column with a default pattern",
given = "a dataframe [data] with a specified column [col]",
when = "I split the [col] column using the default [[^[:alnum:]]+]",
then = "the [col] column should be split into multiple columns"
)
When rendered, this scenario would look like so:
Scenario: Split column with a default pattern
Given a dataframe [data] with a specified column [col]
When I split the [col] column using the default [[^[:alnum:]]+]
Then the [col] column should be split into multiple columns
And
Scenarios can also include additional and
statements
after the then
keyword to test multiple behaviors:
scenario(
title = "Split column with a default pattern",
given = "a dataframe [data] with a specified column [col]",
when = "I split the [col] column using the default [[^[:alnum:]]+]",
then = "the [col] column should be split into multiple columns",
and = "the new columns should be named with the default prefix [cols_]"
)
Gherkin scenarios occasionally include And
keywords
directly following a Given
statement, but in
pickler
this information is placed in the
background()
:
Background
background()
can be used to provide context for any
preexisting conditions (or the state of the world) we’re contending
with. background()
can also be used to reduce any
repetitive information in scenarios. For example, consider the two
scenarios below:
Scenario: Split column with 'default' pattern
Given a dataframe with text data
And a specified column "<column_name>"
When I split the column with the default pattern
And a column prefix "<col_prefix>"
Then the column should be split into multiple columns
And the resulting dataframe should have the original data with added split columns
And the new columns should have names with the provided prefix and an index
Scenario: Split column with 'custom' pattern
Given a dataframe with text data
And a specified column "<column_name>"
When I split the column with a custom pattern "<pattern>"
And a column prefix "<col_prefix>"
Then the column should be split according to the custom pattern
And the resulting dataframe should have the original data with added split columns
And the new columns should have names with the provided prefix and an index
These scenarios have a lot of duplicated lines–specifically, these three:
Given a dataframe with text data
And a specified column "<column_name>"
And a column prefix "<col_prefix>"
We can place these in a background()
to remove
repetition and ‘set the stage’ for any future scenarios:
background(
title = "Input dataframe with text data",
given = "a dataframe [data] with text columns",
and = list(
"a column prefix [cols_]",
"a specified column [col]"))
The nice thing about storing additional context in
background()
is that you’ll only have to update this
information in a single location.
When rendered, the background()
looks like this:
Background: Input dataframe with text data
Given a dataframe [data] with text columns
And a column prefix [cols_]
And a specified column [col]
Tests
BDD tests are written to communicate the context, intended behavior,
and a series of use cases for how our code should be used.
pickler
functions help facilitate this because they can be
placed in nested calls to describe()
and
it()
.
Below is an example feature()
,
background()
, and scenario()
for
split_cols()
:
describe(
feature(
title = "Split a single column into multiple columns using a pattern",
as_a = "user of split_cols()",
i_want = "to specify a column and a pattern",
so_that = "I can quickly generate multiple columns."
), code = {
describe(
background(
title = "Input dataframe with text data",
given = "a dataframe [data] with text columns",
and = list(
"a column prefix [cols_]",
"a specified column [col]")), code = {
it(
scenario(
title = "Split column with a default pattern",
given = "a dataframe [data] with a specified column [col]",
when = "I split the [col] column using the default [[^[:alnum:]]+]",
then = "the [col] column should be split into multiple columns"
), code = {
expect_true(TRUE)
})
})
})
#> Test passed
BDD tests take longer to write, but the tradeoff is obvious: any losses in brevity are gained in clarity. The emphasis on using plain language makes it easier for stakeholders and collaborators to understand how our code works (and how it’s being tested).
pickler
functions can also be placed in
desc
argument of a the test_that()
test, but
sometimes a multiline descriptions can
cause errors.:
test_that(
feature(
title = "My function's feature",
as_a = "user of this function",
i_want = "to verify that I implemented the right thing",
so_that = "to ensure I did the thing right"
),
code = {
expect_true(TRUE)
}
)
#> Test passed