What Do You Mean Test Coverage?!

Efficient Testing for Shiny Apps

Martin Frigaard (Atorus)

Introduction

A bit about me…

As a shiny developer
I want to focus on writing tests that matter 
So that I can spend less time testing code...

Agenda

Shiny testing

Unit tests
Integration tests
System tests
Test tools
- Fixtures
- Helpers

Development

Standard app development
Behavior-driven development
- Features
- Scenarios

Efficient Tests

What should I test?
How should I test it?
Code coverage

Shiny Testing

Testing your shiny app

Is much easier if your app is in a package

…but it’s not impossible if it’s not

Requires additional packages and/or functions beyond testthat

Benefits from having a well-designed test suite

Relies on understanding the relationship between R/ files and test- files

Unit Tests

Package(s): testthat

Focuses on specific units of code, ensuring each function or component behaves as intended.

Example: Verifying that a function correctly calculates a specific value based on the input.
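A minimal sketch of such a unit test. The function calc_bmi() is hypothetical (it is not part of the app in this deck) and simply stands in for any piece of business logic:

```r
library(testthat)

# Hypothetical business-logic function, used only for illustration:
# body mass index from weight (kg) and height (m).
calc_bmi <- function(weight_kg, height_m) {
  stopifnot(is.numeric(weight_kg), is.numeric(height_m), all(height_m > 0))
  weight_kg / height_m^2
}

test_that("calc_bmi() divides weight by height squared", {
  # Known input/output pair: 81 / 1.8^2 = 25
  expect_equal(calc_bmi(81, 1.8), 25)
  # Invalid input is rejected instead of silently returning Inf
  expect_error(calc_bmi(81, 0))
})
```

The test needs no Shiny session at all, which is what keeps unit tests fast and easy to debug.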

Integration Tests

Package(s): shiny (testServer()) & testthat

Confirms app/module server functions operate as expected

Example: Making sure that modules communicate and display the results from a specific function or calculation
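A rough sketch of an integration test with shiny::testServer(). The module mod_summary_server() and its values argument are assumptions made for illustration; values plays the role of a reactive handed over by another module:

```r
library(shiny)
library(testthat)

# Hypothetical module server: receives a reactive from "another module"
# (values), scales it by an input, and renders a text summary.
mod_summary_server <- function(id, values) {
  moduleServer(id, function(input, output, session) {
    scaled <- reactive(values() * input$multiplier)
    output$total <- renderText(paste("Total:", sum(scaled())))
  })
}

test_that("module combines the incoming reactive with its own input", {
  testServer(
    mod_summary_server,
    args = list(values = reactive(c(1, 2, 3))),
    {
      # Simulate the user setting the module's input
      session$setInputs(multiplier = 2)
      # Internal reactives and rendered outputs are both visible here
      expect_equal(scaled(), c(2, 4, 6))
      expect_equal(output$total, "Total: 12")
    }
  )
})
```

No browser is started: testServer() runs the server logic against a mock session, so the test checks the hand-off of values rather than the rendered page.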

System Tests

Package(s): shiny & shinytest2

Confirms all parts of the app behave correctly and provide a good user experience.

Example: Simulating a user’s experience with the application, selecting inputs, entering data, and ensuring the application responds correctly and displays the expected outputs.
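A sketch of what a system test might look like with shinytest2. The tiny app (demo_app) is hypothetical, and the test assumes shinytest2 and a local Chrome/Chromium are available; it is skipped otherwise:

```r
library(shiny)
library(testthat)

# Hypothetical one-input app, used only for illustration.
demo_app <- shinyApp(
  ui = fluidPage(
    numericInput("n", "Number", value = 1),
    textOutput("doubled")
  ),
  server = function(input, output, session) {
    output$doubled <- renderText(input$n * 2)
  }
)

test_that("the doubled value is shown after the user changes the input", {
  skip_if_not_installed("shinytest2")
  skip_if(is.null(chromote::find_chrome()), "Chrome is not available")

  # Launch the app in a headless browser and stop it when the test ends
  app <- shinytest2::AppDriver$new(demo_app, name = "demo-app")
  withr::defer(app$stop())

  # Act like a user, then check what the page displays
  app$set_inputs(n = 5)
  expect_equal(app$get_value(output = "doubled"), "10")
})
```

Because a real browser session is involved, tests like this are slower than unit or testServer() tests, which is one reason to reserve them for the most important user journeys.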

Test Fixtures

Test fixtures are used to create repeatable test conditions

Good fixtures provide a consistent, well-defined test environment.

Fixtures are removed/destroyed after the test is executed.

This ensures any changes made during the test don’t persist or interfere with future tests.
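One way to get this set-up/tear-down behavior is a local_ style helper built on withr, sketched below. The helper name local_ratings_csv() and its data are made up for illustration:

```r
library(testthat)

# Hypothetical fixture helper: writes a tiny CSV to a temporary file that
# withr deletes when the calling test finishes.
local_ratings_csv <- function(env = parent.frame()) {
  path <- withr::local_tempfile(fileext = ".csv", .local_envir = env)
  write.csv(data.frame(rating = c(6.1, 7.4)), path, row.names = FALSE)
  path
}

fixture_path <- NULL  # records the path so it can be inspected afterwards

test_that("the CSV fixture is available inside the test", {
  path <- local_ratings_csv()
  fixture_path <<- path
  expect_true(file.exists(path))
  expect_equal(nrow(read.csv(path)), 2)
})

# The test has finished, so the temporary file has been cleaned up
file.exists(fixture_path)
```

Passing env = parent.frame() ties the clean-up to the test that called the helper, not to the helper itself.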

Example: custom plot function

My app has the following utility function for creating a ggplot2 scatter plot:

scatter_plot <- function(df, x_var, y_var, col_var, alpha_var, size_var) {
  ggplot2::ggplot(data = df,
    ggplot2::aes(x = .data[[x_var]],
                 y = .data[[y_var]],
                 color = .data[[col_var]])) +
    ggplot2::geom_point(alpha = alpha_var, size = size_var)
}

The .data pronoun from rlang (.data[[ ]]) lets the function accept column names as strings (i.e. input$x and input$y)
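A short sketch of the string-based interface in action. The selected list is a stand-in for Shiny's input object, and mtcars is used only because it ships with R (it is not the app's data):

```r
# scatter_plot() repeated from above so the sketch runs on its own
scatter_plot <- function(df, x_var, y_var, col_var, alpha_var, size_var) {
  ggplot2::ggplot(data = df,
    ggplot2::aes(x = .data[[x_var]],
                 y = .data[[y_var]],
                 color = .data[[col_var]])) +
    ggplot2::geom_point(alpha = alpha_var, size = size_var)
}

# Stand-in for Shiny inputs: in an app these strings would come from
# input$x, input$y, and so on.
selected <- list(x = "wt", y = "mpg", color = "cyl")

p <- scatter_plot(mtcars,
                  x_var = selected$x,
                  y_var = selected$y,
                  col_var = selected$color,
                  alpha_var = 0.75,
                  size_var = 3)

# layer_data() builds the plot and returns the data drawn by the first layer,
# so the x values can be compared with the column named by the string
plotted <- ggplot2::layer_data(p)
all.equal(as.numeric(plotted$x), mtcars$wt)
```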

Example: test fixture

Test fixtures can be stored in tests/testthat/fixtures/

tests/
  ├── testthat/
  │   ├── fixtures/                                         
  │   │   ├── make-tidy_ggp2_movies.R 
  │   │   └── tidy_ggp2_movies.rds 
  │   ├── helper.R                                          
  │   └── test-scatter_plot.R                                     
  └── testthat.R

The make-tidy_ggp2_movies.R creates a ‘tidy’ version of ggplot2movies::movies.

Using test fixtures

Static data fixtures can be accessed with testthat::test_path():

test_that("tidy_ggp2_movies.rds works", code = {
  tidy_ggp2_movies <- readRDS(test_path("fixtures", "tidy_ggp2_movies.rds"))
  app_graph <- scatter_plot(tidy_ggp2_movies,
                            x_var = 'rating',
                            y_var = 'budget',
                            col_var = 'mpaa',
                            alpha_var = 3/4,
                            size_var = 2.5)
  expect_true(ggplot2::is.ggplot(app_graph))
})

If tidy_ggp2_movies.rds is used in a few tests, move make-tidy_ggp2_movies.R into data-raw/ and make tidy_ggp2_movies part of the package

ggplot2::is.ggplot() confirms a plot object has been built (doesn’t require a snapshot test)

Test helpers

Test helpers reduce repeated/duplicated test code

Objects that aren’t large enough to justify storing as static test fixtures can be created with helper functions

Helpers can be stored in tests/testthat/helper.R

tests/
  ├── testthat/
  │   ├── fixtures/
  │   │   ├── make-tidy_ggp2_movies.R
  │   │   └── tidy_ggp2_movies.rds
  │   ├── helper.R
  │   └── test-scatter_plot.R    
  └── testthat.R

Example: test helper

Assume I want a list of inputs to pass to the scatter_plot() in my test:

ggp2_scatter_inputs <- list(
  x = "rating",
  y = "length",
  z = "mpaa",
  alpha = 0.75,
  size = 3,
  plot_title = "Enter plot title"
)

I could store these values in a function in tests/testthat/helper.R

var_inputs <- function() {
  list(
    x = "rating",
    y = "length",
    z = "mpaa",
    alpha = 0.75,
    size = 3,
    plot_title = "Enter plot title"
  )
}

Using test helpers

This removes duplicated code…

test_that("scatter_plot() works", code = {

  tidy_ggp2_movies <- readRDS(test_path("fixtures", "tidy_ggp2_movies.rds"))

  app_graph <- scatter_plot(tidy_ggp2_movies,
                            x_var = var_inputs()$x,
                            y_var = var_inputs()$y,
                            col_var = var_inputs()$z,
                            alpha_var = var_inputs()$alpha,
                            size_var = var_inputs()$size)

  testthat::expect_true(ggplot2::is.ggplot(app_graph))
})

…but it’s unclear where var_inputs() comes from (or what it contains)

Tips on test helpers

If you have repeated code in your tests, consider the following questions before creating a helper function:

Does the code help explain what behavior is being tested?
Would a helper make it harder to debug the test when it fails?

Consider a function like make_ggp2_inputs():

list(x = 'rating',
     y = 'length',
     z = 'mpaa',
     alpha = 0.75,
     size = 3,
     plot_title = 'Enter plot title'
     )

Development

Traditional development

Focuses on coding an application’s functionality

User specifications often go beyond what users ‘need’ and prescribe solutions

Specifications with solutions bind developers to a particular implementation

Developers will then focus on the technical implementation and not finding the optimal solution

Can lead to delays evaluating the testability of features until late in the project

Behavior-driven development

Create applications that meet desired behaviors

Starts with understanding the user needs

Acknowledges specifications and requirements will change and evolve

Identifies and prioritizes features that deliver value

Uses scenarios for guiding how to test and build features

How does BDD work?

Users and developers work together to develop a clear vision of the app’s value

Ongoing discussions between users and developers improve understanding of the problem by:

Uncovering hidden assumptions

Considering any potential risks

Building a shared appreciation for meeting user needs and achieving business goals

Features

Features are tangible functionalities that facilitate achieving a business goal

Who wants the feature?

What action does the feature perform?

What is the intended business value?

Gherkin:

Feature: < what is being built to deliver the proposed value >
  As a < user/stakeholder >
  I want to < perform some action >
  So that I can < achieve a business goal >

Describing features

Users and developers write stories to describe a feature’s expected outcome

Uses a first-person voice

States what users need

States why users need it

Feature: CDISC Variable Exploration Dashboard
  As a researcher or analyst 
  I want to explore variables in the Vital Signs Analysis Dataset (ADVS)
  So that I can analyze and derive insights from the vital signs data

Scenarios

Describe application behaviors in a plain, human-readable format

Given: establishes preconditions for the scenario.

When: specifies an action being tested.

Then: defines what outcomes to expect.

Gherkin:

  Scenario: < concrete example >
    Given < initial conditions >
    When  < action to test >
    Then  < expected outcome >

Writing scenarios

Set the stage

Feature: User login

  Scenario: Successful login with correct credentials
    Given the login page is loaded

Action and outcome

    When the user enters a valid username and password
    Then the user should be redirected to the dashboard

More outcomes

    And the user should see the landing page

Alternate story

  Scenario: Unsuccessful login with incorrect password
    Given the login page is loaded
    When the user enters a valid username but an incorrect password
    Then the user should see an error message stating 'Invalid password.'

BDD with Shiny

Efficient testing means writing scenarios that cover critical paths.

Ideally everything in your app is tested

In reality, decisions have to be made about what to test

If developers and users have collaboratively defined key features and expected behaviors, prioritize those scenarios

Adopt an ‘inspect and adapt’ posture

testthat BDD support

testthat has describe() and it() functions for features and scenarios:

testthat::describe(
  "Feature: Scatter plot data visualization
     As a film data analyst
     I want to explore movie review data from IMDB.com
     So that I can analyze relationships between movie review metrics",
  code = {
    testthat::it(
      "Scenario: Create scatter plot
         Given I have launched the movie review exploration app,
         When I view the scatter plot,
         Then I should see points representing values for a default
              set of continuous and categorical columns.",
      code = {
        # test code
      })
  })
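As a sketch, the "# test code" placeholder could be filled in along these lines. The movies data frame holds made-up values standing in for the real movie data, and inherits() is used here as a version-neutral stand-in for ggplot2::is.ggplot():

```r
library(testthat)

# scatter_plot() repeated from earlier so the sketch runs on its own
scatter_plot <- function(df, x_var, y_var, col_var, alpha_var, size_var) {
  ggplot2::ggplot(data = df,
    ggplot2::aes(x = .data[[x_var]],
                 y = .data[[y_var]],
                 color = .data[[col_var]])) +
    ggplot2::geom_point(alpha = alpha_var, size = size_var)
}

# Hypothetical stand-in for the movie data (made-up values)
movies <- data.frame(
  rating = c(6.1, 7.4, 8.2),
  length = c(90, 121, 104),
  mpaa   = c("PG", "R", "PG-13")
)

describe("Feature: Scatter plot data visualization", {
  it("Scenario: Create scatter plot
        Given movie data with continuous and categorical columns,
        When I build the scatter plot,
        Then I should see one point per row of data.", {
    app_graph <- scatter_plot(movies, "rating", "length", "mpaa",
                              alpha_var = 0.75, size_var = 3)
    expect_true(inherits(app_graph, "ggplot"))
    # layer_data() returns the data actually drawn by the point layer
    expect_equal(nrow(ggplot2::layer_data(app_graph)), nrow(movies))
  })
})
```

Keeping the Given/When/Then wording in the description means a failing test reports the behavior that broke, not just a function name.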

What to test

Efficient testing

Unit tests
- All business logic (models, calculations, etc.) should have unit tests (no getting around this)

Integration tests
- shiny::testServer() tests should focus on ‘handshakes’ between modules and exporting/saving data

System/end-to-end tests
- Prioritize the feature/scenario that most directly observes the intended business goal with shinytest2

Test coverage

Test coverage focuses on the execution paths (i.e., it assumes all paths are equally important)

Test coverage can’t check if our app meets user expectations

Test coverage is a valuable metric, but shouldn’t be the sole criterion for assessing your app
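A small sketch of why coverage alone can mislead. The function apply_discount() is hypothetical and contains a deliberate bug; the first test executes every line of it (so a line-coverage tool would count the function as fully covered) and still passes:

```r
library(testthat)

# Hypothetical business rule: apply a fractional discount to a price.
# Deliberate bug: the body should be price * (1 - rate).
apply_discount <- function(price, rate) {
  price * rate
}

# Runs every line of apply_discount(), but only checks the return type,
# so it passes despite the bug.
test_that("apply_discount() returns a number", {
  expect_type(apply_discount(100, 0.2), "double")
})

# A behavior-focused check exposes the problem: a 20% discount on 100
# should give 80, and the buggy function does not return that.
meets_expectation <- isTRUE(all.equal(apply_discount(100, 0.2), 80))
meets_expectation
```

The execution path is covered either way; only the second check says anything about whether the user gets the right price.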

Remember

Today’s functionality is tomorrow’s regression

Apps need thorough and continuous testing in development so new features don’t break existing functionality

Thanks!
