11  Specifications

Specifications, Requirements and Features
  • User Specifications: Describe what the end-users expect the application to accomplish

  • Features: User-focused descriptions of the high-level capabilities or characteristics of an application, and often represent a bundle of smaller functionalities

  • Functional Requirements: The precise, measurable, and testable ‘functions’ an application should perform

  • BDD Features and Scenarios: Descriptions of the expected behavior of the application features using a set of special keywords (Feature, Scenario, Given, When, Then, And, But).

    • describe(): ‘specifies a larger component or function and contains a set of specifications’. Include feature descriptions and any relevant background information in the describe() blocks

    • it(): contains the test code or expectations. These are used to test each functional requirement (or Then statements from scenarios).

  • Traceability matrix: A table that ‘traces’ the user specifications to features and functional requirements (and the tests they give rise to) to verify that the application has been developed correctly.

This chapter focuses on what to test–or specifically, how to figure out what to test. This process usually involves converting a list of user needs into testable requirements. The following chapters in this section have examples of tests, but don’t go into depth on how to write tests. Excellent resources exist for writing unit tests in R packages 1 and testing shiny apps. 2 3 The goal of this chapter is to illustrate the connections between the user’s needs, the code below R/, and developing tests.

I’ve created the shinypak R package in an effort to make each section accessible and easy to follow:

Install shinypak using pak (or remotes):

# install.packages('pak')

Review the chapters in each section:

list_apps(regex = '^11')
## # A tibble: 1 × 2
##   branch         last_updated       
##   <chr>          <dttm>             
## 1 11_tests-specs 2024-07-01 22:42:51

Launch an app:

launch(app = "11_tests-specs")

(If you’re using devtools, you won’t have to worry about installing testthat and covr)

11.1 Application requirements

Information about the various tasks and activities an application is expected to perform is typically stored in some kind of software requirements specification (SRS) document.4 The SRS can include things like distinct design characteristics and budgetary, technological, or timeline restrictions. This document breaks down an application’s intended purpose (i.e., the problem it’s designed to solve) into three general areas: user specifications, features, and functional requirements.

11.1.1 User Specifications



The user specifications are the goals and objectives stakeholders want to achieve with the application. They use terms like ‘deliver value’ and ‘provide insight’ and provide the basis for deriving the application’s features. 5

11.1.2 Features

Features translate the user expectations into application capabilities. Generally speaking, features capture the tasks and activities users should “be able to” accomplish with the application (e.g., explore data with a graph).

11.1.3 Functional Requirements

Functional requirements are written for developers and provide the technical details on each feature. A single feature often gives rise to multiple functional requirements (these are where user needs come into direct contact with code).6

In summary

The areas above help direct the development process, albeit from slightly different perspectives.

  1. The user specifications capture the needs and expectations of the end-user.

  2. The features describe the high-level capabilities of the application.

  3. Functional requirements are the testable, specific actions, inputs, and outputs.

11.2 Application development

The Shiny application development process follows something like the figure below:

General application development process

The figure above is an oversimplification, but it highlights a common separation (or ‘hand-off’) between users/stakeholders and developers. In the sections below, we’ll look at two common development processes: test-driven and behavior-driven development.

Test-driven development

If moviesApp was built using test-driven development (TDD), the process might look something like this:

  1. Gather user needs and translate into application features:
    1. Document the application’s capabilities for exploring movie review variables from IMDB and Rotten Tomatoes.
    2. Include feature descriptions for displaying continuous variables (e.g., ‘critics score’ and ‘audience score’), categorical variables (e.g., ‘MPAA’), graph visual attributes (size, color, opacity), and an optional plot title.
  2. Write Tests:
    1. Write tests to ensure the graph displays relationships between a set of continuous and categorical variables when the app launches.
  3. Run Tests:
    1. Before writing any code, these tests will fail.
  4. Develop Features:
    1. Write UI, server, module, and utility functions for user inputs and graph outputs.
  5. Rerun Tests:
    1. If the graph has been correctly implemented in the application, the tests should pass.
  6. Write more Tests:
    1. Add more tests for additional functionalities (e.g., an option to remove missing values from graph).

Starting with tests and writing just enough code to get them to pass often results in developing less (but better) code. The drawback to this approach is a strict focus on the function being tested and not the overall objective of the application.
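
The cycle above can be sketched in a few lines of R. This is only an illustration of the write-test, fail, implement, pass loop: drop_missing() is a hypothetical helper (a stand-in for the ‘remove missing values’ option mentioned in step 6), not a function from the application, and the small data frame is made up for the example.

```r
library(testthat)

# Step 2: write the test first. drop_missing() is a hypothetical helper
# that does not exist yet.
test_drop_missing <- function() {
  movies <- data.frame(
    critics_score = c(90, NA, 75),
    mpaa_rating = c("PG", "R", NA)
  )
  out <- drop_missing(movies)
  expect_equal(nrow(out), 1L)  # only the first row is complete
  expect_false(anyNA(out))
  invisible(TRUE)
}

# Step 3: run the test before writing any code. It fails because
# drop_missing() is not defined.
first_run <- try(test_drop_missing(), silent = TRUE)

# Step 4: write just enough code to satisfy the test.
drop_missing <- function(data) {
  data[stats::complete.cases(data), , drop = FALSE]
}

# Step 5: rerun the test. It now completes without error.
second_run <- test_drop_missing()
```

Wrapping the first run in try() is only there so the whole sketch can be run top to bottom; in a package, the failing test would simply show up in the devtools::test() output.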

Behavior-driven development

In behavior-driven development (BDD) (or behavior-driven testing), users and developers work together to understand, define, and express application behaviors in non-technical language. 7

“Using conversation and examples to specify how you expect a system to behave is a core part of BDD” - BDD in Action, 2ed

Placing an emphasis on writing human-readable expectations for the application’s behaviors makes it easier to develop tests that can focus on verifying each user need exists (and is functioning properly). In BDD, the application’s expected capabilities are captured in Features and illustrated with concrete examples, or Scenarios.


In BDD, a Feature describes an implemented behavior or capability in the application, from a user’s perspective. Typically, these are written in the Gherkin format using specific keywords:8

  1. As a ...
  2. I want ...
  3. So that ...

Below is an example Gherkin Feature for the graph in launch_app():

Feature: Visualization
    As a user
    I want to see the changes in the plot
    So that I can visualize the impact of my customizations

As you can see, the feature uses plain language and the wording is user-centric, so it remains accessible to both developers and users (or other non-technical stakeholders).


A Gherkin Scenario provides a concrete example of how the Feature works and has the following general format:

  1. Given ...
  2. When ...
  3. Then ...

An example Scenario for launch_app() might be:

  Scenario: Viewing the Data Visualization
    Given I have launched the application
    And it contains movie review data from IMDB and Rotten Tomatoes
    And the data contains variables like 'Critics Score' and 'MPAA'
    When I interact with the controls in the sidebar panel
    Then the graph should update with the selected options


Instead of repeating any pre-conditions in each Scenario (i.e., the steps contained in the “Given” and first “And” statement), we can establish the context with a Background:

  Background: Launching the application
    Given I have launched the application
    And it loads with movie review data from IMDB and Rotten Tomatoes
  Scenario: Viewing the Data Visualization
    Given the data contains variables like 'Critics Score' and 'MPAA'
    When I interact with the controls in the sidebar panel
    Then the graph should update with the selected options

Adopting the Gherkin format (or something similar) provides a common language to express an application’s behavior:

  1. As developers, we can work with users and stakeholders to write specifications that describe the expected behavior of each Feature

  2. When developing tests, we can group the tests by their Feature and Scenarios

  3. Each test can execute a step (i.e., the Then statements).

In the next section we’ll cover how to map test code for each Scenario step with testthat.

11.3 BDD and testthat

testthat’s BDD functions (describe() and it()) allow us to add Gherkin-style features and scenarios to our test files, ensuring the application remains user-centric while meeting the technical specifications.9

11.3.1 describe() a feature

We can use the language from our Feature, Background, and Scenario in the description argument of describe():

testthat::describe(
  description = "Feature: Visualization
                   As a user
                   I want to see the changes in the plot
                   So that I can visualize the impact of my customizations",
  code = {
    # nested describe() and it() blocks go here
  }
)

We can also nest describe() calls, which means we can include the Background (or other relevant information):

testthat::describe(
  # BDD Feature (title and description)
  "Feature: Visualization
      As a user
      I want to see the changes in the graph
      So that I can visualize the impact of my customizations.",
  code = {
    testthat::describe(
      # Background (preexisting conditions before each scenario)
      "Background: Launching the application
          Given I have launched the application
          And it loads with movie review data from IMDB and Rotten Tomatoes",
      code = {
        # it() blocks go here
      }
    )
  }
)

11.3.2 Confirm it() with a test

Inside describe(), we can include multiple it() blocks, each of which “functions as a test and is evaluated in its own environment.”

In the example below, we’ll use an it() block to test the example scenario from above:10

testthat::describe(
  # BDD Feature (title and description)
  "Feature: Visualization
      As a user
      I want to see the changes in the graph
      So that I can visualize the impact of my customizations.",
  code = {
    testthat::describe(
      # Background (preexisting conditions before each scenario)
      "Background: Launching the application
          Given I have launched the application
          And it loads with movie review data from IMDB and Rotten Tomatoes",
      code = {
        testthat::it(
          # Scenario (a concrete example that illustrates a feature)
          "Scenario: Viewing the Data Visualization
             Given the data contains variables like 'Critics Score' and 'MPAA'
             When I interact with the controls in the sidebar panel
             Then the graph should update with the selected options",
          code = {
            # test code
          }
        )
      }
    )
  }
)

In the scenario above, Then contains the information required for the testthat expectation. This could be expect_snapshot_file() or vdiffr::expect_doppelganger()–whichever makes sense from the user’s perspective.
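
To make the mapping from Then to an expectation concrete, here is a minimal, self-contained sketch. It assumes a hypothetical graph_settings() function that just records the sidebar selections in a list; it stands in for the app’s real graph so the example can run without a snapshot directory, and the variable names are placeholders.

```r
library(testthat)

# Hypothetical stand-in for the app's graph: it only records the selections
# a user would make in the sidebar (not the application's plotting code).
graph_settings <- function(x, y, color, alpha = 0.5, size = 2) {
  list(x = x, y = y, color = color, alpha = alpha, size = size)
}

describe(
  "Feature: Visualization
      As a user
      I want to see the changes in the graph
      So that I can visualize the impact of my customizations.",
  code = {
    it(
      "Scenario: Viewing the Data Visualization
         Given the data contains variables like 'Critics Score' and 'MPAA'
         When I interact with the controls in the sidebar panel
         Then the graph should update with the selected options",
      code = {
        # When: the user picks a color variable and changes the opacity
        settings <- graph_settings(
          x = "critics_score", y = "audience_score",
          color = "mpaa_rating", alpha = 0.8
        )
        # Then: the settings reflect the selected options
        expect_equal(settings$color, "mpaa_rating")
        expect_equal(settings$alpha, 0.8)
        # Options the user did not touch keep their defaults
        expect_equal(settings$size, 2)
      }
    )
  }
)
```

In a real app-package, the body of it() would more likely compare a rendered graph (for example with the snapshot expectations mentioned above) than inspect a plain list.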

These are generic examples, but hopefully the tests in the upcoming chapters convey how helpful and expressive BDD functions can be (or they inspire you to properly implement what I’m attempting to do in your own app-packages).11

11.4 Traceability Matrix

After translating the user needs into functional requirements, we can identify what needs to be tested by building a look-up table (i.e., a matrix).

I like to store early drafts of the requirements and traceability matrix in a vignette:12


Adding our first vignette to the vignettes/ folder does the following:

  1. Adds the knitr and rmarkdown packages to the Suggests field in DESCRIPTION13
  2. Adds knitr to the VignetteBuilder field14
     VignetteBuilder: knitr
  3. Adds inst/doc to .gitignore and *.html, *.R to vignettes/.gitignore

The first column in the traceability matrix contains the user specifications, which we can ‘trace’ over to the functional requirements and their relevant tests.15

Traceability Matrix

Specifications | Features | Requirements | Test

S1: The application should source movie review data from platforms like IMDB or Rotten Tomatoes

  Feature: Movie Review Dataset Variables
    As a user
    I want to have a dataset with variables from IMDB and Rotten Tomatoes
    In order to provide comprehensive movie reviews

  Given the application has access to IMDB and Rotten Tomatoes APIs
  Scenario: Movie Review Continuous and Categorical Variables
    When the application loads from IMDB and Rotten Tomatoes movie review data
    Then the dataset should include a continuous critic ratings variable
    And the dataset should include a continuous audience ratings variable
    And the dataset should include a categorical mpaa ratings variable
    And the dataset should include a categorical genres variable

Building a traceability matrix ensures:

  1. All user specifications have accompanying application features.

  2. Each feature has been broken down into precise, measurable, and testable functional requirements.

  3. Tests have been written for each functional requirement.
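
The three checks above can also be run against the matrix itself if it is kept as a data frame. The sketch below is one possible layout, not a prescribed format: the column names mirror the table header, while the IDs (S1, F1.1, FR1.1, and so on) and the test file name are hypothetical placeholders.

```r
# A toy traceability matrix with one row per functional requirement.
# All IDs and the test file name are hypothetical placeholders.
trace_matrix <- data.frame(
  specification = c("S1", "S1", "S2"),
  feature = c("F1.1", "F1.1", NA),
  requirement = c("FR1.1", "FR1.2", NA),
  test = c("test-movie-data.R", NA, NA),
  stringsAsFactors = FALSE
)

# 1. Specifications that have no accompanying feature
specs_with_features <- unique(
  trace_matrix$specification[!is.na(trace_matrix$feature)]
)
specs_without_features <- setdiff(
  unique(trace_matrix$specification), specs_with_features
)

# 2. Features that have not been broken down into requirements
features <- unique(trace_matrix$feature[!is.na(trace_matrix$feature)])
features_with_requirements <- unique(
  trace_matrix$feature[!is.na(trace_matrix$requirement)]
)
features_without_requirements <- setdiff(features, features_with_requirements)

# 3. Functional requirements that do not have a test yet
requirements_without_tests <- trace_matrix$requirement[
  !is.na(trace_matrix$requirement) & is.na(trace_matrix$test)
]
```

With this toy data, S2 is flagged as a specification without a feature and FR1.2 as a requirement without a test, which are the gaps the matrix is meant to expose.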




Understanding the relationship between user specifications, features, and functional requirements gives us the information we need to build applications that satisfy the technical standards while addressing user needs. Documenting requirements in Gherkin-style features and scenarios allows us to capture the application’s behavior without giving details on how the functionality is implemented.

In the next chapter, we’re going to cover various tools to improve the tests in your app-package. The overarching goal of these tools is to reduce code executed outside of your tests (i.e., placed above the call to test_that() or it()).

Recap: Test Specifications


  • Scoping tests: user specifications outline software goals and needs, and the functional requirements provide the technical details to achieve them.

    • User specifications: descriptions of what a user expects the application to do (i.e., the user ‘wish list’ of features they want in the application).

    • Features: detailed list of the main capabilities and functions the application needs to offer to users.

    • Functional requirements: testable, specific step-by-step instructions for ensuring the application does what it’s supposed to do.

    • Traceability matrix: tracking tool for connecting the user’s ‘wish list’ (i.e., specifications) to what’s being tested.


  1. Unit tests are covered extensively in R Packages, 2ed and the testthat documentation↩︎

  2. Mastering Shiny dedicates an entire chapter to testing. shinytest2 also has excellent documentation (and videos), and I highly recommend reading through those resources.↩︎

  3. I will cover a few tips and tricks I’ve learned for testing module server functions with testServer() because they’re not in the documentation.↩︎

  4. Read more about what goes in the Software Requirements Specification↩︎

  5. User Specifications are sometimes referred to as “user stories,” “use cases,” or “general requirements”↩︎

  6. ‘Features’ and ‘functional requirements’ are sometimes used interchangeably, but they refer to different aspects of the application. Features are high-level capabilities an application should have, and often contain a collection of smaller functionalities (broken down into the specific functional requirements).↩︎

  7. Read more about behavior-driven development↩︎

  8. Gherkin is the domain-specific language format used for expressing software behaviors. Tools like Cucumber or SpecFlow map and execute the Gherkin descriptions against the code to generate a pass/fail status for each requirement.↩︎

  9. Read more about describe() and it() in the testthat documentation and in the appendix.↩︎

  10. Each it() block contains the expectations (or what you would traditionally include in test_that()).↩︎

  11. For an excellent description of the relationships between behavior-driven development, test-driven development, and domain-driven design, I highly recommend BDD in Action, 2ed by John Ferguson Smart and Jan Molak.↩︎

  12. Storing the traceability matrix in a vignette is great for developers, but using an issue-tracking system with version control is also a good idea, like GitHub Projects or Azure DevOps.↩︎

  13. We briefly covered the Suggests field in Dependencies, but in this case it specifically applies to “packages that are not necessarily needed. This includes packages used only in examples, tests or vignettes…” - Writing R Extensions, Package Dependencies↩︎

  14. The documentation on VignetteBuilder gives a great example of why knitr and rmarkdown belong in Suggests and not Imports.↩︎

  15. When building tables in vignettes, I highly recommend using the Visual Markdown mode.↩︎