This section covers the gt
package for displaying data in tables in your R Markdown reports.
Slides: this is an extra lesson, so there aren’t currently slides available (yet!)
RStudio.Cloud: this is an extra lesson, so there isn’t an RStudio.Cloud project (yet!)
We’ll be covering the gt
and tidyverse
packages for displaying tables in R Markdown.
We will also import some actual stock data using quantmod
and xts
packages. This code was built using the example in Reproducible Finance with R by Jonathan K. Regenstein, Jr.
library(gt)
library(tidyverse)
library(glue)
library(readxl)
library(highcharter)
library(tidyquant)
library(timetk)
library(tibbletime)
library(quantmod)
library(PerformanceAnalytics)
library(scales)
quantmod
The code below collects stock data for five biotech companies:
Each of these companies are exchange-traded funds, or ETFs, and we can download their data using their unique ‘ticker’ symbols.
“ETFs are bought and sold throughout the day on stock exchanges.”
# put stock symbols in a vector
biotech_symbols <- c("BMRN", "NVS", "GSK", "AZN", "LLY")
# collect yahoo sales data
biotech_prices <- quantmod::getSymbols(biotech_symbols,
# returns an object with the opening price, closing price,
# adjusted price, daily high, daily low and daily volume
# for each ticker.
src = 'yahoo',
from = "2019-12-31", to = "2020-12-31",
auto.assign = TRUE,
warnings = FALSE) %>%
# Ad() and get() is used to isolate the adjusted price
purrr::map(~Ad(get(.))) %>%
# merge() looks for the date index shared by our objects and
# uses that index to align the data.
purrr::reduce(merge) %>%
# use colnames<- to rename the columns according to the
# symbols object.
`colnames<-`(biotech_symbols)
# check structure of biotech_prices
str(biotech_prices)
## An 'xts' object on 2019-12-31/2020-12-29 containing:
## Data: num [1:252, 1:5] 84.6 84 84 84.6 84.2 ...
## - attr(*, "dimnames")=List of 2
## ..$ : NULL
## ..$ : chr [1:5] "BMRN" "NVS" "GSK" "AZN" ...
## Indexed by objects of class: [Date] TZ: UTC
## xts Attributes:
## List of 2
## $ src : chr "yahoo"
## $ updated: POSIXct[1:1], format: "2020-12-30 00:51:27"
We can see this creates an xts
or, extensible time-series object. A noteworthy difference between an xts
and tibble
dataset is that all xts
objects have a date index
.
The code below shows the attributes and structure of the first six items in biotech_prices
:
unclass(head(biotech_prices))
## BMRN NVS GSK AZN LLY
## [1,] 84.55 91.37416 44.72367 48.47673 128.8463
## [2,] 84.04 91.62505 44.65704 48.99202 129.6110
## [3,] 84.00 91.47066 44.23826 48.70035 129.1796
## [4,] 84.59 92.08825 44.25729 48.49617 129.6600
## [5,] 84.21 91.17152 43.98128 48.68090 129.9051
## [6,] 84.71 91.17152 44.17164 48.56423 131.0815
## attr(,"src")
## [1] "yahoo"
## attr(,"updated")
## [1] "2020-12-30 00:51:27 MST"
## attr(,"index")
## [1] 1577750400 1577923200 1578009600 1578268800 1578355200 1578441600
## attr(,"index")attr(,"tzone")
## [1] "UTC"
## attr(,"index")attr(,"tclass")
## [1] "Date"
Read more about these objects here.
xts
prices data to wideWe want to take our xts
data (biotech_prices
) and convert it to a wide dataset of stock prices.
xts
to data.frame
We can do this by combining the xts::to.monthly()
function with base::data.frame()
. We’ll use index(.)
to shift the dates to a date
column.
biotech_prices %>%
xts::to.monthly(indexAt = "lastof",
OHLC = FALSE) %>%
# convert the index to a date
base::data.frame(date = index(.))
remove_rownames
Now that we have the dates stored in date
, we want to drop the rownames with remove_rownames()
and rearrange the data with select()
.
biotech_prices %>%
xts::to.monthly(indexAt = "lastof",
OHLC = FALSE) %>%
# convert the index to a date
base::data.frame(date = index(.)) %>%
# now remove the index because it got converted to row names
tibble::remove_rownames() %>%
dplyr::select(date,
dplyr::everything())
PricesWide
Assign this to PricesWide
biotech_prices %>%
xts::to.monthly(indexAt = "lastof",
OHLC = FALSE) %>%
# convert the index to a date
base::data.frame(date = index(.)) %>%
# now remove the index because it got converted to row names
tibble::remove_rownames() %>%
dplyr::select(date,
dplyr::everything()) -> PricesWide
PricesWide
With the PricesWide
dataset, we can create a wide dataset of log returns. This involves log()
transforming the prices, and calculating the month-to-month difference.
We’ll want PricesWide
in a long format to calculate the return
variable.
pivot_longer
First we need to pivot the data from wide to long with tidyr::pivot_longer()
.
PricesWide %>%
# shift to long
tidyr::pivot_longer(names_to = "asset",
values_to = "prices",
cols = -date)
Now we can see the stock symbols have been moved into asset
and all the prices have been stacked in prices
.
asset
(Symbols)To calculate the return
s, we group by the stock symbols in asset
, then use mutate()
to calculate the difference between log(prices)
and log(lag(prices))
We can use log()
and lag()
inside mutate()
to calculate returns
.
PricesWide %>%
# shift to long
tidyr::pivot_longer(names_to = "asset",
values_to = "prices",
cols = -date) %>%
# group by asset
dplyr::group_by(asset) %>%
# calculate log returns and add those returns to the data frame
dplyr::mutate(returns = (log(prices) - log(lag(prices))))
Notice all the data are missing from 2019-12-31
? Well, this makes sense if you know how dplyr::lag()
works. The next section walks through an example.
dplyr::lag()
The dplyr::lag()
function is great for calculating changes across time-series. We want to use it to calculate return
s, which is the difference in price from month-to-month.
bmrn_prices
To demonstrate how dplyr::lag()
works, we’ll look at a small subset of BMRN
prices (in bmrn_prices
).
bmrn_prices <- c(83.50000, 90.37000, 84.50000,
92.02000, 106.55000, 123.34000)
bmrn_prices
## [1] 83.50 90.37 84.50 92.02 106.55 123.34
First we will use dplyr::lag()
on the bmrn_prices
bmrn_prices - dplyr::lag(bmrn_prices)
## [1] NA 6.87 -5.87 7.52 14.53 16.79
Once again, we see only five values are returned (even though we started out with 6 elements in bmrn_prices
).
lag()
internalsWe see missing data because dplyr::lag()
works using the following combination of elements in the bmrn_prices
vector:
# difference between 2 and 1
bmrn_prices[2] - bmrn_prices[1]
## [1] 6.87
# difference between 3 and 2
bmrn_prices[3] - bmrn_prices[2]
## [1] -5.87
# difference between 4 and 3
bmrn_prices[4] - bmrn_prices[3]
## [1] 7.52
# difference between 5 and 4
bmrn_prices[5] - bmrn_prices[4]
## [1] 14.53
# difference between 6 and 5
bmrn_prices[6] - bmrn_prices[5]
## [1] 16.79
log()
and lag()
And when we combine log()
with mutate()
function, the lag()
function produces the following:
log(bmrn_prices) - log(dplyr::lag(bmrn_prices))
## [1] NA 0.07906572 -0.06716082 0.08525441 0.14660841 0.14633041
Notice the order of the log()
vs. dplyr::lag()
# difference between log 2 and log 1
log(bmrn_prices[2]) - log(bmrn_prices[1])
## [1] 0.07906572
# log difference between log 3 and log 2
log(bmrn_prices[3]) - log(bmrn_prices[2])
## [1] -0.06716082
# difference between log 4 and log 3
log(bmrn_prices[4]) - log(bmrn_prices[3])
## [1] 0.08525441
# difference between log 5 and log 4
log(bmrn_prices[5]) - log(bmrn_prices[4])
## [1] 0.1466084
# difference between log 6 and log 5
log(bmrn_prices[6]) - log(bmrn_prices[5])
## [1] 0.1463304
mutate()
with lag()
Now we know why we see the following missing values in returns
on 2019-12-31
. These represent the fact we can’t calculate a difference without two values in prices.
PricesWide %>%
# shift to long
tidyr::pivot_longer(names_to = "asset",
values_to = "prices",
cols = -date) %>%
# group by asset
dplyr::group_by(asset) %>%
# calculate log returns and add those returns to the data frame
dplyr::mutate(returns = (log(prices) - log(lag(prices))))
Now that we have converted the original xts
data to a long (tidy) dataset with the returns
and prices
indexed by date and stock symbol (stored in asset
). However, we want this converted back to a wide dataset.
prices
Before we convert the long data back to a wide dataset, we need to remove the prices
column because this will be a dataset of returns (we can join these together later).
PricesWide %>%
# shift to long
tidyr::pivot_longer(names_to = "asset",
values_to = "prices",
cols = -date) %>%
# group by asset
dplyr::group_by(asset) %>%
# calculate log returns and add those returns to the data frame
dplyr::mutate(returns = (log(prices) - log(lag(prices)))) %>%
# remove prices
dplyr::select(-prices)
pivot_wider
Now we can restructure (or pivot_wider()
) the data and place the stock symbols in each column.
PricesWide %>%
# shift to long
tidyr::pivot_longer(names_to = "asset",
values_to = "prices",
cols = -date) %>%
# group by asset
dplyr::group_by(asset) %>%
# calculate log returns and add those returns to the data frame
dplyr::mutate(returns = (log(prices) - log(lag(prices)))) %>%
dplyr::select(-prices) %>%
# restructure to wide
tidyr::pivot_wider(names_from = asset,
values_from = returns)
ReturnsWide
The output looks correct, so we will assign the data to ReturnsWide
.
PricesWide %>%
tidyr::pivot_longer(names_to = "asset",
values_to = "prices",
cols = -date) %>%
dplyr::group_by(asset) %>%
dplyr::mutate(returns = (log(prices) - log(lag(prices)))) %>%
dplyr::select(-prices) %>%
tidyr::pivot_wider(names_from = asset,
values_from = returns) -> ReturnsWide
ReturnsWide
Now that we have two wide datasets for (PricesWide
and ReturnsWide
), we will convert them to long and join them into a single dataset with both prices
and returns
.
We’ve already covered the tidyr::pivot_longer()
function, but just remember this function requires two columns: one for names_to
(asset
) and one for values_to
(adj_prices
)
PricesTidy
Below we pivot PricesWide
into a long format dataset with date
, asset
, and adj_prices
–we will call this PricesTidy
.
PricesWide %>%
# shift to long
tidyr::pivot_longer(names_to = "asset",
values_to = "adj_prices",
cols = -date) -> PricesTidy
PricesTidy
ReturnsTidy
We repeat the process from above, but this time we assign the names_to
as asset
and the values_to
as log_returns
.
ReturnsWide %>%
# shift to long
tidyr::pivot_longer(names_to = "asset",
values_to = "log_returns",
cols = -date) -> ReturnsTidy
ReturnsTidy
Now we can join the two Tidy
datasets together on date
and asset
and name the full dataset BioTechTidy
.
inner_join(x = PricesTidy,
y = ReturnsTidy,
by = c("date", "asset")) %>%
filter(!is.na(log_returns)) -> BioTechTidy
BioTechTidy
gt
packageThe gt
package comes from RStudio and it’s philosophy is straightforward: tables can be assembled from their various parts,
the table header, the stub, the column labels and spanner column labels, the table body, and the table footer.
We will start by using the gt
package with our biotech stock data.
We’ll only be displaying data for 2020
, so we will filter out the row for 2019-12-31
.
# remove missing
PricesWide <- PricesWide %>% filter(date != "2019-12-31")
PricesWide
We will also create two date vectors (begin_date
and end_date
) to use in the table title.
begin_date <- min(PricesWide$date)
begin_date
## [1] "2020-01-31"
end_date <- max(PricesWide$date)
end_date
## [1] "2020-12-31"
gt
table headerWe will pass the PricesWide
data to gt()
and tag_header()
functions to create a basic gt
table. We use the glue::glue()
function to create the subtitle
.
See the code below to understand how glue::glue()
works.
# recall we've created begin_date and end_date
glue::glue("From {begin_date} to {end_date}")
## From 2020-01-31 to 2020-12-31
Now we create a gt
table with a header defined by tab_header()
PricesWide %>%
# create gt table
gt() %>%
# create header
tab_header(
title = "Adjusted Prices for BioTech Stocks",
subtitle = glue::glue("From {begin_date} to {end_date}"))
Adjusted Prices for BioTech Stocks | |||||
---|---|---|---|---|---|
From 2020-01-31 to 2020-12-31 | |||||
date | BMRN | NVS | GSK | AZN | LLY |
2020-01-31 | 83.50 | 91.20046 | 44.54283 | 47.34891 | 136.8949 |
2020-02-29 | 90.37 | 81.01990 | 39.11396 | 43.45262 | 124.2914 |
2020-03-31 | 84.50 | 82.45000 | 36.56619 | 44.30580 | 136.6979 |
2020-04-30 | 92.02 | 84.73000 | 40.60016 | 51.86537 | 152.3858 |
2020-05-31 | 106.55 | 87.43000 | 40.90715 | 54.16697 | 151.4300 |
2020-06-30 | 123.34 | 87.34000 | 39.80446 | 52.47053 | 162.5484 |
2020-07-31 | 119.81 | 82.14000 | 39.34581 | 55.33761 | 148.7964 |
2020-08-31 | 78.03 | 86.06000 | 39.09855 | 56.00000 | 147.6314 |
2020-09-30 | 76.08 | 86.96000 | 37.16337 | 54.80000 | 147.2633 |
2020-10-31 | 74.43 | 78.08000 | 32.99681 | 50.16000 | 129.7931 |
2020-11-30 | 78.70 | 90.83000 | 36.79000 | 52.94000 | 145.6500 |
2020-12-31 | 86.18 | 93.35000 | 36.98000 | 49.90000 | 166.5800 |
gt::tab_header()
controls the table header (which was one of the table parts we introduced into the philosophy).
gt
The formatting functions in gt
start with a fmt_
prefix. Some of the options include fmt_number
, fmt_date
, and fmt_currency
.
We will use gt::fmt_number()
to reduce the decimal
places down to 2
in the stock symbol columns. We give the columns we want formatted to the columns = vars()
argument.
PricesWide %>%
# create gt table
gt() %>%
# create header
tab_header(
title = "Adjusted Prices for BioTech Stocks",
subtitle = glue::glue("From {begin_date} to {end_date}")) %>%
# format number
fmt_number(
columns = vars(BMRN, NVS, GSK, AZN, LLY),
decimals = 2
)
Adjusted Prices for BioTech Stocks | |||||
---|---|---|---|---|---|
From 2020-01-31 to 2020-12-31 | |||||
date | BMRN | NVS | GSK | AZN | LLY |
2020-01-31 | 83.50 | 91.20 | 44.54 | 47.35 | 136.89 |
2020-02-29 | 90.37 | 81.02 | 39.11 | 43.45 | 124.29 |
2020-03-31 | 84.50 | 82.45 | 36.57 | 44.31 | 136.70 |
2020-04-30 | 92.02 | 84.73 | 40.60 | 51.87 | 152.39 |
2020-05-31 | 106.55 | 87.43 | 40.91 | 54.17 | 151.43 |
2020-06-30 | 123.34 | 87.34 | 39.80 | 52.47 | 162.55 |
2020-07-31 | 119.81 | 82.14 | 39.35 | 55.34 | 148.80 |
2020-08-31 | 78.03 | 86.06 | 39.10 | 56.00 | 147.63 |
2020-09-30 | 76.08 | 86.96 | 37.16 | 54.80 | 147.26 |
2020-10-31 | 74.43 | 78.08 | 33.00 | 50.16 | 129.79 |
2020-11-30 | 78.70 | 90.83 | 36.79 | 52.94 | 145.65 |
2020-12-31 | 86.18 | 93.35 | 36.98 | 49.90 | 166.58 |
We can change the dates in date
using the gt::fmt_date()
. The columns
argument needs vars(date)
, and
PricesWide %>%
# create gt table
gt() %>%
tab_header(
title = "Adjusted Prices for BioTech Stocks",
subtitle = glue::glue("From {begin_date} to {end_date}")) %>%
# format numbers
fmt_number(
columns = vars(BMRN, NVS, GSK, AZN, LLY),
decimals = 2
) %>%
# format date
fmt_date(
columns = vars(date),
date_style = 5
)
Adjusted Prices for BioTech Stocks | |||||
---|---|---|---|---|---|
From 2020-01-31 to 2020-12-31 | |||||
date | BMRN | NVS | GSK | AZN | LLY |
January 31, 2020 | 83.50 | 91.20 | 44.54 | 47.35 | 136.89 |
February 29, 2020 | 90.37 | 81.02 | 39.11 | 43.45 | 124.29 |
March 31, 2020 | 84.50 | 82.45 | 36.57 | 44.31 | 136.70 |
April 30, 2020 | 92.02 | 84.73 | 40.60 | 51.87 | 152.39 |
May 31, 2020 | 106.55 | 87.43 | 40.91 | 54.17 | 151.43 |
June 30, 2020 | 123.34 | 87.34 | 39.80 | 52.47 | 162.55 |
July 31, 2020 | 119.81 | 82.14 | 39.35 | 55.34 | 148.80 |
August 31, 2020 | 78.03 | 86.06 | 39.10 | 56.00 | 147.63 |
September 30, 2020 | 76.08 | 86.96 | 37.16 | 54.80 | 147.26 |
October 31, 2020 | 74.43 | 78.08 | 33.00 | 50.16 | 129.79 |
November 30, 2020 | 78.70 | 90.83 | 36.79 | 52.94 | 145.65 |
December 31, 2020 | 86.18 | 93.35 | 36.98 | 49.90 | 166.58 |
We also want to makes sure the numbers are formatted as USD
with gt::fmt_currency()
PricesWide %>%
# create gt table
gt() %>%
tab_header(
title = "Adjusted Prices for BioTech Stocks",
subtitle = glue::glue("From {begin_date} to {end_date}")) %>%
# format numbers
fmt_number(
columns = vars(BMRN, NVS, GSK, AZN, LLY),
decimals = 2
) %>%
# format date
fmt_date(
columns = vars(date),
date_style = 5
) %>%
# format currency
fmt_currency(
columns = vars(BMRN, NVS, GSK, AZN, LLY),
currency = "USD")
Adjusted Prices for BioTech Stocks | |||||
---|---|---|---|---|---|
From 2020-01-31 to 2020-12-31 | |||||
date | BMRN | NVS | GSK | AZN | LLY |
January 31, 2020 | $83.50 | $91.20 | $44.54 | $47.35 | $136.89 |
February 29, 2020 | $90.37 | $81.02 | $39.11 | $43.45 | $124.29 |
March 31, 2020 | $84.50 | $82.45 | $36.57 | $44.31 | $136.70 |
April 30, 2020 | $92.02 | $84.73 | $40.60 | $51.87 | $152.39 |
May 31, 2020 | $106.55 | $87.43 | $40.91 | $54.17 | $151.43 |
June 30, 2020 | $123.34 | $87.34 | $39.80 | $52.47 | $162.55 |
July 31, 2020 | $119.81 | $82.14 | $39.35 | $55.34 | $148.80 |
August 31, 2020 | $78.03 | $86.06 | $39.10 | $56.00 | $147.63 |
September 30, 2020 | $76.08 | $86.96 | $37.16 | $54.80 | $147.26 |
October 31, 2020 | $74.43 | $78.08 | $33.00 | $50.16 | $129.79 |
November 30, 2020 | $78.70 | $90.83 | $36.79 | $52.94 | $145.65 |
December 31, 2020 | $86.18 | $93.35 | $36.98 | $49.90 | $166.58 |
We have two different categories of variables in the PricesWide
dataset, dates (i.e. months) and stock symbols. Adding a label above each group of columns would make the table easier to read. In the gt
package philosophy, these are called spanner columns.
In Excel, adding a label for a group of columns usually involves inserting a row above the date
and stock symbol columns,
The next step is merging the columns above each respective column type…
Finally, we add a new title in the merged rows (above our respective columns).
gt
)In gt
, we can add a spanner column with tab_spanner()
. We give this function a label
(like the example above) and the columns
we want to include under the label.
PricesWide %>%
# create gt table
gt() %>%
# create header
tab_header(
title = "Adjusted Prices for BioTech Stocks",
subtitle = glue::glue("From {begin_date} to {end_date}")) %>%
# format numbers
fmt_number(
columns = vars(BMRN, NVS, GSK, AZN, LLY),
decimals = 2
) %>%
# format date
fmt_date(
columns = vars(date),
date_style = 5
) %>%
# format currency
fmt_currency(
columns = vars(BMRN, NVS, GSK, AZN, LLY),
currency = "USD") %>%
# add spanner column label
tab_spanner(
label = "BioTech Stocks",
columns = vars(BMRN, NVS, GSK, AZN, LLY)
)
Adjusted Prices for BioTech Stocks | |||||
---|---|---|---|---|---|
From 2020-01-31 to 2020-12-31 | |||||
date | BioTech Stocks | ||||
BMRN | NVS | GSK | AZN | LLY | |
January 31, 2020 | $83.50 | $91.20 | $44.54 | $47.35 | $136.89 |
February 29, 2020 | $90.37 | $81.02 | $39.11 | $43.45 | $124.29 |
March 31, 2020 | $84.50 | $82.45 | $36.57 | $44.31 | $136.70 |
April 30, 2020 | $92.02 | $84.73 | $40.60 | $51.87 | $152.39 |
May 31, 2020 | $106.55 | $87.43 | $40.91 | $54.17 | $151.43 |
June 30, 2020 | $123.34 | $87.34 | $39.80 | $52.47 | $162.55 |
July 31, 2020 | $119.81 | $82.14 | $39.35 | $55.34 | $148.80 |
August 31, 2020 | $78.03 | $86.06 | $39.10 | $56.00 | $147.63 |
September 30, 2020 | $76.08 | $86.96 | $37.16 | $54.80 | $147.26 |
October 31, 2020 | $74.43 | $78.08 | $33.00 | $50.16 | $129.79 |
November 30, 2020 | $78.70 | $90.83 | $36.79 | $52.94 | $145.65 |
December 31, 2020 | $86.18 | $93.35 | $36.98 | $49.90 | $166.58 |
Inserting meta-data (more labels, footnotes, etc.) in Excel involves using the GUI to add rows or columns (following the steps outlined above). We’re going to cover how to include meta-data and row group labels in gt
.
gt
)It’s good practice to include the source of the data in a table display. In gt
we can simply add a tab_source_note()
function and include a text string for the source_note =
.
PricesWide %>%
# create gt table
gt() %>%
# create header
tab_header(
title = "Adjusted Prices for BioTech Stocks",
subtitle = glue::glue("From {begin_date} to {end_date}")) %>%
# format numbers
fmt_number(
columns = vars(BMRN, NVS, GSK, AZN, LLY),
decimals = 2
) %>%
# format date
fmt_date(
columns = vars(date),
date_style = 5
) %>%
# format currency
fmt_currency(
columns = vars(BMRN, NVS, GSK, AZN, LLY),
currency = "USD") %>%
# add spanner column label
tab_spanner(
label = "BioTech Stocks",
columns = vars(BMRN, NVS, GSK, AZN, LLY)
) %>%
# add data source
tab_source_note(
source_note = "Source: Yahoo finance data and xts package.")
Adjusted Prices for BioTech Stocks | |||||
---|---|---|---|---|---|
From 2020-01-31 to 2020-12-31 | |||||
date | BioTech Stocks | ||||
BMRN | NVS | GSK | AZN | LLY | |
January 31, 2020 | $83.50 | $91.20 | $44.54 | $47.35 | $136.89 |
February 29, 2020 | $90.37 | $81.02 | $39.11 | $43.45 | $124.29 |
March 31, 2020 | $84.50 | $82.45 | $36.57 | $44.31 | $136.70 |
April 30, 2020 | $92.02 | $84.73 | $40.60 | $51.87 | $152.39 |
May 31, 2020 | $106.55 | $87.43 | $40.91 | $54.17 | $151.43 |
June 30, 2020 | $123.34 | $87.34 | $39.80 | $52.47 | $162.55 |
July 31, 2020 | $119.81 | $82.14 | $39.35 | $55.34 | $148.80 |
August 31, 2020 | $78.03 | $86.06 | $39.10 | $56.00 | $147.63 |
September 30, 2020 | $76.08 | $86.96 | $37.16 | $54.80 | $147.26 |
October 31, 2020 | $74.43 | $78.08 | $33.00 | $50.16 | $129.79 |
November 30, 2020 | $78.70 | $90.83 | $36.79 | $52.94 | $145.65 |
December 31, 2020 | $86.18 | $93.35 | $36.98 | $49.90 | $166.58 |
Source: Yahoo finance data and xts package. |
gt
)Another way to add a label for particular columns is to use the tab_stubhead()
or tab_row_group()
functions.
tab_stubhead()
takes a label
argument and gets the name of the column we want to insert the label above. tab_row_group()
takes a group
argument, which is the name of the label for the groups of rows, and the rows
argument which can be either 1) a vector of rows inside c()
, 2) a tidyselect
helper function, or 3) a set of row indexes (we use this one).
We’re going to demonstrate the tab_stubhead()
and tab_row_group()
functions, we’re going to use the BioTechTidy
dataset we created earlier.
BioTechTidy %>%
# cleaner column names
select(
Date = date,
`Stock Symbol` = asset,
`Adjusted Prices` = adj_prices,
`log(Returns)` = log_returns) %>%
# create gt table
gt() %>%
# create table header
tab_header(
title = "BioTech Stock Prices and Returns",
subtitle = "Ajusted monthly prices and log-adjusted returns") %>%
# add stubhead for Date
tab_stubhead(label = "Date") %>%
# format numbers
fmt_number(
columns = vars(`Adjusted Prices`, `log(Returns)`),
decimals = 2
) %>%
# format date
fmt_date(
columns = vars(Date),
date_style = 5
) %>%
# format currency
fmt_currency(
columns = vars(`Adjusted Prices`, `log(Returns)`),
currency = "USD") %>%
# row group for quarter 4
tab_row_group(
group = "Quarter 4 (OCT - DEC)",
rows = 46:60) %>%
# row group for quarter 3
tab_row_group(
group = "Quarter 3 (JUL - SEP)",
rows = 31:45) %>%
# row group for quarter 2
tab_row_group(
group = "Quarter 2 (APR - JUN)",
rows = 16:30) %>%
# row group for quarter 1
tab_row_group(
group = "Quarter 1 (JAN - MAR)",
rows = 1:15) %>%
# add source of data
tab_source_note(
source_note = "Source: Yahoo finance data and xts package.")
BioTech Stock Prices and Returns | |||
---|---|---|---|
Ajusted monthly prices and log-adjusted returns | |||
Date | Stock Symbol | Adjusted Prices | log(Returns) |
Quarter 1 (JAN - MAR) | |||
January 31, 2020 | BMRN | $83.50 | −$0.01 |
January 31, 2020 | NVS | $91.20 | −$0.00 |
January 31, 2020 | GSK | $44.54 | −$0.00 |
January 31, 2020 | AZN | $47.35 | −$0.02 |
January 31, 2020 | LLY | $136.89 | $0.06 |
February 29, 2020 | BMRN | $90.37 | $0.08 |
February 29, 2020 | NVS | $81.02 | −$0.12 |
February 29, 2020 | GSK | $39.11 | −$0.13 |
February 29, 2020 | AZN | $43.45 | −$0.09 |
February 29, 2020 | LLY | $124.29 | −$0.10 |
March 31, 2020 | BMRN | $84.50 | −$0.07 |
March 31, 2020 | NVS | $82.45 | $0.02 |
March 31, 2020 | GSK | $36.57 | −$0.07 |
March 31, 2020 | AZN | $44.31 | $0.02 |
March 31, 2020 | LLY | $136.70 | $0.10 |
Quarter 2 (APR - JUN) | |||
April 30, 2020 | BMRN | $92.02 | $0.09 |
April 30, 2020 | NVS | $84.73 | $0.03 |
April 30, 2020 | GSK | $40.60 | $0.10 |
April 30, 2020 | AZN | $51.87 | $0.16 |
April 30, 2020 | LLY | $152.39 | $0.11 |
May 31, 2020 | BMRN | $106.55 | $0.15 |
May 31, 2020 | NVS | $87.43 | $0.03 |
May 31, 2020 | GSK | $40.91 | $0.01 |
May 31, 2020 | AZN | $54.17 | $0.04 |
May 31, 2020 | LLY | $151.43 | −$0.01 |
June 30, 2020 | BMRN | $123.34 | $0.15 |
June 30, 2020 | NVS | $87.34 | −$0.00 |
June 30, 2020 | GSK | $39.80 | −$0.03 |
June 30, 2020 | AZN | $52.47 | −$0.03 |
June 30, 2020 | LLY | $162.55 | $0.07 |
Quarter 3 (JUL - SEP) | |||
July 31, 2020 | BMRN | $119.81 | −$0.03 |
July 31, 2020 | NVS | $82.14 | −$0.06 |
July 31, 2020 | GSK | $39.35 | −$0.01 |
July 31, 2020 | AZN | $55.34 | $0.05 |
July 31, 2020 | LLY | $148.80 | −$0.09 |
August 31, 2020 | BMRN | $78.03 | −$0.43 |
August 31, 2020 | NVS | $86.06 | $0.05 |
August 31, 2020 | GSK | $39.10 | −$0.01 |
August 31, 2020 | AZN | $56.00 | $0.01 |
August 31, 2020 | LLY | $147.63 | −$0.01 |
September 30, 2020 | BMRN | $76.08 | −$0.03 |
September 30, 2020 | NVS | $86.96 | $0.01 |
September 30, 2020 | GSK | $37.16 | −$0.05 |
September 30, 2020 | AZN | $54.80 | −$0.02 |
September 30, 2020 | LLY | $147.26 | −$0.00 |
Quarter 4 (OCT - DEC) | |||
October 31, 2020 | BMRN | $74.43 | −$0.02 |
October 31, 2020 | NVS | $78.08 | −$0.11 |
October 31, 2020 | GSK | $33.00 | −$0.12 |
October 31, 2020 | AZN | $50.16 | −$0.09 |
October 31, 2020 | LLY | $129.79 | −$0.13 |
November 30, 2020 | BMRN | $78.70 | $0.06 |
November 30, 2020 | NVS | $90.83 | $0.15 |
November 30, 2020 | GSK | $36.79 | $0.11 |
November 30, 2020 | AZN | $52.94 | $0.05 |
November 30, 2020 | LLY | $145.65 | $0.12 |
December 31, 2020 | BMRN | $86.18 | $0.09 |
December 31, 2020 | NVS | $93.35 | $0.03 |
December 31, 2020 | GSK | $36.98 | $0.01 |
December 31, 2020 | AZN | $49.90 | −$0.06 |
December 31, 2020 | LLY | $166.58 | $0.13 |
Source: Yahoo finance data and xts package. |