layout: true

<!-- this adds the link footer to all slides, depends on footer-small class in css-->

<div class="footer-small"><span>https://github.com/mjfrigaard/ph-lacounty-r/</span></div>

---
name: title-slide
class: title-slide, center, middle, inverse

# R Markdown Tables
#.fancy[Creating Tables in R Markdown]

<br>

.large[by Martin Frigaard]

Written: October 03 2022

Updated: December 02 2022

.footer-large[.right[.fira[
<br><br><br><br><br>[Created using the "λέξις" theme](https://jhelvy.github.io/lexis/index.html#what-does-%CE%BB%CE%AD%CE%BE%CE%B9%CF%82-mean)
]]]

---
background-image: url(www/pdg-hex.png)
background-position: 96% 4%
background-size: 6%

# Materials

The slides are in the `slides.pdf` file

--

The materials for this training are in the `worksheets` folder:

```
worksheets
├── import.Rmd
├── export.Rmd
├── objects.Rmd
├── rmd-basic.Rmd
├── rmd-tables.Rmd
└── rmd-visualizations.Rmd
```

---
background-image: url(www/pdg-hex.png)
background-position: 96% 4%
background-size: 6%

# Outline

<br>

.leftcol[

#### 1. Importing data

#### 2. Common Data Objects

#### 3. R Markdown

]

--

.rightcol[

#### 4. R Markdown Data Visualizations

#### 5. .red[R Markdown Tables]

#### 6. Exporting Data

]

---
background-image: url(www/pdg-hex.png)
class: center, middle, inverse
background-position: 96% 4%
background-size: 6%

# .large[R Markdown Tables]

--

<br><br>

.font90[.green[Open `rmd-tables.Rmd` to follow along]]

---
background-image: url(www/pdg-hex.png)
background-position: 96% 4%
background-size: 6%

# R Markdown Tables

Import the `nhanes_11_12` data from the `data/csv/` folder.
.code70[

```r
nhanes_11_12 <- data.table::fread(input = "../data/csv/nhanes_11_12.csv")
glimpse(nhanes_11_12)
```

]

.code60[

```
Rows: 2,121
Columns: 16
$ id               <int> 62164, 62176, 62184, 62199, 62200, 62202, 62205, 6220…
$ gender           <chr> "female", "female", "male", "male", "male", "male", "…
$ age              <int> 44, 34, 26, 57, 42, 36, 28, 38, 77, 31, 29, 50, 31, 2…
$ race3            <chr> "White", "White", "Black", "White", "Asian", "Mexican…
$ education        <chr> "Some College", "College Grad", "High School", "Colle…
$ hh_income        <chr> "45000-54999", "more 99999", "more 99999", "more 9999…
$ poverty          <dbl> 1.67, 5.00, 3.85, 5.00, 4.07, 2.83, 5.00, 1.53, 0.97,…
$ weight           <dbl> 67.2, 68.7, 68.9, 96.9, 73.7, 80.2, 84.8, 63.2, 69.8,…
$ height           <dbl> 170.1, 171.6, 176.6, 186.0, 163.4, 180.1, 171.4, 168.…
$ bp_sys_ave       <int> 119, 107, 121, 110, 127, 136, 122, 105, 131, 120, 99,…
$ bp_dia_ave       <int> 62, 69, 68, 65, 88, 48, 87, 59, 56, 71, 61, 74, 84, 7…
$ testosterone     <dbl> 44.35, 21.11, 746.95, 269.24, 256.92, 272.46, 466.11,…
$ tot_chol         <dbl> 4.91, 4.42, 4.81, 4.42, 5.77, 7.29, 5.46, 5.25, 4.47,…
$ diabetes         <chr> "No", "No", "No", "No", "No", "No", "No", "No", "Yes"…
$ sleep_hrs_night  <int> 8, 7, 8, 8, 6, 8, 6, 5, 8, 6, 8, 7, 6, 5, 8, 8, 7, 8,…
$ phys_active_days <int> 5, 7, 3, 2, 4, 2, 1, 1, 7, 4, 2, 5, 4, 7, 4, 3, 4, 6,…
```

]

---
background-image: url(www/pdg-hex.png)
background-position: 96% 4%
background-size: 6%

# R Markdown Tables

Check out the codebook in the `rmd-tables.Rmd` file to get a better understanding of the NHANES variables.
--

### Calculate Descriptives

Below we calculate the descriptive statistics with some help from [`dplyr`](https://dplyr.tidyverse.org/)

--

```r
descriptives <- nhanes_11_12 |>
  mutate(ht_meters = height*0.01,
         bmi = weight/ht_meters^2) |>
  group_by(diabetes) |>
  summarise(n = n(),
            across(c(age, bmi, bp_sys_ave, tot_chol), mean, na.rm = TRUE))
```

---
background-image: url(www/pdg-hex.png)
background-position: 96% 4%
background-size: 6%

# R Markdown Tables

.leftcol[

If we print `descriptives` to the console, we see a `tibble` output

<br>

.code70[

```r
descriptives
```

]

.code55[

```
# A tibble: 2 × 6
  diabetes     n   age   bmi bp_sys_ave tot_chol
  <chr>    <int> <dbl> <dbl>      <dbl>    <dbl>
1 No        1890  43.1  27.7       120.     5.00
2 Yes        231  57.8  31.5       128.     4.83
```

]

]

--

.rightcol[

We can use `knitr::kable()` to get a basic formatted table.

<br>

.font70[

```r
knitr::kable(descriptives)
```

<table>
 <thead>
  <tr>
   <th style="text-align:left;"> diabetes </th>
   <th style="text-align:right;"> n </th>
   <th style="text-align:right;"> age </th>
   <th style="text-align:right;"> bmi </th>
   <th style="text-align:right;"> bp_sys_ave </th>
   <th style="text-align:right;"> tot_chol </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> No </td>
   <td style="text-align:right;"> 1890 </td>
   <td style="text-align:right;"> 43.12857 </td>
   <td style="text-align:right;"> 27.68338 </td>
   <td style="text-align:right;"> 119.7624 </td>
   <td style="text-align:right;"> 5.001571 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Yes </td>
   <td style="text-align:right;"> 231 </td>
   <td style="text-align:right;"> 57.83117 </td>
   <td style="text-align:right;"> 31.47395 </td>
   <td style="text-align:right;"> 127.7749 </td>
   <td style="text-align:right;"> 4.829480 </td>
  </tr>
</tbody>
</table>

]

]

---
background-image: url(www/pdg-hex.png)
background-position: 96% 4%
background-size: 6%

# R Markdown Tables

`knitr::kable()` allows us to adjust the contents of the table with arguments like `digits` and `col.names`

.code80[

```r
knitr::kable(descriptives, digits = 2,
  col.names = c("Diabetic", "N", "Age", "BMI", "Sys BP (Avg)", "Total Chol"))
```

<table>
 <thead>
  <tr>
   <th style="text-align:left;"> Diabetic </th>
   <th style="text-align:right;"> N </th>
   <th style="text-align:right;"> Age </th>
   <th style="text-align:right;"> BMI </th>
   <th style="text-align:right;"> Sys BP (Avg) </th>
   <th style="text-align:right;"> Total Chol </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> No </td>
   <td style="text-align:right;"> 1890 </td>
   <td style="text-align:right;"> 43.13 </td>
   <td style="text-align:right;"> 27.68 </td>
   <td style="text-align:right;"> 119.76 </td>
   <td style="text-align:right;"> 5.00 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Yes </td>
   <td style="text-align:right;"> 231 </td>
   <td style="text-align:right;"> 57.83 </td>
   <td style="text-align:right;"> 31.47 </td>
   <td style="text-align:right;"> 127.77 </td>
   <td style="text-align:right;"> 4.83 </td>
  </tr>
</tbody>
</table>

]

<br>

Read more about `kable` table options [here](https://bookdown.org/yihui/rmarkdown-cookbook/kable.html)

---
background-image: url(www/pdg-hex.png)
background-position: 96% 4%
background-size: 6%

# R Markdown Tables

If we're looking at larger tables, we can use the `rmarkdown::paged_table()` function

<br>

.code80[

```r
big_descriptives <- nhanes_11_12 |>
  dplyr::select(-id) |> # remove id
  dplyr::mutate(ht_meters = height * 0.01, # calculate new vars
                bmi = weight / ht_meters ^ 2) |>
  dplyr::group_by(diabetes) |> # calculate by diabetes
  dplyr::summarise(n = n(), # get total
    dplyr::across(.cols = where(is.numeric), # all numeric variables
                  .fns = mean, na.rm = TRUE)) # calculate mean
```

]

---
background-image: url(www/pdg-hex.png)
background-position: 96% 4%
background-size: 6%

# R Markdown Tables

If we're looking at larger tables, we can use the `rmarkdown::paged_table()` function

<br>

```r
rmarkdown::paged_table(big_descriptives)
```

<div data-pagedtable="false">
<script data-pagedtable-source
type="application/json">
{"columns":[{"label":["diabetes"],"name":[1],"type":["chr"],"align":["left"]},{"label":["n"],"name":[2],"type":["dbl"],"align":["right"]},{"label":["age"],"name":[3],"type":["dbl"],"align":["right"]},{"label":["poverty"],"name":[4],"type":["dbl"],"align":["right"]},{"label":["weight"],"name":[5],"type":["dbl"],"align":["right"]},{"label":["height"],"name":[6],"type":["dbl"],"align":["right"]},{"label":["bp_sys_ave"],"name":[7],"type":["dbl"],"align":["right"]},{"label":["bp_dia_ave"],"name":[8],"type":["dbl"],"align":["right"]},{"label":["testosterone"],"name":[9],"type":["dbl"],"align":["right"]},{"label":["tot_chol"],"name":[10],"type":["dbl"],"align":["right"]},{"label":["sleep_hrs_night"],"name":[11],"type":["dbl"],"align":["right"]},{"label":["phys_active_days"],"name":[12],"type":["dbl"],"align":["right"]},{"label":["ht_meters"],"name":[13],"type":["dbl"],"align":["right"]},{"label":["bmi"],"name":[14],"type":["dbl"],"align":["right"]}],"data":[{"1":"No","2":"1890","3":"43.12857","4":"2.879497","5":"79.54995","6":"169.2085","7":"119.7624","8":"71.18519","9":"238.5126","10":"5.001571","11":"6.852910","12":"3.597884","13":"1.692085","14":"27.68338"},{"1":"Yes","2":"231","3":"57.83117","4":"2.310563","5":"87.79481","6":"166.8013","7":"127.7749","8":"70.53247","9":"189.5030","10":"4.829481","11":"6.666667","12":"3.813853","13":"1.668013","14":"31.47395"}],"options":{"columns":{"min":{},"max":[10]},"rows":{"min":[10],"max":[10]},"pages":{}}}
</script>
</div>

---
background-image: url(www/pdg-hex.png)
background-position: 96% 4%
background-size: 6%

# R Markdown Tables

The `gtsummary` package has great functions for creating common tables.

.leftcol[

```r
library(gtsummary)
tbl_vars <- dplyr::select(nhanes_11_12, diabetes, race3)
gtsummary::tbl_cross(
  data = tbl_vars,
  row = race3,
  col = diabetes)
```

]

.rightcol[
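
|          | diabetes: No | diabetes: Yes | Total |
|:---------|-------------:|--------------:|------:|
| race3    |              |               |       |
| &nbsp;&nbsp;Asian    | 308   | 32  | 340   |
| &nbsp;&nbsp;Black    | 412   | 77  | 489   |
| &nbsp;&nbsp;Hispanic | 162   | 16  | 178   |
| &nbsp;&nbsp;Mexican  | 154   | 27  | 181   |
| &nbsp;&nbsp;Other    | 61    | 8   | 69    |
| &nbsp;&nbsp;White    | 793   | 71  | 864   |
| Total    | 1,890        | 231           | 2,121 |
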
]

---
background-image: url(www/pdg-hex.png)
background-position: 96% 4%
background-size: 6%

# R Markdown Tables

We can add labels with the [`labelled` package](https://cran.r-project.org/web/packages/labelled/vignettes/intro_labelled.html).

.leftcol[

```r
library(labelled)
var_label(tbl_vars) <- list(
  race3 = "Race",
  diabetes = "Diabetes Status")
tbl_cross(data = tbl_vars,
  row = race3,
  col = diabetes)
```

]

.rightcol[
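
|          | Diabetes Status: No | Diabetes Status: Yes | Total |
|:---------|--------------------:|---------------------:|------:|
| Race     |                     |                      |       |
| &nbsp;&nbsp;Asian    | 308   | 32  | 340   |
| &nbsp;&nbsp;Black    | 412   | 77  | 489   |
| &nbsp;&nbsp;Hispanic | 162   | 16  | 178   |
| &nbsp;&nbsp;Mexican  | 154   | 27  | 181   |
| &nbsp;&nbsp;Other    | 61    | 8   | 69    |
| &nbsp;&nbsp;White    | 793   | 71  | 864   |
| Total    | 1,890               | 231                  | 2,121 |
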
]

---
background-image: url(www/pdg-hex.png)
background-position: 96% 4%
background-size: 6%

# R Markdown Tables

`tbl_cross()` can add percentages with the `percent` argument, and `add_p()` adds a p-value.

.leftcol[

```r
library(labelled)
var_label(tbl_vars) <- list(
  race3 = "Race",
  diabetes = "Diabetes Status")
tbl_cross(data = tbl_vars,
  row = race3,
  col = diabetes,
  percent = "cell") |>
  add_p()
```

]

.rightcol[
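
|          | Diabetes Status: No | Diabetes Status: Yes | Total | p-value<sup>1</sup> |
|:---------|--------------------:|---------------------:|------:|--------:|
| Race     |                     |                      |       | <0.001  |
| &nbsp;&nbsp;Asian    | 308 (15%)   | 32 (1.5%) | 340 (16%)  | |
| &nbsp;&nbsp;Black    | 412 (19%)   | 77 (3.6%) | 489 (23%)  | |
| &nbsp;&nbsp;Hispanic | 162 (7.6%)  | 16 (0.8%) | 178 (8.4%) | |
| &nbsp;&nbsp;Mexican  | 154 (7.3%)  | 27 (1.3%) | 181 (8.5%) | |
| &nbsp;&nbsp;Other    | 61 (2.9%)   | 8 (0.4%)  | 69 (3.3%)  | |
| &nbsp;&nbsp;White    | 793 (37%)   | 71 (3.3%) | 864 (41%)  | |
| Total    | 1,890 (89%)         | 231 (11%)            | 2,121 (100%) | |

<sup>1</sup> Pearson's Chi-squared test
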
]

---
background-image: url(www/pdg-hex.png)
background-position: 96% 4%
background-size: 6%

# R Markdown Tables

The `gtsummary::tbl_summary()` function creates publication-quality summary tables with multiple statistical summary options.

.leftcol[

.code70[

```r
tbl_vars <- nhanes_11_12 |>
  dplyr::mutate(ht_meters = height*0.01,
                bmi = weight/ht_meters^2) |>
  dplyr::select(age, gender, bmi, bp_sys_ave, bp_dia_ave)
var_label(tbl_vars) <- list(
  age = "Age",
  gender = "Gender",
  bmi = "BMI",
  bp_sys_ave = "Sys Avg",
  bp_dia_ave = "Dia Avg")
```

]

]

.rightcol[

```r
gtsummary::tbl_summary(tbl_vars)
```
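
| Characteristic | N = 2,121<sup>1</sup> |
|:---------------|----------------------:|
| Age            | 42 (30, 58)           |
| Gender         |                       |
| &nbsp;&nbsp;female | 1,008 (48%)       |
| &nbsp;&nbsp;male   | 1,113 (52%)       |
| BMI            | 27.0 (23.7, 31.3)     |
| Sys Avg        | 118 (110, 129)        |
| Dia Avg        | 71 (64, 78)           |

<sup>1</sup> Median (IQR); n (%)
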
]

---
background-image: url(www/pdg-hex.png)
background-position: 96% 4%
background-size: 6%

# R Markdown Tables

We can also create grouped summaries with `gtsummary::tbl_summary(by = )`:

.leftcol[

.code70[

```r
tbl_vars <- nhanes_11_12 |>
  dplyr::mutate(
    ht_meters = height*0.01,
    bmi = weight/ht_meters^2,
    diabetes = if_else(diabetes == "Yes", "Diabetic", "Healthy")) |>
  dplyr::select(age, diabetes, gender, bmi, bp_sys_ave, bp_dia_ave)
var_label(tbl_vars) <- list(
  age = "Age",
  diabetes = "Diabetic",
  gender = "Gender",
  bmi = "BMI",
  bp_sys_ave = "Sys Avg",
  bp_dia_ave = "Dia Avg")
```

]

]

.rightcol[

```r
gtsummary::tbl_summary(tbl_vars, by = diabetes)
```
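
| Characteristic | Diabetic, N = 231<sup>1</sup> | Healthy, N = 1,890<sup>1</sup> |
|:---------------|------------------:|------------------:|
| Age            | 60 (50, 68)       | 40 (29, 56)       |
| Gender         |                   |                   |
| &nbsp;&nbsp;female | 112 (48%)     | 896 (47%)         |
| &nbsp;&nbsp;male   | 119 (52%)     | 994 (53%)         |
| BMI            | 30.2 (26.5, 35.4) | 26.6 (23.4, 30.7) |
| Sys Avg        | 127 (116, 136)    | 118 (109, 128)    |
| Dia Avg        | 70 (63, 79)       | 71 (64, 78)       |

<sup>1</sup> Median (IQR); n (%)
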
]

---
background-image: url(www/pdg-hex.png)
background-position: 96% 4%
background-size: 6%

# R Markdown Tables

### R Markdown also supports inline R code

--

<img src="www/inline-r-code-01.png" width="70%" height="70%" style="display: block; margin: auto;" />

---
background-image: url(www/pdg-hex.png)
background-position: 96% 4%
background-size: 6%

# R Markdown Tables

### Inline R code allows us to include summaries of our analysis in the report

--

<img src="www/inline-r-code-02.png" width="80%" height="80%" style="display: block; margin: auto;" />

---
background-image: url(www/pdg-hex.png)
background-position: 96% 4%
background-size: 6%

# R Markdown Tables

### We're going to add the summary statistics in `descriptives` to our `rmd-tables.Rmd` report.

.leftcol[

.font90[Include the following code under the `Summary` level-two header]

.code60[

```r
descriptives
```

```
# A tibble: 2 × 6
  diabetes     n   age   bmi bp_sys_ave tot_chol
  <chr>    <int> <dbl> <dbl>      <dbl>    <dbl>
1 No        1890  43.1  27.7       120.     5.00
2 Yes        231  57.8  31.5       128.     4.83
```

]

]

--

.rightcol[

```
The average age in `nhanes_11_12` is 44.73.
The correlation between age and average systolic blood pressure is 0.42
```

]
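---
background-image: url(www/pdg-hex.png)
background-position: 96% 4%
background-size: 6%

# R Markdown Tables

The rendered sentences on the previous slide come from inline R code. Below is a minimal sketch of one way to produce them, not the worksheet's actual code: it assumes the values are computed with `mean()` and `cor()`, and it uses a small made-up data frame (`toy_nhanes`, a hypothetical stand-in for `nhanes_11_12`), so its numbers will not match the slide. The names `avg_age`, `age_bp_cor`, and `sentence` are also illustrative.

```r
# Toy stand-in for nhanes_11_12 (made-up values, for illustration only)
toy_nhanes <- data.frame(
  age        = c(44, 34, 26, 57, 42, NA),
  bp_sys_ave = c(119, 107, 121, 110, 127, 136)
)

# Compute the values in a chunk first so the inline expressions stay short
avg_age <- round(mean(toy_nhanes$age, na.rm = TRUE), 2)
age_bp_cor <- round(
  cor(toy_nhanes$age, toy_nhanes$bp_sys_ave, use = "complete.obs"), 2)

# In the .Rmd text, each value goes inside backticks that start with `r`:
#   The average age in `nhanes_11_12` is `r avg_age`.
#   The correlation between age and average systolic blood pressure
#   is `r age_bp_cor`.

# sprintf() builds the first sentence here so the sketch also runs in the console
sentence <- sprintf("The average age in `nhanes_11_12` is %.2f.", avg_age)
cat(sentence, "\n")
```

Computing the values in a regular chunk and referencing only the object name inline keeps the prose readable and means the numbers update whenever the data changes.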