class: center, middle, inverse, title-slide # Introduction to Data Visualization with ggplot2 ## bmRn CSM: An introduction to the grammar of graphics ### Martin Frigaard ### 2020-12-13 --- <img src="img/ggplot2_masterpiece.png" width="90%" height="90%" style="display: block; margin: auto;" /> .pull-right[Art by Allison Horst] --- class: inverse, center background-image: url(img/ggplot2.png) background-position: 50% 70% background-size: 30% # Intro to Data Visualization in R with `ggplot2` --- class: left, top background-image: url(img/ggplot2.png) background-position: 90% 7% background-size: 8% # Objectives ## 1) Introduce the grammar of graphics ## 2) Identify graph aesthetics ## 3) Recognize and use `geoms` ## 4) Facet graphs --- class: left, top background-image: url(img/ggplot2.png) background-position: 90% 7% background-size: 8% # Resources ## Link to these slides https://mjfrigaard.github.io/data-viz-intro/Index.html ## Link to exercises https://mjfrigaard.github.io/r-meetup-tutorials/gg00_intro-to-data-viz.html ## Link to RStudio Project https://rstudio.cloud/project/1973650 --- class: left, top background-image: url(img/ggplot2.png) background-position: 90% 7% background-size: 8% # Why use `ggplot2` for data visualization? -- ### 1) `ggplot2` provides a comprehensive grammar for creating graphs/figures -- ### 2) It works hand-in-hand with the `tidyverse` -- ### 3) Better plots = better communication --- class: left, top background-image: url(img/ggplot2.png) background-position: 90% 7% background-size: 8% # Why do we create data visualizations? 
-- ### Clarification > "*The simple graph has brought more information to the data analyst’s mind than any other device.*" - John Tukey -- ### Better decision making > "*Data visualization is a collection of methods that use visual representations to explore, make sense of, and communicate quantitative data…The ultimate purpose of data visualization, beyond understanding, is to enable better decisions and actions.*" - Stephen Few --- class: left, top background-image: url(img/ggplot2.png) background-position: 95% 7% background-size: 8% # How should we start creating data visualizations? -- ## Start with pen and paper -- ### *get those first bad ideas out of the way* -- ## Import and inspect your data -- ### *so you know what to expect* --- background-image: url(img/ggplot2.png) background-position: 96% 8% background-size: 8% ## Layered grammar of graphics -- > *"appreciating the engineering design behind a sentence – **a linear ordering of phrases which conveys a gnarly network of ideas** — is the key to understanding what you are trying to accomplish when you compose a sentence."* - Steven Pinker -- > *"language is a system for making infinite use of finite means."* - Wilhelm von Humboldt -- ### `ggplot2` is a language of *layers*, organized linearly -- `ggplot2`'s layers give us a "*linear ordering of phrases*" to build an infinite number of graphs "*which convey a gnarly network of ideas.*" ### ...infinitely extensible --- background-image: url(img/ggplot2.png) background-position: 96% 8% background-size: 6% ### Let's load some data! The `NHANES` package comes with data from the [2014 American National Health and Nutrition Examination surveys](http://www.cdc.gov/nchs/data/series/sr_02/sr02_162.pdf). 
We will load a sample from it below: ```r SmallNhanes <- read_csv("https://bit.ly/nhanes-small") SmallNhanes ``` <div data-pagedtable="false"> <script data-pagedtable-source type="application/json"> {"columns":[{"label":["ID"],"name":[1],"type":["dbl"],"align":["right"]},{"label":["Gender"],"name":[2],"type":["chr"],"align":["left"]},{"label":["Age"],"name":[3],"type":["dbl"],"align":["right"]},{"label":["AgeDecade"],"name":[4],"type":["chr"],"align":["left"]},{"label":["Race1"],"name":[5],"type":["chr"],"align":["left"]},{"label":["HealthGen"],"name":[6],"type":["chr"],"align":["left"]},{"label":["Height"],"name":[7],"type":["dbl"],"align":["right"]},{"label":["BMI"],"name":[8],"type":["dbl"],"align":["right"]},{"label":["Weight"],"name":[9],"type":["dbl"],"align":["right"]},{"label":["Pulse"],"name":[10],"type":["dbl"],"align":["right"]},{"label":["BPSysAve"],"name":[11],"type":["dbl"],"align":["right"]}],"data":[{"1":"51624","2":"male","3":"34","4":"30-39","5":"White","6":"Good","7":"164.7","8":"32.22","9":"87.4","10":"70","11":"113"},{"1":"51624","2":"male","3":"34","4":"30-39","5":"White","6":"Good","7":"164.7","8":"32.22","9":"87.4","10":"70","11":"113"},{"1":"51624","2":"male","3":"34","4":"30-39","5":"White","6":"Good","7":"164.7","8":"32.22","9":"87.4","10":"70","11":"113"},{"1":"51625","2":"male","3":"4","4":"0-9","5":"Other","6":"NA","7":"105.4","8":"15.30","9":"17.0","10":"NA","11":"NA"},{"1":"51630","2":"female","3":"49","4":"40-49","5":"White","6":"Good","7":"168.4","8":"30.57","9":"86.7","10":"86","11":"112"},{"1":"51638","2":"male","3":"9","4":"0-9","5":"White","6":"NA","7":"133.1","8":"16.82","9":"29.8","10":"82","11":"86"},{"1":"51646","2":"male","3":"8","4":"0-9","5":"White","6":"NA","7":"130.6","8":"20.64","9":"35.2","10":"72","11":"107"},{"1":"51647","2":"female","3":"45","4":"40-49","5":"White","6":"Vgood","7":"166.7","8":"27.24","9":"75.7","10":"62","11":"118"},{"1":"51647","2":"female","3":"45","4":"40-49","5":"White","6":"Vgood","7":"166.7",
"8":"27.24","9":"75.7","10":"62","11":"118"},{"1":"51647","2":"female","3":"45","4":"40-49","5":"White","6":"Vgood","7":"166.7","8":"27.24","9":"75.7","10":"62","11":"118"},{"1":"51654","2":"male","3":"66","4":"60-69","5":"White","6":"Vgood","7":"169.5","8":"23.67","9":"68.0","10":"60","11":"111"},{"1":"51656","2":"male","3":"58","4":"50-59","5":"White","6":"Vgood","7":"181.9","8":"23.69","9":"78.4","10":"62","11":"104"},{"1":"51657","2":"male","3":"54","4":"50-59","5":"White","6":"Fair","7":"169.4","8":"26.03","9":"74.7","10":"76","11":"134"},{"1":"51659","2":"female","3":"10","4":"10-19","5":"White","6":"NA","7":"141.8","8":"19.20","9":"38.6","10":"80","11":"104"},{"1":"51666","2":"female","3":"58","4":"50-59","5":"Mexican","6":"NA","7":"148.1","8":"26.22","9":"57.5","10":"94","11":"127"},{"1":"51667","2":"male","3":"50","4":"50-59","5":"White","6":"Good","7":"177.8","8":"26.60","9":"84.1","10":"74","11":"142"},{"1":"51671","2":"female","3":"9","4":"0-9","5":"Black","6":"NA","7":"139.2","8":"27.40","9":"53.1","10":"92","11":"93"},{"1":"51677","2":"male","3":"33","4":"30-39","5":"White","6":"Fair","7":"181.3","8":"28.54","9":"93.8","10":"96","11":"128"},{"1":"51678","2":"male","3":"60","4":"60-69","5":"White","6":"Vgood","7":"169.9","8":"25.84","9":"74.6","10":"84","11":"152"},{"1":"51679","2":"male","3":"16","4":"10-19","5":"Other","6":"Fair","7":"172.0","8":"24.74","9":"73.2","10":"76","11":"126"},{"1":"51685","2":"female","3":"56","4":"50-59","5":"White","6":"Good","7":"170.7","8":"19.73","9":"57.5","10":"64","11":"95"},{"1":"51685","2":"female","3":"56","4":"50-59","5":"White","6":"Good","7":"170.7","8":"19.73","9":"57.5","10":"64","11":"95"},{"1":"51691","2":"female","3":"57","4":"50-59","5":"White","6":"Excellent","7":"157.1","8":"20.66","9":"51.0","10":"70","11":"122"},{"1":"51692","2":"male","3":"54","4":"50-59","5":"Hispanic","6":"Good","7":"177.1","8":"36.32","9":"113.9","10":"64","11":"90"},{"1":"51692","2":"male","3":"54","4":"50-59","5":"Hispanic","6":
"Good","7":"177.1","8":"36.32","9":"113.9","10":"64","11":"90"},{"1":"51694","2":"male","3":"38","4":"30-39","5":"White","6":"Poor","7":"180.9","8":"35.84","9":"117.3","10":"60","11":"147"},{"1":"51695","2":"male","3":"8","4":"0-9","5":"White","6":"NA","7":"128.4","8":"24.32","9":"40.1","10":"76","11":"93"},{"1":"51701","2":"male","3":"36","4":"30-39","5":"White","6":"Good","7":"174.7","8":"25.95","9":"79.2","10":"68","11":"117"},{"1":"51702","2":"male","3":"44","4":"40-49","5":"White","6":"Fair","7":"165.9","8":"31.43","9":"86.5","10":"80","11":"144"},{"1":"51702","2":"male","3":"44","4":"40-49","5":"White","6":"Fair","7":"165.9","8":"31.43","9":"86.5","10":"80","11":"144"},{"1":"51707","2":"female","3":"64","4":"60-69","5":"Other","6":"Good","7":"152.0","8":"27.18","9":"62.8","10":"94","11":"130"},{"1":"51710","2":"female","3":"26","4":"20-29","5":"White","6":"Excellent","7":"175.8","8":"21.00","9":"64.9","10":"94","11":"106"},{"1":"51711","2":"female","3":"59","4":"50-59","5":"Other","6":"NA","7":"145.1","8":"25.79","9":"54.3","10":"84","11":"150"},{"1":"51711","2":"female","3":"59","4":"50-59","5":"Other","6":"NA","7":"145.1","8":"25.79","9":"54.3","10":"84","11":"150"},{"1":"51715","2":"male","3":"49","4":"40-49","5":"Mexican","6":"Fair","7":"173.4","8":"29.13","9":"87.6","10":"70","11":"122"},{"1":"51718","2":"female","3":"51","4":"50-59","5":"Mexican","6":"Vgood","7":"163.8","8":"30.60","9":"82.1","10":"100","11":"105"},{"1":"51718","2":"female","3":"51","4":"50-59","5":"Mexican","6":"Vgood","7":"163.8","8":"30.60","9":"82.1","10":"100","11":"105"},{"1":"51719","2":"male","3":"17","4":"10-19","5":"White","6":"Vgood","7":"176.0","8":"23.34","9":"72.3","10":"74","11":"105"},{"1":"51720","2":"male","3":"12","4":"10-19","5":"Other","6":"Vgood","7":"158.9","8":"22.85","9":"57.7","10":"80","11":"121"},{"1":"51720","2":"male","3":"12","4":"10-19","5":"Other","6":"Vgood","7":"158.9","8":"22.85","9":"57.7","10":"80","11":"121"},{"1":"51722","2":"male","3":"16","4":"10
-19","5":"White","6":"Excellent","7":"183.3","8":"26.46","9":"88.9","10":"58","11":"140"},{"1":"51722","2":"male","3":"16","4":"10-19","5":"White","6":"Excellent","7":"183.3","8":"26.46","9":"88.9","10":"58","11":"140"},{"1":"51722","2":"male","3":"16","4":"10-19","5":"White","6":"Excellent","7":"183.3","8":"26.46","9":"88.9","10":"58","11":"140"},{"1":"51722","2":"male","3":"16","4":"10-19","5":"White","6":"Excellent","7":"183.3","8":"26.46","9":"88.9","10":"58","11":"140"},{"1":"51723","2":"male","3":"28","4":"20-29","5":"Black","6":"Excellent","7":"175.3","8":"25.45","9":"78.2","10":"74","11":"93"},{"1":"51724","2":"female","3":"37","4":"30-39","5":"Hispanic","6":"Vgood","7":"154.8","8":"21.16","9":"50.7","10":"82","11":"102"},{"1":"51731","2":"female","3":"28","4":"20-29","5":"Black","6":"Good","7":"169.6","8":"46.69","9":"134.3","10":"60","11":"103"},{"1":"51732","2":"male","3":"32","4":"30-39","5":"White","6":"Excellent","7":"173.0","8":"20.15","9":"60.3","10":"50","11":"124"},{"1":"51734","2":"male","3":"25","4":"20-29","5":"White","6":"Excellent","7":"178.6","8":"27.06","9":"86.3","10":"62","11":"117"},{"1":"51741","2":"female","3":"21","4":"20-29","5":"Black","6":"Fair","7":"166.5","8":"37.33","9":"103.5","10":"86","11":"120"},{"1":"51741","2":"female","3":"21","4":"20-29","5":"Black","6":"Fair","7":"166.5","8":"37.33","9":"103.5","10":"86","11":"120"},{"1":"51743","2":"male","3":"8","4":"0-9","5":"Hispanic","6":"NA","7":"125.6","8":"15.59","9":"24.6","10":"80","11":"90"},{"1":"51743","2":"male","3":"8","4":"0-9","5":"Hispanic","6":"NA","7":"125.6","8":"15.59","9":"24.6","10":"80","11":"90"},{"1":"51745","2":"male","3":"44","4":"40-49","5":"White","6":"Good","7":"182.0","8":"25.54","9":"84.6","10":"76","11":"156"},{"1":"51748","2":"male","3":"56","4":"50-59","5":"Mexican","6":"Fair","7":"170.6","8":"24.98","9":"72.7","10":"66","11":"86"},{"1":"51752","2":"female","3":"78","4":"70+","5":"Mexican","6":"Poor","7":"151.0","8":"22.63","9":"51.6","10":"66","11":"
133"},{"1":"51753","2":"male","3":"3","4":"0-9","5":"White","6":"NA","7":"89.9","8":"14.35","9":"11.6","10":"NA","11":"NA"},{"1":"51754","2":"female","3":"58","4":"50-59","5":"White","6":"NA","7":"167.2","8":"37.92","9":"106.0","10":"70","11":"144"},{"1":"51754","2":"female","3":"58","4":"50-59","5":"White","6":"NA","7":"167.2","8":"37.92","9":"106.0","10":"70","11":"144"},{"1":"51754","2":"female","3":"58","4":"50-59","5":"White","6":"NA","7":"167.2","8":"37.92","9":"106.0","10":"70","11":"144"}],"options":{"columns":{"min":{},"max":[10]},"rows":{"min":[10],"max":[10]},"pages":{}}} </script> </div> --- background-image: url(img/ggplot2.png) background-position: 95% 8% background-size: 6% ### Quick Tip: Column Names **Standardize names** ```r SmallNhanes <- SmallNhanes %>% janitor::clean_names() SmallNhanes ``` <div data-pagedtable="false"> <script data-pagedtable-source type="application/json"> {"columns":[{"label":["id"],"name":[1],"type":["dbl"],"align":["right"]},{"label":["gender"],"name":[2],"type":["chr"],"align":["left"]},{"label":["age"],"name":[3],"type":["dbl"],"align":["right"]},{"label":["age_decade"],"name":[4],"type":["chr"],"align":["left"]},{"label":["race1"],"name":[5],"type":["chr"],"align":["left"]},{"label":["health_gen"],"name":[6],"type":["chr"],"align":["left"]},{"label":["height"],"name":[7],"type":["dbl"],"align":["right"]},{"label":["bmi"],"name":[8],"type":["dbl"],"align":["right"]},{"label":["weight"],"name":[9],"type":["dbl"],"align":["right"]},{"label":["pulse"],"name":[10],"type":["dbl"],"align":["right"]},{"label":["bp_sys_ave"],"name":[11],"type":["dbl"],"align":["right"]}],"data":[{"1":"51624","2":"male","3":"34","4":"30-39","5":"White","6":"Good","7":"164.7","8":"32.22","9":"87.4","10":"70","11":"113"},{"1":"51624","2":"male","3":"34","4":"30-39","5":"White","6":"Good","7":"164.7","8":"32.22","9":"87.4","10":"70","11":"113"},{"1":"51624","2":"male","3":"34","4":"30-39","5":"White","6":"Good","7":"164.7","8":"32.22","9":"87.4","10"
:"70","11":"113"},{"1":"51625","2":"male","3":"4","4":"0-9","5":"Other","6":"NA","7":"105.4","8":"15.30","9":"17.0","10":"NA","11":"NA"},{"1":"51630","2":"female","3":"49","4":"40-49","5":"White","6":"Good","7":"168.4","8":"30.57","9":"86.7","10":"86","11":"112"},{"1":"51638","2":"male","3":"9","4":"0-9","5":"White","6":"NA","7":"133.1","8":"16.82","9":"29.8","10":"82","11":"86"},{"1":"51646","2":"male","3":"8","4":"0-9","5":"White","6":"NA","7":"130.6","8":"20.64","9":"35.2","10":"72","11":"107"},{"1":"51647","2":"female","3":"45","4":"40-49","5":"White","6":"Vgood","7":"166.7","8":"27.24","9":"75.7","10":"62","11":"118"},{"1":"51647","2":"female","3":"45","4":"40-49","5":"White","6":"Vgood","7":"166.7","8":"27.24","9":"75.7","10":"62","11":"118"},{"1":"51647","2":"female","3":"45","4":"40-49","5":"White","6":"Vgood","7":"166.7","8":"27.24","9":"75.7","10":"62","11":"118"},{"1":"51654","2":"male","3":"66","4":"60-69","5":"White","6":"Vgood","7":"169.5","8":"23.67","9":"68.0","10":"60","11":"111"},{"1":"51656","2":"male","3":"58","4":"50-59","5":"White","6":"Vgood","7":"181.9","8":"23.69","9":"78.4","10":"62","11":"104"},{"1":"51657","2":"male","3":"54","4":"50-59","5":"White","6":"Fair","7":"169.4","8":"26.03","9":"74.7","10":"76","11":"134"},{"1":"51659","2":"female","3":"10","4":"10-19","5":"White","6":"NA","7":"141.8","8":"19.20","9":"38.6","10":"80","11":"104"},{"1":"51666","2":"female","3":"58","4":"50-59","5":"Mexican","6":"NA","7":"148.1","8":"26.22","9":"57.5","10":"94","11":"127"},{"1":"51667","2":"male","3":"50","4":"50-59","5":"White","6":"Good","7":"177.8","8":"26.60","9":"84.1","10":"74","11":"142"},{"1":"51671","2":"female","3":"9","4":"0-9","5":"Black","6":"NA","7":"139.2","8":"27.40","9":"53.1","10":"92","11":"93"},{"1":"51677","2":"male","3":"33","4":"30-39","5":"White","6":"Fair","7":"181.3","8":"28.54","9":"93.8","10":"96","11":"128"},{"1":"51678","2":"male","3":"60","4":"60-69","5":"White","6":"Vgood","7":"169.9","8":"25.84","9":"74.6","10":"84"
,"11":"152"},{"1":"51679","2":"male","3":"16","4":"10-19","5":"Other","6":"Fair","7":"172.0","8":"24.74","9":"73.2","10":"76","11":"126"},{"1":"51685","2":"female","3":"56","4":"50-59","5":"White","6":"Good","7":"170.7","8":"19.73","9":"57.5","10":"64","11":"95"},{"1":"51685","2":"female","3":"56","4":"50-59","5":"White","6":"Good","7":"170.7","8":"19.73","9":"57.5","10":"64","11":"95"},{"1":"51691","2":"female","3":"57","4":"50-59","5":"White","6":"Excellent","7":"157.1","8":"20.66","9":"51.0","10":"70","11":"122"},{"1":"51692","2":"male","3":"54","4":"50-59","5":"Hispanic","6":"Good","7":"177.1","8":"36.32","9":"113.9","10":"64","11":"90"},{"1":"51692","2":"male","3":"54","4":"50-59","5":"Hispanic","6":"Good","7":"177.1","8":"36.32","9":"113.9","10":"64","11":"90"},{"1":"51694","2":"male","3":"38","4":"30-39","5":"White","6":"Poor","7":"180.9","8":"35.84","9":"117.3","10":"60","11":"147"},{"1":"51695","2":"male","3":"8","4":"0-9","5":"White","6":"NA","7":"128.4","8":"24.32","9":"40.1","10":"76","11":"93"},{"1":"51701","2":"male","3":"36","4":"30-39","5":"White","6":"Good","7":"174.7","8":"25.95","9":"79.2","10":"68","11":"117"},{"1":"51702","2":"male","3":"44","4":"40-49","5":"White","6":"Fair","7":"165.9","8":"31.43","9":"86.5","10":"80","11":"144"},{"1":"51702","2":"male","3":"44","4":"40-49","5":"White","6":"Fair","7":"165.9","8":"31.43","9":"86.5","10":"80","11":"144"},{"1":"51707","2":"female","3":"64","4":"60-69","5":"Other","6":"Good","7":"152.0","8":"27.18","9":"62.8","10":"94","11":"130"},{"1":"51710","2":"female","3":"26","4":"20-29","5":"White","6":"Excellent","7":"175.8","8":"21.00","9":"64.9","10":"94","11":"106"},{"1":"51711","2":"female","3":"59","4":"50-59","5":"Other","6":"NA","7":"145.1","8":"25.79","9":"54.3","10":"84","11":"150"},{"1":"51711","2":"female","3":"59","4":"50-59","5":"Other","6":"NA","7":"145.1","8":"25.79","9":"54.3","10":"84","11":"150"},{"1":"51715","2":"male","3":"49","4":"40-49","5":"Mexican","6":"Fair","7":"173.4","8":"29.13"
,"9":"87.6","10":"70","11":"122"},{"1":"51718","2":"female","3":"51","4":"50-59","5":"Mexican","6":"Vgood","7":"163.8","8":"30.60","9":"82.1","10":"100","11":"105"},{"1":"51718","2":"female","3":"51","4":"50-59","5":"Mexican","6":"Vgood","7":"163.8","8":"30.60","9":"82.1","10":"100","11":"105"},{"1":"51719","2":"male","3":"17","4":"10-19","5":"White","6":"Vgood","7":"176.0","8":"23.34","9":"72.3","10":"74","11":"105"},{"1":"51720","2":"male","3":"12","4":"10-19","5":"Other","6":"Vgood","7":"158.9","8":"22.85","9":"57.7","10":"80","11":"121"},{"1":"51720","2":"male","3":"12","4":"10-19","5":"Other","6":"Vgood","7":"158.9","8":"22.85","9":"57.7","10":"80","11":"121"},{"1":"51722","2":"male","3":"16","4":"10-19","5":"White","6":"Excellent","7":"183.3","8":"26.46","9":"88.9","10":"58","11":"140"},{"1":"51722","2":"male","3":"16","4":"10-19","5":"White","6":"Excellent","7":"183.3","8":"26.46","9":"88.9","10":"58","11":"140"},{"1":"51722","2":"male","3":"16","4":"10-19","5":"White","6":"Excellent","7":"183.3","8":"26.46","9":"88.9","10":"58","11":"140"},{"1":"51722","2":"male","3":"16","4":"10-19","5":"White","6":"Excellent","7":"183.3","8":"26.46","9":"88.9","10":"58","11":"140"},{"1":"51723","2":"male","3":"28","4":"20-29","5":"Black","6":"Excellent","7":"175.3","8":"25.45","9":"78.2","10":"74","11":"93"},{"1":"51724","2":"female","3":"37","4":"30-39","5":"Hispanic","6":"Vgood","7":"154.8","8":"21.16","9":"50.7","10":"82","11":"102"},{"1":"51731","2":"female","3":"28","4":"20-29","5":"Black","6":"Good","7":"169.6","8":"46.69","9":"134.3","10":"60","11":"103"},{"1":"51732","2":"male","3":"32","4":"30-39","5":"White","6":"Excellent","7":"173.0","8":"20.15","9":"60.3","10":"50","11":"124"},{"1":"51734","2":"male","3":"25","4":"20-29","5":"White","6":"Excellent","7":"178.6","8":"27.06","9":"86.3","10":"62","11":"117"},{"1":"51741","2":"female","3":"21","4":"20-29","5":"Black","6":"Fair","7":"166.5","8":"37.33","9":"103.5","10":"86","11":"120"},{"1":"51741","2":"female","3":
"21","4":"20-29","5":"Black","6":"Fair","7":"166.5","8":"37.33","9":"103.5","10":"86","11":"120"},{"1":"51743","2":"male","3":"8","4":"0-9","5":"Hispanic","6":"NA","7":"125.6","8":"15.59","9":"24.6","10":"80","11":"90"},{"1":"51743","2":"male","3":"8","4":"0-9","5":"Hispanic","6":"NA","7":"125.6","8":"15.59","9":"24.6","10":"80","11":"90"},{"1":"51745","2":"male","3":"44","4":"40-49","5":"White","6":"Good","7":"182.0","8":"25.54","9":"84.6","10":"76","11":"156"},{"1":"51748","2":"male","3":"56","4":"50-59","5":"Mexican","6":"Fair","7":"170.6","8":"24.98","9":"72.7","10":"66","11":"86"},{"1":"51752","2":"female","3":"78","4":"70+","5":"Mexican","6":"Poor","7":"151.0","8":"22.63","9":"51.6","10":"66","11":"133"},{"1":"51753","2":"male","3":"3","4":"0-9","5":"White","6":"NA","7":"89.9","8":"14.35","9":"11.6","10":"NA","11":"NA"},{"1":"51754","2":"female","3":"58","4":"50-59","5":"White","6":"NA","7":"167.2","8":"37.92","9":"106.0","10":"70","11":"144"},{"1":"51754","2":"female","3":"58","4":"50-59","5":"White","6":"NA","7":"167.2","8":"37.92","9":"106.0","10":"70","11":"144"},{"1":"51754","2":"female","3":"58","4":"50-59","5":"White","6":"NA","7":"167.2","8":"37.92","9":"106.0","10":"70","11":"144"}],"options":{"columns":{"min":{},"max":[10]},"rows":{"min":[10],"max":[10]},"pages":{}}} </script> </div> --- background-image: url(img/ggplot2.png) background-position: 95% 20% background-size: 8% ### Quick Tip: Factors **Format factors:** We have a `health_gen` variable with the following levels: `Excellent`, `Vgood`, `Good`, `Fair`, or `Poor`. These are ordered. 
-- ```r SmallNhanes <- SmallNhanes %>% mutate( health_gen = factor(x = health_gen, levels = c("Poor", "Fair", "Good", "Vgood", "Excellent"), ordered = TRUE)) ``` -- ```r levels(SmallNhanes$health_gen) ``` ``` ## [1] "Poor" "Fair" "Good" "Vgood" "Excellent" ``` --- background-image: url(img/ggplot2.png) background-position: 96% 8% background-size: 8% ## Layered grammar of graphics ### How it works: -- ### - Graphs are *initialized* with *ggplot()* -- ### - Variables are *mapped* to *aesthetics* -- ### - Geoms are linked to *statistics* --- background-image: url(img/ggplot2.png) background-position: 95% 93% background-size: 8% # Our First Graph <br> -- ## What relationship do you expect to see between `height` and `weight`? --- background-image: url(img/ggplot2.png) background-position: 95% 93% background-size: 8% ## 1. Use data with pipe to initialize graph `SmallNhanes %>%` -- ## 2. Map variables to aesthetics `SmallNhanes %>%` `ggplot(mapping = aes(x = weight, y = height))` -- ## 3. Add geoms and layers `SmallNhanes %>% ` `ggplot(mapping = aes(x = weight, y = height)) +` `geom_point()` --- background-image: url(img/ggplot2.png) background-position: 95% 93% background-size: 6% ```r SmallNhanes %>% * ggplot() # initialize ``` <img src="img/initialize-1.png" width="80%" height="80%" style="display: block; margin: auto;" /> --- background-image: url(img/ggplot2.png) background-position: 95% 93% background-size: 5% ```r SmallNhanes %>% * ggplot(mapping = aes(x = weight, y = height)) # map variables ``` <img src="img/mapping-1.png" width="80%" height="80%" style="display: block; margin: auto;" /> --- background-image: url(img/ggplot2.png) background-position: 95% 93% background-size: 6% ```r SmallNhanes %>% ggplot(mapping = aes(x = weight, y = height)) + * geom_point() # add geoms ``` <img src="img/geoms-1.png" width="80%" height="80%" style="display: block; margin: auto;" /> --- background-image: url(img/ggplot2.png) background-position: 95% 7% background-size: 8% # 
`ggplot2` template ```r <DATA> %>% ggplot(mapping = aes(<MAPPINGS>)) + <GEOM_FUNCTION>() ``` -- #### We can add more aesthetics *inside* geoms ```r <DATA> %>% ggplot(mapping = aes(<MAPPINGS>)) + <GEOM_FUNCTION>(mapping = aes(<MAPPINGS>)) ``` -- #### And we can add *more* geoms ```r <DATA> %>% ggplot(mapping = aes(<MAPPINGS>)) + <GEOM_FUNCTION>(mapping = aes(<MAPPINGS>)) + <GEOM_FUNCTION>(mapping = aes(<MAPPINGS>)) ``` --- class: inverse, center background-image: url(img/ggplot2.png) background-position: 50% 60% background-size: 30% # Graph Aesthetics --- background-image: url(img/ggplot2.png) background-position: 95% 7% background-size: 7% # Aesthetics **Is the relationship between `weight` and `height` the same for both `gender`s?** *We can explore this by mapping the variables to different aesthetics* -- ### Aesthetics as graph elements *Examples of aesthetics are `color`, `size`, `shape`, and `alpha`* -- <img src="img/graph-elements.png" width="100%" height="100%" style="display: block; margin: auto;" /> --- class: inverse, center background-image: url(img/ggplot2.png) background-position: 50% 60% background-size: 30% # Mapping (global vs. 
local) --- background-image: url(img/ggplot2.png) background-position: 95% 7% background-size: 7% # Global `ggplot2` mapping ### ***inside the `ggplot()` function*** = mapping variables ***globally*** <img src="img/ggplot2-template-01.png" width="100%" height="100%" style="display: block; margin: auto;" /> --- background-image: url(img/ggplot2.png) background-position: 95% 7% background-size: 7% # Local `ggplot2` mapping ### ***inside the `geom_*()` function*** = mapping variables ***locally*** <img src="img/ggplot2-template-02.png" width="100%" height="100%" style="display: block; margin: auto;" /> --- background-image: url(img/ggplot2.png) background-position: 95% 7% background-size: 8% # `ggplot2` tip ## Note the different syntax: -- ### *Initializing plots with `%>%`* ```r *<DATA> %>% # initialize ``` -- ### *Adding layers with `+`* ```r <DATA> %>% * ggplot(mapping = aes(<MAPPINGS>)) + # adding layers <GEOM_FUNCTION>() ``` --- background-image: url(img/ggplot2.png) background-position: 95% 7% background-size: 8% ## Your Turn ### Set local aesthetic mappings *From here...* ```r SmallNhanes %>% * ggplot(mapping = aes(x = weight, y = height)) + geom_point() + geom_smooth() ``` -- *...to here.* ```r SmallNhanes %>% ggplot() + * geom_point(mapping = aes(x = weight, y = height)) + * geom_smooth(mapping = aes(x = weight, y = height)) ``` ### What do you expect to see? 
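One way to check your expectation without comparing pictures by eye is to compare the data each layer actually draws. The sketch below is only an illustration: `nhanes_demo` is a hypothetical six-row sample typed in from the `SmallNhanes` printout, and `geom_smooth()` uses `method = "lm"` (instead of the default smoother used on the slide) because six rows are too few for a stable default fit.

```r
library(ggplot2)

# Hypothetical six-row sample, copied by hand from the SmallNhanes printout
# (after clean_names()); the real data has more rows and columns.
nhanes_demo <- data.frame(
  gender = c("male", "female", "male", "female", "male", "male"),
  weight = c(87.4, 86.7, 68.0, 75.7, 78.4, 74.7),
  height = c(164.7, 168.4, 169.5, 166.7, 181.9, 169.4)
)

# Global mapping: declared once in ggplot(), inherited by both geoms
p_global <- ggplot(nhanes_demo, mapping = aes(x = weight, y = height)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x)

# Local mapping: the same aes() repeated inside each geom
p_local <- ggplot(nhanes_demo) +
  geom_point(mapping = aes(x = weight, y = height)) +
  geom_smooth(mapping = aes(x = weight, y = height),
              method = "lm", formula = y ~ x)

# layer_data() returns the computed data for one layer of a plot
same_points <- isTRUE(all.equal(layer_data(p_global, 1), layer_data(p_local, 1)))
same_smooth <- isTRUE(all.equal(layer_data(p_global, 2), layer_data(p_local, 2)))
```

If both flags are `TRUE`, the two ways of writing the plot draw the same points and the same fitted line; the only difference is how much typing each layer needs.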
--- background-image: url(img/ggplot2.png) background-position: 95% 7% background-size: 8% ### Your Turn (solution 1) ```r SmallNhanes %>% * ggplot(mapping = aes(x = weight, y = height)) + geom_point() + geom_smooth() ``` <img src="img/aes-in-ggplot2-sol-1.png" width="70%" height="70%" style="display: block; margin: auto;" /> --- background-image: url(img/ggplot2.png) background-position: 95% 7% background-size: 8% ### Your Turn (solution 2) ```r SmallNhanes %>% ggplot() + * geom_point(mapping = aes(x = weight, y = height)) + * geom_smooth(mapping = aes(x = weight, y = height)) ``` <img src="img/aes-in-geom-sol-1.png" width="70%" height="70%" style="display: block; margin: auto;" /> --- class: inverse, center background-image: url(img/ggplot2.png) background-position: 50% 60% background-size: 30% # Variables, Aesthetics, and Geoms --- background-image: url(img/ggplot2.png) background-position: 95% 7% background-size: 8% ### Variables, Aesthetics, and Geoms (1) Each graph needs a variable or value, an aesthetic, and a geom (the geometric object that draws it) -- ```r *geom_point(mapping = aes(x = weight, y = height)) + # layer 1 *geom_smooth(mapping = aes(x = weight, y = height)) # layer 2 ``` -- | variable | aesthetic | geom | |:---------:|:-------------:|:----------------:| | `weight` | position = `x`| dots = `point` | | `height` | position = `y`| dots = `point` | | `weight` | position = `x`| line = `smooth` | | `height` | position = `y`| line = `smooth` | -- Both layers use the same aesthetics! What if we added a layer with a variable mapped to a different aesthetic? 
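Before answering, the variable/aesthetic/geom table can be checked against a real plot object. This is a minimal sketch, assuming a hypothetical six-row sample (`nhanes_demo`) typed in from the `SmallNhanes` printout; it pulls the data behind the point layer and confirms that `weight` ended up in the `x` position and `height` in the `y` position.

```r
library(ggplot2)

# Hypothetical six-row sample, copied by hand from the SmallNhanes printout
nhanes_demo <- data.frame(
  weight = c(87.4, 86.7, 68.0, 75.7, 78.4, 74.7),
  height = c(164.7, 168.4, 169.5, 166.7, 181.9, 169.4)
)

p <- ggplot(nhanes_demo) +
  geom_point(mapping = aes(x = weight, y = height)) +        # layer 1
  geom_smooth(mapping = aes(x = weight, y = height),         # layer 2
              method = "lm", formula = y ~ x)                # lm: assumption for a tiny sample

# Data drawn by layer 1 (the points): one row per observation
points <- layer_data(p, 1)

# Each variable lands in the aesthetic column it was mapped to
x_is_weight <- isTRUE(all.equal(as.numeric(points$x), nhanes_demo$weight))
y_is_height <- isTRUE(all.equal(as.numeric(points$y), nhanes_demo$height))
```

The columns of `points` are named after aesthetics (`x`, `y`, `colour`, `size`, ...), not after the original variables, which is the same variable-to-aesthetic bookkeeping the table shows.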
--- background-image: url(img/ggplot2.png) background-position: 95% 7% background-size: 8% ### Variables, Aesthetics, and Geoms (2) But we can add *more* variables, map them to *different* aesthetics, and *add* another `geom` layer -- Add another layer, coloring the points by `gender` ```r SmallNhanes %>% ggplot() + * geom_point(mapping = aes(x = weight, y = height)) + * geom_point(mapping = aes(color = gender)) ``` -- | variable | aesthetic | geom | |:---------:|:--------------:|:----------------:| | `weight` | position = `x` | dots = `point` | | `height` | position = `y` | dots = `point` | | `gender` | color = `color`| dots = `point` | --- background-image: url(img/ggplot2.png) background-position: 95% 7% background-size: 8% ### Variables, Aesthetics, and Geoms (3) ### ERROR! ```r SmallNhanes %>% ggplot() + geom_point(mapping = aes(x = weight, y = height)) + geom_point(mapping = aes(color = gender)) *# <error/rlang_error> *# geom_point requires the following missing aesthetics: x and y ``` -- ### SOLUTION Every `geom` has required aesthetics (`x` and `y` for `geom_point()`), and the second layer has none to inherit. Map `x` and `y` globally so every layer inherits them ```r SmallNhanes %>% * ggplot(aes(x = weight, y = height)) + * geom_point(aes(color = gender)) ``` --- background-image: url(img/ggplot2.png) background-position: 95% 93% background-size: 4% ### Aesthetics: color ```r SmallNhanes %>% ggplot(aes(x = weight, y = height)) + * geom_point(aes(color = gender)) ``` <img src="img/color-1.png" width="70%" height="70%" style="display: block; margin: auto;" /> --- background-image: url(img/ggplot2.png) background-position: 95% 93% background-size: 4% ### Aesthetics: size ```r SmallNhanes %>% ggplot(aes(x = weight, y = height)) + * geom_point(aes(color = gender, size = gender)) ``` <img src="img/size-point-1.png" width="70%" height="70%" style="display: block; margin: auto;" /> --- background-image: url(img/ggplot2.png) background-position: 95% 93% background-size: 4% ### Aesthetics: shape ```r SmallNhanes %>% ggplot(aes(x = weight, y = height)) + * 
geom_point(aes(color = gender, size = gender, shape = gender)) ``` <img src="img/shape-point-1.png" width="70%" height="70%" style="display: block; margin: auto;" /> --- background-image: url(img/ggplot2.png) background-position: 95% 93% background-size: 4% ### Aesthetics: alpha (opacity) ```r SmallNhanes %>% ggplot(aes(x = weight, y = height)) + * geom_point(aes(color = gender, alpha = gender)) ``` <img src="img/alpha-point-1.png" width="70%" height="70%" style="display: block; margin: auto;" /> --- background-image: url(img/ggplot2.png) background-position: 95% 93% background-size: 4% # Aesthetic mappings <br> -- .pull-left[ ### A legend is included automatically ### Continuous variables work best with `size` ] -- .pull-right[ <img src="img/aes-settings.png" width="100%" height="100%" style="display: block; margin: auto;" /> ] --- background-image: url(img/ggplot2.png) background-position: 95% 93% background-size: 4% ## Setting values vs. mapping variables ### How can we create this plot? <img src="img/red-points-1-1.png" width="75%" height="75%" style="display: block; margin: auto;" /> --- background-image: url(img/ggplot2.png) background-position: 95% 93% background-size: 4% ### Inside `aes()` ```r SmallNhanes %>% ggplot(aes(x = weight, y = height)) + * geom_point(aes(color = "red")) # inside aes ``` <img src="img/inside-aes-no-eval-1.png" width="70%" height="70%" style="display: block; margin: auto;" /> --- background-image: url(img/ggplot2.png) background-position: 95% 93% background-size: 4% ### Outside `aes()` ```r SmallNhanes %>% ggplot(aes(x = weight, y = height)) + * geom_point(color = "red") # outside aes ``` <img src="img/red-points-1.png" width="70%" height="70%" style="display: block; margin: auto;" /> --- background-image: url(img/ggplot2.png) background-position: 95% 93% background-size: 4% ## What happened? `aes()` expects a variable, not a value, so `"red"` is treated as a one-level variable: the points get the default color and a legend. 
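The difference between mapping and setting also shows up in the data each layer draws, not just in the picture. The sketch below is an illustration under assumptions: `nhanes_demo` is a hypothetical six-row sample typed in from the `SmallNhanes` printout, and the object names are made up for this example.

```r
library(ggplot2)

# Hypothetical six-row sample, copied by hand from the SmallNhanes printout
nhanes_demo <- data.frame(
  weight = c(87.4, 86.7, 68.0, 75.7, 78.4, 74.7),
  height = c(164.7, 168.4, 169.5, 166.7, 181.9, 169.4)
)

base <- ggplot(nhanes_demo, aes(x = weight, y = height))

# Mapped: "red" is treated as data and passed through the color scale
inside <- base + geom_point(aes(color = "red"))

# Set: "red" is used directly as the color of every point
outside <- base + geom_point(color = "red")

# The colour column holds what is actually drawn for each point
inside_colours  <- unique(layer_data(inside, 1)$colour)
outside_colours <- unique(layer_data(outside, 1)$colour)
```

`outside_colours` is the literal string `"red"`, while `inside_colours` is whatever color the default scale assigns to a variable with a single level.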
```r SmallNhanes %>% ggplot(aes(x = weight, y = height)) + * geom_point(aes(color = "red")) # "value" in aes ``` -- <img src="img/inside-aes-no-eval-2-1.png" width="70%" height="70%" style="display: block; margin: auto;" /> --- class: inverse, center background-image: url(img/ggplot2.png) background-position: 50% 60% background-size: 30% # Geoms (geometric objects) --- background-image: url(img/ggplot2.png) background-position: 95% 93% background-size: 8% ## Geoms -- ### Geoms are the visual elements that represent the data in a graph -- ### Examples include: - `geom_boxplot` - `geom_col` - `geom_line` - `geom_smooth` -- ### See the cheatsheet for more examples: https://bit.ly/ggplot2-cheat --- background-image: url(img/ggplot2.png) background-position: 95% 7% background-size: 8% ## Your Turn -- ### *How does BMI vary across levels of self-reported general health?* -- ### Complete the code below: Map the variables locally inside the `geom_boxplot()` function ```r SmallNhanes %>% ggplot() + geom_boxplot(mapping = aes(x = __________, y = ___)) ``` --- background-image: url(img/ggplot2.png) background-position: 95% 93% background-size: 4% ```r SmallNhanes %>% ggplot() + * geom_boxplot(mapping = aes(x = health_gen, y = bmi)) ``` -- <img src="img/box-plot-show-1-1.png" width="80%" height="80%" style="display: block; margin: auto;" /> #### Box plots are great for seeing how a continuous variable varies across the levels of a categorical variable --- background-image: url(img/ggplot2.png) background-position: 95% 7% background-size: 8% ## Your Turn <br> -- ### Fill in the code below to change the colors in the boxplot for each level of `health_gen` ```r SmallNhanes %>% ggplot() + geom_boxplot( * aes(x = health_gen, y = bmi, _____ = health_gen)) ``` --- background-image: url(img/ggplot2.png) background-position: 95% 93% background-size: 4% ```r SmallNhanes %>% ggplot() + geom_boxplot( * aes(x = health_gen, y = bmi, color = health_gen)) ``` -- <img src="img/box-plot-color-1-1.png" 
width="80%" height="80%" style="display: block; margin: auto;" /> *Color is not the aesthetic we want here...* --- background-image: url(img/ggplot2.png) background-position: 95% 93% background-size: 4% ```r SmallNhanes %>% ggplot() + geom_boxplot( * aes(x = health_gen, y = bmi, fill = health_gen)) ``` -- <img src="img/box-plot-fill-1-1.png" width="80%" height="80%" style="display: block; margin: auto;" /> *Fill is better* --- background-image: url(img/ggplot2.png) background-position: 95% 7% background-size: 8% # Adding layers -- ### The 'infinitely extensible' part of `ggplot2` is where we start to really see its power -- ### Consider the relationship between `height` and `weight` again --- background-image: url(img/ggplot2.png) background-position: 95% 93% background-size: 7% ```r SmallNhanes %>% * ggplot(aes(x = weight, y = height)) + # global geom_point(aes(color = gender)) ``` -- <img src="img/layer-1-plot-1-1.png" width="80%" height="80%" style="display: block; margin: auto;" /> --- background-image: url(img/ggplot2.png) background-position: 95% 93% background-size: 7% ```r SmallNhanes %>% ggplot(aes(x = weight, y = height)) + geom_point(aes(color = gender)) + * geom_smooth(data = # data 2 * filter(SmallNhanes, gender == "male"), # layer 2 aes(x = weight, y = height), color = "blue") ``` -- <img src="img/layer-2-plot-1-1.png" width="77%" height="77%" style="display: block; margin: auto;" /> --- background-image: url(img/ggplot2.png) background-position: 95% 93% background-size: 7% ```r SmallNhanes %>% ggplot(aes(x = weight, y = height)) + geom_point(aes(color = gender)) + geom_smooth(data = filter(SmallNhanes, gender == "male"), aes(x = weight, y = height), color = "blue") + * geom_smooth(data = # data 3 * filter(SmallNhanes, gender == "female"), # layer 3 aes(x = weight, y = height), color = "red") ``` -- <img src="img/layer-3-plot-1-1.png" width="63%" height="36%" style="display: block; margin: auto;" /> --- class: inverse, center background-image: 
url(img/ggplot2.png) background-position: 50% 60% background-size: 30% # Facets --- background-image: url(img/ggplot2.png) background-position: 95% 7% background-size: 9% ## Facetting ### Facet layers display subplots for levels of categorical variables <br> | Facet layer | Display | |:-----------------------------|:-------------------------------------------| | `facet_wrap(. ~ gender)` | Plot for each level of `gender` | | `facet_wrap(race1 ~ gender)` | Plot for each combination of `race1` and `gender` | | `facet_wrap(. ~ gender, ncol = 1)` | Specify the number of columns | | `facet_wrap(. ~ gender, nrow = 1)` | Specify the number of rows | --- background-image: url(img/ggplot2.png) background-position: 95% 7% background-size: 9% ## Facet Single Variable ```r SmallNhanes %>% ggplot(aes(x = weight, y = height)) + geom_point(aes(color = gender)) + * facet_wrap(. ~ gender) ``` <img src="img/facet_wrap-1-1.png" width="72%" height="72%" style="display: block; margin: auto;" /> --- background-image: url(img/ggplot2.png) background-position: 95% 7% background-size: 9% ## Facet Two Variables ```r SmallNhanes %>% ggplot(aes(x = weight, y = height)) + geom_point(aes(color = gender)) + * facet_wrap(race1 ~ gender) ``` <img src="img/facet_wrap-2vars-1-1.png" width="72%" height="72%" style="display: block; margin: auto;" /> --- background-image: url(img/ggplot2.png) background-position: 95% 7% background-size: 9% ## Facet: Set Columns ```r SmallNhanes %>% ggplot(aes(x = weight, y = height)) + geom_point(aes(color = gender)) + * facet_wrap(race1 ~ gender, ncol = 5) ``` <img src="img/facet_wrap-cols-1-1.png" width="72%" height="72%" style="display: block; margin: auto;" /> --- background-image: url(img/ggplot2.png) background-position: 95% 7% background-size: 9% ## Facet: Set Rows ```r SmallNhanes %>% ggplot(aes(x = weight, y = height)) + geom_point(aes(color = gender)) + * facet_wrap(race1 ~ gender, nrow = 2) ``` <img src="img/facet_wrap-rows-1-1.png" width="72%" height="72%" 
style="display: block; margin: auto;" /> --- class: left, top background-image: url(img/ggplot2.png) background-position: 90% 7% background-size: 8% # Recap ### 1) Introduction to the grammar of graphics syntax ### 2) Identifying graph aesthetics (position, color, shape, opacity, etc.) ### 3) Recognizing and using `geoms` (`geom_point`, `geom_smooth`, etc.) ### 4) Facetting graphs (`facet_wrap` with one or two variables) --- background-image: url(img/ggplot2.png) background-position: 95% 7% background-size: 12% # More resources ## The [`ggplot2` book](https://ggplot2-book.org/) ## [`ggplot2`](https://ggplot2.tidyverse.org/) on the tidyverse website ## [Flowing Data](https://flowingdata.com/)
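---

## Putting it together

The recap items can be combined in one short script. This is only a sketch under stated assumptions: `toy` is a hypothetical data frame with made-up values standing in for `SmallNhanes`, and `method = "lm"` is swapped in for the default smoother because the toy data is tiny.

```r
library(ggplot2)
library(dplyr)

# Hypothetical stand-in for SmallNhanes (values are made up for illustration)
toy <- data.frame(
  gender = rep(c("female", "male"), each = 6),
  weight = c(55, 60, 62, 68, 72, 80, 65, 70, 78, 85, 90, 98),
  height = c(155, 158, 160, 163, 166, 170, 165, 170, 174, 178, 181, 186)
)

p <- toy %>%
  # global mapping: inherited by every layer
  ggplot(aes(x = weight, y = height)) +
  # layer 1: color is mapped to a variable, so it goes inside aes()
  geom_point(aes(color = gender)) +
  # layer 2: its own data (males only); color is set to a value, outside aes()
  geom_smooth(
    data = filter(toy, gender == "male"),
    method = "lm", formula = y ~ x, se = FALSE,
    color = "blue"
  ) +
  # facet: one panel per level of gender
  facet_wrap(. ~ gender)

# Panels used by each layer (layer 2 only has data for one panel)
point_panels <- unique(layer_data(p, 1)$PANEL)
smooth_panels <- unique(layer_data(p, 2)$PANEL)

print(p)
```

Because the second layer is given only the male rows, its line appears in one panel, while the points fill both.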