Under construction
Welcome! These first lessons will help you tell different file types apart, understand absolute vs. relative folder paths, and install R and RStudio.
This week we will cover the RStudio IDE. Most of your work in the course will be done in this environment, so it’s important to know how to navigate the different panes and features.
Under construction
I’ve written a small R package, goodenuffR, to help organize your code, data, and outputs. It’s based on the excellent paper, Good enough practices in scientific computing.
goodenuffR: an introduction to the goodenuffR package
Now that we’ve covered file organization and the RStudio IDE, we can move on to some R programming. These exercises cover basic R structures (vectors and functions), and help you understand how to extract pieces of your data in R.
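Here is a minimal sketch of what these exercises look like in practice. The vector `heights` and the function `cm_to_m()` are made-up names for illustration, not part of the course materials.

```r
# A named numeric vector (made-up values for illustration)
heights <- c(ana = 160, ben = 175, cy = 182)

# A small function that converts centimeters to meters
cm_to_m <- function(cm) {
  cm / 100
}

# Three common ways to extract pieces of a vector
heights[2]               # by position
heights["cy"]            # by name
heights[heights > 170]   # by logical condition

# Functions in R usually work on a whole vector at once
cm_to_m(heights)
```

Square brackets do the extracting in every case; only what goes inside them (a position, a name, or a logical test) changes.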
These lessons cover how to use R Markdown, a “notebook interface to weave together narrative text and code to produce elegantly formatted output.”
So far, we’ve covered some basic R syntax. You’ve seen common R data objects (vectors, matrices, arrays, data frames, tibbles, and lists) and functions (str(), typeof(), class(), etc.).
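As a quick refresher, the sketch below runs these three functions on a vector and a data frame. The objects `x` and `df` are invented for this example.

```r
# A numeric vector and a small data frame (illustrative values)
x  <- c(1.5, 2, 3)
df <- data.frame(id = 1:3, score = x)

typeof(x)    # "double": how R stores the values internally
class(x)     # "numeric": how R treats the object
typeof(df)   # "list": a data frame is a list of equal-length columns
class(df)    # "data.frame"

# str() prints a compact summary of an object's structure
str(df)
```

Comparing `typeof()` and `class()` on the same object is a handy way to see the difference between how R stores something and how it behaves.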
The tidyverse is an “opinionated collection of R packages designed for data science. All packages share an underlying design philosophy, grammar, and data structures.” Ninety percent of the code you’ll see in this course uses the tidyverse philosophy, which makes it easier to write (and read).
Now that we’ve introduced the tidyverse (and the %>%), we’re going to extend this knowledge to ggplot2, a popular data visualization package.
Data are rarely collected in a way that’s fit for analysis or data visualization. In this lesson we introduce dplyr, a package designed to handle “most common data manipulation challenges.”
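To give a flavor of what that means, here is a small sketch using the built-in `mtcars` data set. The choice of data and the `kpl` column are assumptions made for this example, not necessarily what the lesson uses.

```r
library(dplyr)

# Keep four-cylinder cars, add a kilometers-per-liter column,
# then summarise the result in a single row
four_cyl <- mtcars %>%
  filter(cyl == 4) %>%
  mutate(kpl = mpg * 0.425) %>%   # approximate mpg-to-km/l conversion
  summarise(n = n(), mean_kpl = mean(kpl))

four_cyl
```

Each verb does one job (`filter()` picks rows, `mutate()` adds columns, `summarise()` collapses rows), and the `%>%` passes the result of one step to the next.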
This lesson covers intermediate data visualization techniques like plotting trends (geom_line()), adding help-text to your graph (ggplot2::annotate()), labeling values to stand out (ggrepel::geom_label_repel()), and using reference lines for comparisons (ggplot2::geom_hline()).
tidyr has functions for generating sequences of data (full_seq()), creating an output for each combination of any input (expand_grid()), and completing missing combinations of data (complete()).
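The sketch below shows each of these three functions on tiny inputs. The `sales` data frame and its values are made up for illustration.

```r
library(tidyr)

# full_seq(): fill the gaps in a numeric sequence (here, the missing 3 and 4)
filled <- full_seq(c(1, 2, 5), period = 1)

# expand_grid(): one row for every combination of the inputs
grid <- expand_grid(year = 2019:2020, group = c("a", "b"))

# complete(): add rows for combinations that are missing from the data
sales <- data.frame(
  year  = c(2019, 2019, 2020),
  group = c("a", "b", "a"),
  n     = c(10, 5, 7)
)
completed <- complete(sales, year, group)

filled
grid
completed   # the 2020 / "b" row is added, with n set to NA
```

By default `complete()` fills the new rows with `NA`; its `fill` argument lets you supply a different value if that suits your data better.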
This lesson builds on our previous knowledge of ggplot2 and introduces facets (small multiples) with ggplot2::facet_wrap(), ggforce::facet_wrap_paginate(), and geofacet::facet_geo().
Now that we’ve covered how to manipulate and restructure data, we’ll use R’s functions for creating sequences (seq()), along with some of tidyr’s functions for dealing with missing or incomplete data (full_seq(), expand_grid(), and complete()).
Maps are a great way to give context to any data source. In this lesson, we’re going to collect data with the rtweet package and use it to build a map with ggplot2.
Under construction
Not every project requires a data visualization, so this week we’ll cover how to display data in tables.
purrr
magick
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.