1 Texas death row executed offenders website

This section continues with data from the Texas Department of Criminal Justice, which keeps records of every inmate executed.

1.1 The data

These data come from the .Rmd we used to scrape the website, and they are stored in the folder below.

fs::dir_tree("../data/wk10-dont-mess-with-texas/")
## ../data/wk10-dont-mess-with-texas/
## ├── 2021-11-21-ExecutedOffenders.csv
## ├── 2021-11-30-ExecutedOffenders.csv
## └── processed
##     ├── 2021-11-21
##     │   ├── 2021-11-21-ExExOffndrshtml.csv
##     │   ├── 2021-11-21-ExExOffndrsjpg.csv
##     │   └── ExOffndrsComplete.csv
##     └── 2021-11-30
##         └── ExOffndrsComplete.csv

This will import the most recent data.

# fs::dir_ls("data/processed/2021-10-25")
ExecOffenders <- readr::read_csv("https://bit.ly/2Z7pKTI")
ExOffndrsComplete <- readr::read_csv("https://bit.ly/3oLZdEm")
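
If the files are read from the local folder rather than from the links above, the most recent file can be picked by its datestamp. The sketch below is only an illustration: it assumes every file name starts with a YYYY-MM-DD stamp (as in the tree above), and the temporary folder and empty files are stand-ins created just for this example.

```r
# Hypothetical stand-in for the data folder: a temporary directory holding
# two empty files named like the ones in the directory tree above.
data_dir <- fs::path_temp("wk10-demo")
fs::dir_create(data_dir)
fs::file_create(fs::path(data_dir, c("2021-11-21-ExecutedOffenders.csv",
                                     "2021-11-30-ExecutedOffenders.csv")))

# List the .csv files and read the leading YYYY-MM-DD stamp from each name.
csv_files <- fs::dir_ls(data_dir, glob = "*.csv")
file_dates <- as.Date(substr(fs::path_file(csv_files), 1, 10))

# Keep the file with the latest date.
most_recent <- csv_files[file_dates == max(file_dates)]
fs::path_file(most_recent)
```

The resulting path could then be passed to readr::read_csv() in place of a hard-coded file name.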

1.2 Use purrr and dplyr to split and export .csv files

This next use of purrr and iteration will cover how to:

  1. Split the ExecOffenders data frame into ExExOffndrshtml and ExExOffndrsjpg

  2. Save each of these data frames as .csv files

We should have two datasets with the following counts.

ExecOffenders %>% 
  dplyr::count(jpg_html, sort = TRUE)

The next step uses functions that dplyr marks as experimental (such as dplyr::group_split()), and a big shout-out goes to Luis Verde Arregoitia for his post on a similar topic.

dplyr::group_split() “returns a list of tibbles. Each tibble contains the rows of .tbl for the associated group and all the columns, including the grouping variables”. I combine it with purrr::walk() and readr::write_csv() to export each tibble as its own .csv file.

ExecOffenders %>% 
  dplyr::group_by(jpg_html) %>% 
  dplyr::group_split() %>% 
  # .x is the tibble for one jpg_html group; we carry it everywhere we 
  # want it to go.
  purrr::walk(~.x %>% 
                # `file` replaces the deprecated `path` argument
                readr::write_csv(file = paste0("../data/", 
                            # processed data folder
                            "wk10-dont-mess-with-texas/processed/",
                            # dated folder (must already exist)
                            lubridate::today(),
                            # path separator
                            "/",
                            # datestamp prefix for the file name
                            lubridate::today(),
                            # name of file
                            "-ExExOffndrs",
                            # split by this variable
                            base::unique(.x$jpg_html), 
                            # file extension
                            ".csv")))
fs::dir_tree("../data/wk10-dont-mess-with-texas/processed/")
## ../data/wk10-dont-mess-with-texas/processed/
## ├── 2021-11-21
## │   ├── 2021-11-21-ExExOffndrshtml.csv
## │   ├── 2021-11-21-ExExOffndrsjpg.csv
## │   └── ExOffndrsComplete.csv
## └── 2021-11-30
##     ├── 2021-11-30-ExExOffndrshtml.csv
##     ├── 2021-11-30-ExExOffndrsjpg.csv
##     └── ExOffndrsComplete.csv
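
The same split-and-export pattern can be tried on a small data frame and then checked by reading the files back. This is a minimal sketch, not the code used for the real data: `toy` and its values are made up, and the output goes to a temporary folder rather than the project's processed folder. It also creates the dated folder first, because readr::write_csv() does not create missing folders.

```r
library(dplyr)  # attached for the %>% pipe

# Toy stand-in for ExecOffenders (hypothetical values); only jpg_html matters.
toy <- tibble::tibble(
  id = 1:5,
  jpg_html = c("html", "jpg", "html", "html", "jpg")
)

# Create the dated output folder before writing into it.
out_dir <- fs::path(tempdir(), "processed", as.character(Sys.Date()))
fs::dir_create(out_dir)

# One tibble per jpg_html value, each written to its own .csv file.
toy %>%
  dplyr::group_by(jpg_html) %>%
  dplyr::group_split() %>%
  purrr::walk(~ readr::write_csv(
    .x,
    file = fs::path(out_dir, paste0("ExExOffndrs", unique(.x$jpg_html), ".csv"))
  ))

# Read each exported file back and count its rows.
exported <- fs::dir_ls(out_dir, glob = "*.csv")
rows_out <- purrr::map_int(
  exported,
  ~ nrow(readr::read_csv(.x, show_col_types = FALSE))
)
rows_out
```

The row counts in `rows_out` should agree with `dplyr::count(toy, jpg_html)`, which is the same check the counts in the previous section provide for the real export.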

1.2.1 End