This continues with data from the Texas Department of Criminal Justice, which keeps records of every inmate executed.
These data come from the .Rmd we used to scrape the website, and they are stored in the folder shown below.
fs::dir_tree("../data/wk10-dont-mess-with-texas/")
## ../data/wk10-dont-mess-with-texas/
## ├── 2021-11-21-ExecutedOffenders.csv
## ├── 2021-11-30-ExecutedOffenders.csv
## └── processed
##     ├── 2021-11-21
##     │   ├── 2021-11-21-ExExOffndrshtml.csv
##     │   ├── 2021-11-21-ExExOffndrsjpg.csv
##     │   └── ExOffndrsComplete.csv
##     └── 2021-11-30
##         └── ExOffndrsComplete.csv
This will import the most recent data.
# fs::dir_ls("data/processed/2021-10-25")
ExecOffenders <- readr::read_csv("https://bit.ly/2Z7pKTI")
ExOffndrsComplete <- readr::read_csv("https://bit.ly/3oLZdEm")
purrr and dplyr to split and export .csv files

This next use of purrr and iteration will cover how to:

Split the ExecOffenders data frame into ExExOffndrshtml and ExExOffndrsjpg
Save each of these data frames as .csv files

We should have two datasets with the following counts.
ExecOffenders %>%
dplyr::count(jpg_html, sort = TRUE)
The functions used here are new and experimental in dplyr, and a big shout out to Luis Verde Arregoitia for his post on a similar topic.
dplyr::group_split() “returns a list of tibbles. Each tibble contains the rows of .tbl for the associated group and all the columns, including the grouping variables”. I combine it with purrr::walk() and readr::write_csv() to export each group to its own file.
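
To see what group_split() hands to purrr::walk() before anything is written to disk, here is a minimal sketch on a toy tibble. The `toy` data and its values are hypothetical stand-ins, not the TDCJ records; only the `jpg_html` column name is taken from the data above.

```r
library(dplyr)

# Toy data standing in for ExecOffenders (hypothetical values)
toy <- tibble::tibble(
  id = 1:5,
  jpg_html = c("html", "jpg", "html", "jpg", "html")
)

# Split into a list with one tibble per level of jpg_html
toy_split <- toy %>%
  dplyr::group_by(jpg_html) %>%
  dplyr::group_split()

# Number of tibbles in the list
length(toy_split)

# Rows in each tibble; these should match dplyr::count(toy, jpg_html)
purrr::map_int(toy_split, nrow)

# Each tibble keeps the grouping column, so it can be used to name the file
purrr::map_chr(toy_split, ~ unique(.x$jpg_html))
```

Because each piece still carries its `jpg_html` value, that value can be pasted into the file name when the pieces are exported.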
ExecOffenders %>%
  dplyr::group_by(jpg_html) %>%
  dplyr::group_split() %>%
  # we now carry this little .x everywhere we want it to go
  purrr::walk(~ .x %>%
    # `file` replaces the deprecated `path` argument (readr >= 1.4.0);
    # the dated folder must already exist, write_csv() will not create it
    readr::write_csv(file = paste0(
      "../data/",
      # processed data folder
      "wk10-dont-mess-with-texas/processed/",
      # dated folder
      base::noquote(lubridate::today()),
      # path separator
      "/",
      # datestamp
      base::noquote(lubridate::today()),
      # name of file
      "-ExExOffndrs",
      # split by this variable
      base::unique(.x$jpg_html),
      # file extension
      ".csv"
    )))
fs::dir_tree("../data/wk10-dont-mess-with-texas/processed/")
## ../data/wk10-dont-mess-with-texas/processed/
## ├── 2021-11-21
## │   ├── 2021-11-21-ExExOffndrshtml.csv
## │   ├── 2021-11-21-ExExOffndrsjpg.csv
## │   └── ExOffndrsComplete.csv
## └── 2021-11-30
##     ├── 2021-11-30-ExExOffndrshtml.csv
##     ├── 2021-11-30-ExExOffndrsjpg.csv
##     └── ExOffndrsComplete.csv
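
The export above relies on the dated folder already existing. As a hedged variation, the sketch below creates the folder first and computes the datestamp once, so the folder name and the file names cannot disagree. It writes toy data to a session temp directory; `toy`, `stamp`, and `out_dir` are hypothetical names introduced for illustration, not part of the original project.

```r
library(dplyr)

# Toy data standing in for ExecOffenders (hypothetical values)
toy <- tibble::tibble(
  id = 1:5,
  jpg_html = c("html", "jpg", "html", "jpg", "html")
)

# Compute the datestamp once and reuse it for the folder and the file names
stamp <- as.character(lubridate::today())

# Hypothetical output location: a dated subfolder of the temp directory
out_dir <- file.path(tempdir(), stamp)
fs::dir_create(out_dir)  # does nothing if the folder already exists

toy %>%
  dplyr::group_by(jpg_html) %>%
  dplyr::group_split() %>%
  purrr::walk(~ readr::write_csv(
    .x,
    file = file.path(
      out_dir,
      paste0(stamp, "-ExExOffndrs", unique(.x$jpg_html), ".csv")
    )
  ))

# List what was written
fs::dir_ls(out_dir)
```

Swapping `out_dir` for the project's processed folder would give the same layout as the tree above, with the folder created on the first run of the day.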