Motivation
The goal of dfdiffs is to answer the following questions:
- What rows are here now that weren’t here before?
- What rows were here before that aren’t here now?
- What values have been changed?
Test data
These are two versions of the Master table from the Lahman baseball database.
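# Sample 3,000 rows (without replacement) from the first table
m15 <- dfdiffs::master15 |>
  dplyr::slice_sample(n = 3000, replace = FALSE)
# Latest debut date in the sample
max(m15$debut, na.rm = TRUE)
#> [1] "2015-09-27"

# Sample 3,000 rows (without replacement) from the second table
m20 <- dfdiffs::master20 |>
  dplyr::slice_sample(n = 3000, replace = FALSE)
# Latest debut date in the sample
max(m20$debut, na.rm = TRUE)
#> [1] "2019-09-09"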
The compare_data() function
compare_data(compare = , base = , by = , by_col = , cols = )
comparisons <- compare_data(
  compare = m20, base = m15,
  by = "playerID", by_col = "join",
  cols = c("nameFirst", "nameLast", "nameGiven", "height")
)
names(comparisons)
#> [1] "new_data" "deleted_data" "changed_num_diffs"
#> [4] "changed_var_diffs"
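The four elements appear to line up with the questions in the motivation: rows only in `compare`, rows only in `base`, and values that differ for keys present in both. The following is a minimal base R sketch of that idea on toy data. It is not the dfdiffs implementation; the data frames, the `differs()` helper, and the column names are hypothetical and exist only for illustration.

```r
# Toy stand-ins for the base and compare tables (hypothetical data).
base <- data.frame(
  id     = c("a", "b", "c", "d"),
  name   = c("Ann", "Bob", "Cy", "Dee"),
  height = c(70, NA, 72, 68),
  stringsAsFactors = FALSE
)
compare <- data.frame(
  id     = c("b", "c", "d", "e"),
  name   = c("Bob", "Cyrus", "Dee", "Eve"),
  height = c(71, 72, 68, 66),
  stringsAsFactors = FALSE
)

# Rows here now that weren't here before: keys found only in `compare`.
new_rows <- compare[!compare$id %in% base$id, ]

# Rows here before that aren't here now: keys found only in `base`.
deleted_rows <- base[!base$id %in% compare$id, ]

# Values that changed: align the rows present in both tables by key.
common <- intersect(base$id, compare$id)
b <- base[match(common, base$id), ]
k <- compare[match(common, compare$id), ]

# NA-aware inequality (hypothetical helper): NA vs. a value counts as a
# change, NA vs. NA does not.
differs <- function(x, y) {
  (is.na(x) != is.na(y)) | (!is.na(x) & !is.na(y) & x != y)
}

# One row per changed cell, in long format.
changed <- do.call(rbind, lapply(c("name", "height"), function(col) {
  hit <- differs(b[[col]], k[[col]])
  data.frame(
    variable = rep(col, sum(hit)),
    id       = common[hit],
    base     = as.character(b[[col]][hit]),
    compare  = as.character(k[[col]][hit]),
    stringsAsFactors = FALSE
  )
}))
```

A plain `x != y` would return `NA` wherever either side is missing, so the sketch checks missingness explicitly. That matters for cases like a height going from `NA` to a recorded value.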
$new_data
head(comparisons$new_data)
| join | nameFirst | nameLast | nameGiven | height |
|---|---|---|---|---|
| burnspe01 | Pete | Burnside | Peter Willits | 74 |
| taylodu01 | Dummy | Taylor | Luther Haden | 73 |
| mechegi01 | Gil | Meche | Gilbert Allen | 75 |
| wagnele01 | Leon | Wagner | Leon Lamar | 73 |
| sturtta01 | Tanyon | Sturtze | Tanyon James | 77 |
| stewafr01 | Frank | Stewart | Frank | 73 |
$deleted_data
head(comparisons$deleted_data)
| join | nameFirst | nameLast | nameGiven | height |
|---|---|---|---|---|
| doylede01 | Denny | Doyle | Robert Dennis | 69 |
| saundde01 | Dennis | Saunders | Dennis James | 75 |
| baylodo01 | Don | Baylor | Don Edward | 73 |
| bradyst01 | Steve | Brady | Stephen A. | 69 |
| jordasl01 | Slats | Jordan | Clarence Veasey | 73 |
| crozier01 | Eric | Crozier | Eric Le Roi | 76 |
$changed_num_diffs
comparisons$changed_num_diffs
| variable | no_of_differences |
|---|---|
| nameFirst | 3 |
| nameGiven | 5 |
| height | 7 |
$changed_var_diffs
comparisons$changed_var_diffs
| variable | join | base | compare |
|---|---|---|---|
| nameFirst | couloda01 | Daniel | Danny |
| nameFirst | dorseje01 | Jerry | Joseph |
| nameFirst | reynoch01 | Charlie | Thomas |
| nameGiven | davisju01 | James J. | James Joseph |
| nameGiven | dorseje01 | Michael Jeremiah | Joseph Wilbur |
| nameGiven | jonesad01 | Adam La Marque | Adam LaMarque |
| nameGiven | reynoch01 | Charles E. | Thomas Hart |
| nameGiven | zimmejo02 | Jordan M. | Jordan Michael |
| height | craigge01 | NA | 71 |
| height | feldmsc01 | 79 | 78 |
| height | maddefr01 | NA | 68 |
| height | novaiv01 | 76 | 77 |
| height | pressry01 | 75 | 74 |
| height | thompta01 | 77 | 76 |
| height | uptonju01 | 74 | 73 |
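The per-variable counts in `$changed_num_diffs` match a simple tally of the rows in `$changed_var_diffs`. The sketch below checks that by hand. It retypes the `variable` column from the output above rather than calling the package, and it makes no claim about how dfdiffs derives the summary internally; `variable`, `counts`, and `num_diffs` are names introduced here for illustration.

```r
# The `variable` column of changed_var_diffs, retyped by hand from the
# output above (3 nameFirst rows, 5 nameGiven rows, 7 height rows).
variable <- c(rep("nameFirst", 3), rep("nameGiven", 5), rep("height", 7))

# Count rows per variable, keeping the order of first appearance
# instead of table()'s default alphabetical order.
counts <- table(factor(variable, levels = unique(variable)))

num_diffs <- data.frame(
  variable          = names(counts),
  no_of_differences = as.integer(counts),
  stringsAsFactors  = FALSE
)
num_diffs
```

`nameLast` is absent from both tables because no last names differed between the two samples.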