Batch-rename files
batch-rename.Rmd
This vignette covers ger_batch_rename(), a function that renames a
folder of files using a standardized format.
Test files
A folder of example files is available in the gerp package for
demonstration. Below we copy them to a temporary folder:
tmp_dir <- tempdir()
tmp_files <- fs::as_fs_path(paste0(tmp_dir, "/", "filenames"))
# copy the example files into the temporary folder
fs::dir_copy(
  path = system.file("filenames", package = "gerp"),
  new_path = tmp_files)
You can confirm the copy with gerp::ger_path().
ger_batch_rename()
The files in the temporary filenames/ folder are a mess, and I want to
rename them using a standardized format. Renaming files manually is
time-consuming, so I’ve written a handy function that gets you about 90%
of the way there by putting every file name into a standardized format:
all file names are converted to lowercase, with words separated by hyphens
all punctuation and special characters are removed from file names
file names get a date prefix (the "birth" date in this example; other options are "modification", "access", or "change")
I’ll demonstrate below:
gerp::ger_batch_rename(path = tmp_files, prefix = "birth")
And confirm with gerp::ger_path():
# confirm
gerp::ger_path(
  path = paste0(tmp_dir, "/", "filenames"),
  tree = TRUE)
#> /tmp/RtmpRYjGHj/filenames
#> ├── 2023-04-10_fig-2.png
#> ├── 2023-04-10_figure-1.png
#> ├── 2023-04-10_joe-s-filenames-use-space-and-punctuation.xlsx
#> ├── 2023-04-10_jw-7-d-2-sl-deletethisandyourcareerisover-wx-2.txt
#> └── 2023-04-10_myabstract.docx
The following sections of this vignette walk through each step to
give a better picture of how ger_batch_rename() works.
Step by step
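I’ll reset the tmp_files folder so it again holds the original (messy)
file names:
tmp_dir <- tempdir()
tmp_files <- fs::as_fs_path(paste0(tmp_dir, "/", "filenames"))
# remove the folder renamed above so only the original names remain
if (fs::dir_exists(tmp_files)) fs::dir_delete(tmp_files)
# copy
fs::dir_copy(
  path = system.file("filenames", package = "gerp"),
  new_path = tmp_files)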
And confirm this with fs::dir_tree():
# confirm
fs::dir_tree(tmp_dir)
#> /tmp/RtmpRYjGHj
#> ├── file19b7185ace52
#> └── filenames
#>     ├── JW7d^(2sl@deletethisandyourcareerisoverWx2*.txt
#>     ├── Joe's Filenames Use Space and Punctuation.xlsx
#>     ├── fig 2.png
#>     ├── figure 1.png
#>     └── myabstract.docx
Get tibble of file names
get_files_tbl() creates a tibble of file paths, names, and date
variables:
files_tmp <- get_files_tbl(path = tmp_files)
dplyr::glimpse(files_tmp)
#> Rows: 5
#> Columns: 9
#> $ file_path <chr> "/tmp/RtmpRYjGHj/filenames/JW7d^(2sl@deletethisandyourc…
#> $ file_folder <chr> "/tmp/RtmpRYjGHj/filenames", "/tmp/RtmpRYjGHj/filenames…
#> $ file_full_name <chr> "JW7d^(2sl@deletethisandyourcareerisoverWx2*.txt", "Joe…
#> $ file_extension <chr> ".txt", ".xlsx", ".png", ".png", ".docx"
#> $ file_name <chr> "JW7d^(2sl@deletethisandyourcareerisoverWx2*", "Joe's F…
#> $ modification <date> 2023-04-10, 2023-04-10, 2023-04-10, 2023-04-10, 2023-04…
#> $ access <date> 2023-04-10, 2023-04-10, 2023-04-10, 2023-04-10, 2023-0…
#> $ change <date> 2023-04-10, 2023-04-10, 2023-04-10, 2023-04-10, 2023-0…
#> $ birth <date> 2023-04-10, 2023-04-10, 2023-04-10, 2023-04-10, 2023-0…
files_tmp$file_full_name[1]
#> [1] "JW7d^(2sl@deletethisandyourcareerisoverWx2*.txt"
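For readers curious how such a table can be assembled, here is a minimal base-R sketch. It is not gerp's implementation: `sketch_files_tbl()` is a hypothetical helper, and only the name and extension columns shown above are reproduced (the date columns are left out).

```r
# Hypothetical helper (not part of gerp): list the files in a folder and
# split each path into folder, full name, extension, and bare name.
sketch_files_tbl <- function(path) {
  file_path <- list.files(path, full.names = TRUE)
  file_path <- file_path[!dir.exists(file_path)]  # keep files, drop sub-folders
  ext <- tools::file_ext(file_path)
  data.frame(
    file_path      = file_path,
    file_folder    = dirname(file_path),
    file_full_name = basename(file_path),
    file_extension = ifelse(nzchar(ext), paste0(".", ext), ""),
    file_name      = tools::file_path_sans_ext(basename(file_path)),
    stringsAsFactors = FALSE
  )
}

# Usage on a throwaway folder with two empty files
demo_dir <- tempfile("sketch-files-")
dir.create(demo_dir)
invisible(file.create(file.path(demo_dir, c("fig 2.png", "myabstract.docx"))))
sketch_tbl <- sketch_files_tbl(demo_dir)
sketch_tbl$file_full_name
#> [1] "fig 2.png"       "myabstract.docx"
```

Keeping the extension in its own column makes it easy to clean only the bare name and re-attach the extension afterwards.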
Date prefix
Decide which date to use ("modification", "access", "change", or "birth").
The date_prefix column will be added to the cleaned file names.
file_date_tmp <- get_date_prefix(files = files_tmp)
dplyr::glimpse(file_date_tmp)
#> Rows: 5
#> Columns: 6
#> $ file_path <chr> "/tmp/RtmpRYjGHj/filenames/JW7d^(2sl@deletethisandyourc…
#> $ file_folder <chr> "/tmp/RtmpRYjGHj/filenames", "/tmp/RtmpRYjGHj/filenames…
#> $ file_full_name <chr> "JW7d^(2sl@deletethisandyourcareerisoverWx2*.txt", "Joe…
#> $ file_extension <chr> ".txt", ".xlsx", ".png", ".png", ".docx"
#> $ file_name <chr> "JW7d^(2sl@deletethisandyourcareerisoverWx2*", "Joe's F…
#> $ date_prefix <chr> "2023-04-10_", "2023-04-10_", "2023-04-10_", "2023-04-1…
file_date_tmp$date_prefix[1]
#> [1] "2023-04-10_"
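A prefix in this shape can be derived from a file's timestamps. The sketch below uses base R's `file.info()` and is only an illustration under that assumption; `sketch_date_prefix()` is a hypothetical helper, and gerp may read the dates differently. Base R has no portable birth time, so only the other three options are covered.

```r
# Hypothetical helper (not part of gerp): build a "YYYY-MM-DD_" prefix
# from one of the timestamps reported by file.info().
#   mtime = last modification
#   atime = last access
#   ctime = last status change on Unix, creation time on Windows
sketch_date_prefix <- function(path, which = c("mtime", "atime", "ctime")) {
  which <- match.arg(which)
  stamp <- file.info(path)[[which]]
  # format() renders the timestamp in the local time zone
  paste0(format(stamp, "%Y-%m-%d"), "_")
}

# Usage on a freshly written temporary file
demo_file <- tempfile(fileext = ".txt")
writeLines("example", demo_file)
demo_prefix <- sketch_date_prefix(demo_file, "mtime")
demo_prefix
```

The trailing underscore keeps the date visually separate from the hyphenated file name that follows it.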
Clean file name
The clean_file_name column combines the date_prefix with the lowercased,
hyphen-separated file name, with all punctuation and special characters
removed.
clean_tbl <- get_clean_file_names(file_date_tmp, file_name)
dplyr::glimpse(clean_tbl)
#> Rows: 5
#> Columns: 4
#> $ clean_file_name <chr> "2023-04-10_jw-7-d-2-sl-deletethisandyourcareerisover-…
#> $ clean_file_path <chr> "/tmp/RtmpRYjGHj/filenames/2023-04-10_jw-7-d-2-sl-dele…
#> $ file_full_name <chr> "JW7d^(2sl@deletethisandyourcareerisoverWx2*.txt", "Jo…
#> $ file_path <chr> "/tmp/RtmpRYjGHj/filenames/JW7d^(2sl@deletethisandyour…
clean_tbl$clean_file_name[1]
#> [1] "2023-04-10_jw-7-d-2-sl-deletethisandyourcareerisover-wx-2.txt"
clean_tbl$file_full_name[1]
#> [1] "JW7d^(2sl@deletethisandyourcareerisoverWx2*.txt"
NOTE: ger_batch_rename() only reformats file names. It can’t change the
content of the file names, that is, the words themselves (and you
wouldn’t want it to).
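The cleaning rules can be approximated with a few regular expressions. This is a minimal sketch, not gerp's implementation: `sketch_clean_name()` is a hypothetical helper whose rules were inferred from the example output above (split at case and letter/digit boundaries, turn everything else into hyphens, lowercase). It handles ASCII letters and digits only.

```r
# Hypothetical helper (not part of gerp): approximate the cleaned names
# shown above using base-R regular expressions.
sketch_clean_name <- function(file_name, date_prefix, extension) {
  x <- gsub("([a-z])([A-Z])", "\\1-\\2", file_name, perl = TRUE)  # overWx -> over-Wx
  x <- gsub("([A-Za-z])([0-9])", "\\1-\\2", x, perl = TRUE)       # W7 -> W-7
  x <- gsub("([0-9])([A-Za-z])", "\\1-\\2", x, perl = TRUE)       # 7d -> 7-d
  x <- gsub("[^A-Za-z0-9]+", "-", x, perl = TRUE)                 # punctuation, spaces -> "-"
  x <- gsub("^-+|-+$", "", x, perl = TRUE)                        # trim stray hyphens
  paste0(date_prefix, tolower(x), extension)
}

cleaned <- sketch_clean_name(
  file_name = c("JW7d^(2sl@deletethisandyourcareerisoverWx2*",
                "Joe's Filenames Use Space and Punctuation"),
  date_prefix = "2023-04-10_",
  extension = c(".txt", ".xlsx")
)
cleaned
#> [1] "2023-04-10_jw-7-d-2-sl-deletethisandyourcareerisover-wx-2.txt"
#> [2] "2023-04-10_joe-s-filenames-use-space-and-punctuation.xlsx"
```

Note that the run-together words in the first name stay run together: pattern-based cleaning can standardize separators and case, but it has no way of knowing where words begin and end.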
Rename all files
Now we place the new and old names in their own vectors and pass them
to rename_all_files():
new_names <- clean_tbl[['clean_file_path']]
old_names <- clean_tbl[['file_path']]
rename_all_files(old = old_names, new = new_names)
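The same step can be done with base R's vectorized `file.rename()`. The sketch below is an assumption about what a safe rename might look like, not a description of how rename_all_files() works internally; `sketch_rename_all()` is a hypothetical helper that refuses to overwrite existing files.

```r
# Hypothetical helper (not part of gerp): rename files pairwise and stop
# before touching anything if a target would collide.
sketch_rename_all <- function(old, new) {
  stopifnot(length(old) == length(new))
  if (anyDuplicated(new) > 0) {
    stop("Two or more files would get the same new name.")
  }
  clash <- file.exists(new) & new != old
  if (any(clash)) {
    stop("Refusing to overwrite: ", paste(new[clash], collapse = ", "))
  }
  invisible(file.rename(from = old, to = new))
}

# Usage on a throwaway folder
rename_dir <- tempfile("sketch-rename-")
dir.create(rename_dir)
old_demo <- file.path(rename_dir, c("fig 2.png", "figure 1.png"))
invisible(file.create(old_demo))
new_demo <- file.path(rename_dir, c("2023-04-10_fig-2.png", "2023-04-10_figure-1.png"))
sketch_rename_all(old = old_demo, new = new_demo)
list.files(rename_dir)
#> [1] "2023-04-10_fig-2.png"    "2023-04-10_figure-1.png"
```

Checking for collisions up front matters for batch renames, because two messy names can clean to the same result.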
We can confirm a final time with gerp::ger_path():
gerp::ger_path(
  path = paste0(tmp_dir, "/", "filenames"),
  tree = TRUE)
#> /tmp/RtmpRYjGHj/filenames
#> ├── 2023-04-10_fig-2.png
#> ├── 2023-04-10_figure-1.png
#> ├── 2023-04-10_joe-s-filenames-use-space-and-punctuation.xlsx
#> ├── 2023-04-10_jw-7-d-2-sl-deletethisandyourcareerisover-wx-2.txt
#> └── 2023-04-10_myabstract.docx