Directories

Published

2025-01-25

Caution

This section is being revised. Thank you for your patience.

When presented with a new map, the most important thing to find is your location on it (it’s hard to know where you’re going without knowing where you are). In Linux, directories are more than just containers for files; they are a critical part of the hierarchical file system, which is organized in a tree-like structure starting from the root directory (/).

Root Directory

The root directory is the topmost directory in the Linux file system hierarchy. All other directories and files are nested within it. It is represented by a single forward slash / and contains critical system directories like /bin, /etc, /home, and /var.

%%{init: {'theme': 'neutral', 'themeVariables': { 'fontFamily': 'monospace', "fontSize":"16px"}}}%%

flowchart TD
  Root(<code>/</code>)
  bin(<code>/bin</code>)
  etc(<code>/etc</code>)
  home(<code>/home</code>)
  usr(<code>/usr</code>)
  var(<code>/var</code>)
  tmp(<code>/tmp</code>)

  Root --> bin
  Root --> etc
  Root --> home
  Root --> usr
  Root --> var
  Root --> tmp

Everything is a file

Subdirectories

Beneath the root directory, the Linux file system is organized into various subdirectories, each serving specific purposes:

/bin – Contains essential user command binaries

/etc – Stores system-wide configuration files.

/home – Houses personal directories for each user.

/var – Contains variable data like logs and spool files.

In Linux systems, the ~ represents the user’s ‘home’ directory.

File paths

A file path is a character string specifying the unique location of a file or directory within the hierarchical file system. File paths can be absolute (starting from the root (/) directory) or relative (starting from the current (.) directory).

Navigate

pwd

pwd (print working directory) tells you exactly where you are in the filesystem.

pwd 
# /Users/username/projects/books/fm-linux

The output from pwd is the file path to our local working directory.

cd ~
pwd
# /Users/username

tree

The tree command will recursively print the directory structure of a file path in a ‘tree-like’ format, visually representing the hierarchy of files and directories.

Below is an example ‘folder tree’ for the current working directory returned from the pwd command above:

# /Users/ 
#   └─ username/ -> represented as '~'
#        └─ project/ 
#             └─ books/ 
#                 └─ fm-linux/

The output from pwd is an absolute file path, and absolute file paths do not change regardless of the current working directory.

cd

Move from one directory to another with cd (change directory). For example, cd data takes you to the data folder inside our current working directory.

cd data

In the command above, data is a relative file path. Relative file paths specify the location of a file relative to the current working directory.

If we view our current directory with tree after changing it to data, we see the current location listed as .:

cd data 
tree -d
# .
# └── raw
# 
# 2 directories

It’s important to notice the difference between absolute and relative paths, because it makes it easier to navigate the operating system and manipulate files and folders.

For example, can use a relative file path to view the report.txt file in the data folder with cat:

cd data
cat report.txt
# my important information

data/report.txt is the relative file path (i.e., relative to the current working directory), and it’s meaning is based on the directory from which it is referenced.

ls

ls (list) lists the files and folders in a given location. In /bin, ls would show you the software tools available:

cd /bin # change location
ls # what's in here?
# [
# bash
# cat
# chmod
# cp
# csh
# dash
# date
# dd
# df
# echo
# ed
# expr
# hostname
# kill
# ksh
# launchctl
# link
# ln
# ls
# mkdir
# mv
# pax
# ps
# pwd
# realpath
# rm
# rmdir
# sh
# sleep
# stty
# sync
# tcsh
# test
# unlink
# wait4path
# zsh

find

find can be used to locate files or directories using the -type and -name options. The example below looks in the current working directory (.) for a folder named data:

find . -type d -name data
# ./data

Manage

In the Linux world, file and directory management is a fundamental skill. This chapter introduces some common commands that will allow you to create, copy, move, remove, and link files and directories.

mkdir

mkdir (Make Directory) builds a new folder wherever you tell it to, like making a new folder in data for inputs (data/in) or outputs (data/out)

mkdir data/in
mkdir data/out

Confirm with tree -d (the -d is for directories):

tree data -d
# data
# ├── in
# └── out
# 
# 3 directories

cp

cp duplicates files or folders. The cp command is used to copy files or directories from one location to another. Imagine having a file (binary_data.tsv) on your root (.) directory that you want to copy to the /data/in folder; you could use cp to make a duplicate.

cp binary_data.tsv data/in/binary_data.tsv

Confirm with tree

tree data/in
# data/in
# └── binary_data.tsv
# 
# 1 directory, 1 file

mv

mv, short for move, moves files or directories from one location to another. We’ll use it to move data/binary_data.tsv to data/out/binary_data.tsv:

# move file
mv data/in/binary_data.tsv data/out/binary_data.tsv 

Confirm move with tree:1

tree data -P *.tsv
# data
# ├── in
# └── out
#     └── binary_data.tsv
# 
# 3 directories, 1 file

It can also be used for renaming files.

# rename file
mv data/out/binary_data.tsv  data/out/bin_dat.tsv 
# confirm rename 
tree data/out
# data/out
# └── bin_dat.tsv
# 
# 1 directory, 1 file

mv is especially useful for organizing files and directories that are in the wrong place.

rm

The rm command stands for remove and is used to delete files or directories.

# remove doc folder
rm data/out
# rm: data/out: is a directory

By default, it won’t remove a directory without the -R or -r option.

Warning

It’s important to note here that the command-line is not very forgiving. Using rm is a powerful action with significant consequences, as it permanently deletes files, akin to shredding documents. There’s usually no easy way to recover deleted files unless you have a backup.

Linux is like a chainsaw. Chainsaws are powerful tools, and make many difficult tasks like cutting through thick logs quite easy. Unfortunately, this power comes with danger: chainsaws can cut just as easily through your leg.’ - Gary Bernhardt2

# add option 
rm -R data/out

Recap

Given the principle that everything is a file in Linux, directories are treated as special types of files that contain references to other files and directories, facilitating the organization and management of data.

Understanding how to navigate, manipulate, and manage directories is essential for effective use of the Linux operating system.

See a typo, error, or something missing?

Please open an issue on GitHub.


  1. The -P *.tsv option for tree tells it to look in data for files or folders with a .tsv extension. We’ll cover wildcards and patterns in the Symbols & Patterns chapter.↩︎

  2. As quoted in Bioinformatics Data Skills: Reproducible and Robust Research with Open Source Tools (2015) by Vince Buffalo.↩︎