Lab 5: Deployments and Code Promotion

Published

2026-07-11

Comprehension questions

The questions below come from the Deployments and Code Promotion chapter.

Data Science Environments

Write down a mental map of the relationship between the three environments for data science. Include the following terms: Git Promote, CI/CD, automation, deployment, dev, test, prod.

The three environments for data science are: development, test, and production (not to be confused with the three layers of data science environments¹).

The figures from the chapter do an excellent job illustrating these environments, but I’ve added a mindmap:

%%{init: {'theme': 'base', 'themeVariables': {'fontFamily': 'monospace'}}}%%
mindmap
  root((Code<br>Promotion))
    Dev
      Code Writing
      Experimentation
      Git Promote
        Push to feature<br>branch
        Pull request<br>triggers CI/CD
    Test
      Automation
        Unit tests
        Integration<br>tests
        UAT
      CI/CD
        Validates<br>build
        Runs test<br>suite
    Prod
      Deployment
        Automated<br>release
        Version<br>tagged
      Monitoring
        Stability<br>checks
        Access control

Development Environments

The development environment (dev) is the primary location where data scientists write new code. It’s typically used for iteration and experimentation, and it’s connected to version control (Git).

Test Environments

The testing environment (test) mirrors the production environment and uses realistic (often anonymized) data. It is designed to validate the code and check its integration with other systems, and it typically hosts user acceptance testing and/or system tests.

Production Environments

The production environment (prod) is the ‘live’ environment that serves the end users. Access to it is strictly controlled, with a focus on stability and monitoring.

Git & Data Science

Why is Git so important to a good code promotion strategy?

In a Git-based workflow, code promotion is the process of marking Git commits as ‘ready for production,’ typically by tagging or merging them. This creates a reference point that can trigger tests (in the test branch) or a production deployment pipeline (in the prod branch). Without version control, changes go untracked and there is no canonical record of what was deployed. Git and code promotion together ensure consistency across all three environments (dev, test, and prod) and reduce human error in deployment.

Artifact immutability

With Git, every change is identified by a specific commit SHA or tagged release. When the prod branch is protected, changes must be approved before they can be merged into it. Promoting the same identified build across environments, rather than rebuilding it, is what makes the artifact immutable, and it ensures untested or unreviewed code never reaches production.
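To make the idea of immutability concrete, here is a minimal sketch that fingerprints a build artifact by its content, which is the same principle behind Git's commit SHAs. The helper names (`artifact_digest`, `verify_promotion`) and the artifact contents are hypothetical, and SHA-256 is used only as an example hash.

```python
import hashlib


def artifact_digest(content: bytes) -> str:
    """Return a content-derived fingerprint for a build artifact."""
    return hashlib.sha256(content).hexdigest()


def verify_promotion(tested_digest: str, candidate: bytes) -> bool:
    """Allow promotion only if the candidate is byte-for-byte what was tested."""
    return artifact_digest(candidate) == tested_digest


# Hypothetical artifact contents, standing in for a built model or package.
tested_build = b"model-v3 weights and code"
tested_digest = artifact_digest(tested_build)

# The identical build passes; a build patched after testing does not.
same_build = verify_promotion(tested_digest, b"model-v3 weights and code")
patched_build = verify_promotion(tested_digest, b"model-v3 weights and code + hotfix")
```

Any change to the artifact, however small, changes the digest, so a gate like this would reject a build that was modified after testing.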

Environment drift

Environment drift is when the production code differs from the tested code. Without a code promotion strategy and version control, the code would need to be rebuilt in each environment, so the tested code wouldn’t necessarily be the code deployed into production. In DevOps, environment drift is the enemy because it can lead to unpredictable behavior and makes debugging difficult.
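A simple way to picture drift detection is to compare the commit that each environment reports as deployed. The sketch below assumes each environment records a commit SHA at deploy time; the function name `find_drift` and the SHAs are made up for illustration.

```python
from __future__ import annotations


def find_drift(deployed: dict[str, str], reference_env: str = "test") -> list[str]:
    """Return the environments whose deployed commit differs from the reference."""
    reference_sha = deployed[reference_env]
    return [env for env, sha in deployed.items() if sha != reference_sha]


# Hypothetical short commit SHAs recorded at deploy time.
deployed = {"test": "a1b2c3d", "prod": "9f8e7d6"}

# prod is running a commit that was never validated in test.
drifted = find_drift(deployed)
```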

Environment parity

Git enables environment parity. When used properly, our three environments (dev, test, and prod) differ only by data and credentials. Changes are promoted, not re-created. When production code breaks, developers can use Git to roll back to a previous version (via commit SHAs or tagged releases), avoiding ad hoc changes directly in prod.
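The parity idea can be sketched as a configuration object in which only the release moves between environments, while data sources and credentials stay where they are. Everything here (`EnvConfig`, `promote`, `roll_back`, the release tags, and the secret names) is a hypothetical illustration, not a real deployment tool.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class EnvConfig:
    name: str
    release: str          # Git tag or commit SHA of the promoted code
    data_source: str      # differs per environment
    credentials_ref: str  # name of a secret, never the secret itself


def promote(source: EnvConfig, target: EnvConfig) -> EnvConfig:
    """Copy only the release to the target; data and credentials stay put."""
    return replace(target, release=source.release)


def roll_back(env: EnvConfig, previous_release: str) -> EnvConfig:
    """Point the environment back at an earlier tagged release."""
    return replace(env, release=previous_release)


env_test = EnvConfig("test", "v1.1", "anonymized_snapshot", "TEST_DB_SECRET")
env_prod = EnvConfig("prod", "v1.0", "live_database", "PROD_DB_SECRET")

prod_after = promote(env_test, env_prod)       # prod now runs v1.1
prod_restored = roll_back(prod_after, "v1.0")  # back to the last good release
```

Because the configuration is frozen, a promotion or rollback produces a new record instead of editing prod in place, which mirrors the "promoted, not re-created" principle.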

Can you have a code promotion strategy without Git?

Code promotion can be done without Git, but you would need to find a way to:

Create an ‘immutable versioned artifact’ (some way of tracking who did what, and when)
Set up blocks and gates requiring tests and approvals before code is promoted
Build a promotion mechanism to copy, move, or retag each versioned artifact between environments

You would also need substitutes for:

A way to view and review diffs, either through code review or document-style review
The ability to keep previous artifacts and switch between environments
Some means of traceability connecting artifacts, manifests, approvals, and build logs
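The requirements above can be sketched as a small promotion log: a frozen artifact record (who, what, when), a gate that refuses promotion without passing tests and an approver, and a history list for traceability. This is only an assumption about what a Git-free substitute might look like; `Artifact`, `PromotionLog`, and the example names are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Artifact:
    """An immutable, versioned record of who built what, and when."""
    version: str
    author: str
    created_at: datetime


@dataclass
class PromotionLog:
    environments: dict[str, Artifact] = field(default_factory=dict)
    history: list[tuple[str, str, str]] = field(default_factory=list)

    def promote(self, artifact: Artifact, target: str,
                tests_passed: bool, approver: Optional[str]) -> bool:
        """Gate: refuse promotion unless tests passed and someone approved."""
        if not tests_passed or approver is None:
            return False
        self.environments[target] = artifact
        # Traceability: record which version went where, and who approved it.
        self.history.append((artifact.version, target, approver))
        return True


log = PromotionLog()
build = Artifact("2024.1", "analyst_a", datetime.now(timezone.utc))

blocked = log.promote(build, "test", tests_passed=False, approver="reviewer_b")
accepted = log.promote(build, "test", tests_passed=True, approver="reviewer_b")
```

Even this toy version shows how much Git and its hosting platforms provide for free: diffs, review, and rollback would all still need to be built.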

Git & CI/CD

What is the relationship between Git and CI/CD?

A CI/CD pipeline is an automated process that builds, tests, and deploys code. In CI/CD systems, the code promotion workflow is defined by Git events rather than by the pipeline itself: the pipeline simply reacts to those events, while the promotion logic lives in the branches, tags, and merge policies. This enables a “build once, promote many times” strategy in which the same tested artifact moves across environments.

Git provides the code change and approval history, with pipeline results tied to specific commits. Each release is tagged, making it straightforward to track what changed and when. That traceability is critical for debugging and accountability. Git also serves as the gatekeeper for CI/CD: policies such as protected branches and required approvals decide which changes are allowed to reach a deployment.

Using Git turns CI/CD into a system, not a script.
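The split between "Git decides, the pipeline reacts" can be sketched as a lookup from Git events to pipeline stages. The policy table, event names, and stage names below are hypothetical and do not correspond to any particular CI system's API.

```python
from __future__ import annotations

# Hypothetical policy: which pipeline stages each Git event triggers.
# Keys are (event, branch) pairs; the names are illustrative only.
PIPELINE_POLICY = {
    ("push", "dev"): ["build", "unit_tests"],
    ("pull_request", "test"): ["build", "unit_tests", "integration_tests"],
    ("merge", "main"): ["deploy"],
}


def stages_for(event: str, branch: str) -> list[str]:
    """Return the stages to run; events without a policy entry trigger nothing."""
    return PIPELINE_POLICY.get((event, branch), [])


pr_stages = stages_for("pull_request", "test")
ignored = stages_for("push", "scratch")  # no policy entry, so nothing runs
```

The pipeline function holds no promotion logic of its own: changing what gets promoted means changing the policy tied to branches and events, not rewriting the pipeline.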

%%{init: {'theme': 'base', 'themeVariables': {'fontFamily': 'monospace'}}}%%
flowchart LR
    GitEv("Git Event<br>(push, PR, merge)") -->|"triggers"| CICD("CI/CD Pipeline")
    CICD --> CI("Continuous<br>Integration (CI)<br><em>Build & Test</em>")
    CICD --> CD("Continuous<br>Delivery (CD)<br><em>Deploy</em>")
    style GitEv fill:#5B8C5A,stroke:#000000,stroke-width:1px,color:#ffffff
    style CICD fill:#D2562B,stroke:#000000,stroke-width:1px,color:#ffffff
    style CI fill:#2A6F77,stroke:#000000,stroke-width:1px,color:#ffffff
    style CD fill:#2A6F77,stroke:#000000,stroke-width:1px,color:#ffffff

Think of Git as the source of truth and CI/CD as an automated response to changes in that truth. Git events (pushes, pull requests, merges) can fire a CI/CD pipeline.

Key distinctions between Git and CI/CD:

Aspect	Git	CI/CD
Role	Tracks & versions code	Automates build, test, deploy
Trigger	Human action (commit, merge)	Git events
Output	A versioned history	A tested, deployed artifact

What’s the benefit of using Git and CI/CD together?

Without Git, CI/CD has no reliable source to build from. Without CI/CD, Git changes require manual promotion between environments, which is slow and error-prone.

%%{init: {'theme': 'base', 'themeVariables': {'fontFamily': 'monospace'}}}%%

gitGraph
   commit id: "Initial codebase"

   branch dev
   checkout dev
   commit id: "Experiment: model v1"
   commit id: "Iterate: model v2"
   commit id: "Iterate: model v3"

   branch test
   checkout test
   commit id: "Deploy to Test env"
   commit id: "Integration checks"
   commit id: "System tests"
   commit id: "UAT approved"

   checkout main
   merge test id: "Merge to main"
   commit id: "Tag release v1.0"

   branch prod
   checkout prod
   commit id: "Deploy to Prod"
   commit id: "Monitoring active"

In the gitGraph above:

dev is where all experimentation happens and never touches main directly
test branches from dev once code is stable enough to validate, keeping conditions isolated from prod
main only receives code that has passed full UAT and system testing, acting as the gatekeeper
prod branches from main after a release is tagged, representing exactly what is live at any given time and making rollbacks straightforward
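The branch flow described above amounts to a one-way path, which can be sketched as a check that code only ever moves to the next branch in line. The constant `PROMOTION_PATH` and the function `can_promote` are hypothetical names for illustration; real enforcement would normally come from branch protection rules on the Git host.

```python
# Order of branches in the promotion flow described above.
PROMOTION_PATH = ["dev", "test", "main", "prod"]


def can_promote(source: str, target: str) -> bool:
    """A branch may only be promoted to the next branch in the path."""
    if source not in PROMOTION_PATH or target not in PROMOTION_PATH:
        return False
    return PROMOTION_PATH.index(target) - PROMOTION_PATH.index(source) == 1


dev_to_test = can_promote("dev", "test")  # allowed: the next step in the path
dev_to_main = can_promote("dev", "main")  # blocked: dev never touches main directly
```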

Git, GitHub, CI/CD, Actions, & Version Control

Write out a mental map of the relationship of the following terms: Git, GitHub, CI/CD, GitHub Actions, Version Control.

%%{init: {'theme': 'base', 'themeVariables': {'fontFamily': 'monospace'}}}%%

mindmap
  root((Code<br>Management))
    <strong>Version Control</strong>
      <em>Tracks changes<br>over time</em>
      <em>Enables<br>rollback</em>
      <em>Supports<br>collaboration</em>
        <strong>Git</strong>
          <em>Local<br>repositories</em>
          <em>Branching<br>and merging</em>
          <em>Commit history</em>
            <strong>GitHub</strong>
              <em>Remote<br>repositories</em>
              <em>Pull<br>requests</em>
              <em>Code<br>review</em>
                <strong>GitHub Actions</strong>
                  <em>Workflow<br>files</em>
                  <em>Triggered by<br>Git events</em>
                    <strong>CICD</strong>
                      <em>Continuous<br>Integration</em>
                        <em>Automated<br>testing</em>
                        <em>Build<br>validation</em>
                      <em>Continuous<br>Delivery</em>
                        <em>Automated<br>deployment</em>
                        <em>Environment<br>promotion</em>

¹ See Environments Have Layers in the do4ds book.