Lab 5: Deployments and Code Promotion

Published

2026-01-12

Caution

This section is being revised. Thank you for your patience.

Comprehension questions

The questions below come from the Deployments and Code Promotion chapter.

Data Science Environments

Write down a mental map of the relationship between the three environments for data science. Include the following terms: Git Promote, CI/CD, automation, deployment, dev, test, prod.

The figures from the chapter do an excellent job illustrating these environments, but I’ve added a sequence diagram:

%%{init: {'theme': 'neutral', 'look': 'handDrawn', 'themeVariables': { 'fontFamily': 'monospace', "fontSize":"18px"}}}%%


sequenceDiagram
    participant Dev as Dev
    participant Git as Git
    participant CI as CI/CD<br>Pipeline
    participant Test as Test
    participant Prod as Prod
    
    Dev->>Dev: Write code & test locally
    Dev->>Git: Commit & push changes
    Git->>CI: Trigger automation
    CI->>CI: Run tests & checks
    CI->>Test: Deploy to Test environment
    Test->>Test: Integration testing
    Test->>Test: User acceptance testing
    
    Note over Test,CI: Manual approval or<br/>automated promotion
    
    Test->>Git: Tag release (Git Promote)
    Git->>CI: Trigger production deployment
    CI->>CI: Build & validate
    CI->>Prod: Deploy to Production
    Prod->>Prod: Monitor & validate
    
    Note over Prod: Rollback<br>if needed

Data Science Environments

DEV Environments

The development environment (Dev) is the primary location where data scientists write and test their code. It’s typically used for iteration and experimentation, and it’s connected to version control (Git).

Test Environments

The testing environment (Test) is a mirror of the production environment with realistic data (often anonymized). Test environments are integrated and validated with other systems, and user acceptance testing occurs here.

Production Environments

The ‘live’ environment that serves the end users is the production environment (Prod). Prod has strictly controlled access with a focus on stability and monitoring.

%%{init: {'theme': 'neutral', 'look': 'handDrawn', 'themeVariables': { 'fontFamily': 'monospace', "fontSize":"18px"}}}%%

flowchart TB
    Dev[Development Environment]
    Git[Git Repository]
    CI[CI: Continuous Integration]
    Test[Test Environment]
    Promote[Git Promote]
    CD[CD: Continuous Deployment]
    Prod[Production Environment]
    
    Dev -->|commit & push| Git
    Git -->|webhook triggers| CI
    CI -->|tests pass| Test
    
    Test -->|validation complete| Promote
    Promote -->|create tag v1.0| Git
    Git -->|tag triggers| CD
    CD -->|deploy| Prod
    
    Prod -.->|rollback needed| Git
    Git -.->|deploy previous tag| CD
    
    style Dev fill:#e3f2fd,stroke:#1976d2,stroke-width:3px
    style Test fill:#fff3e0,stroke:#f57c00,stroke-width:3px
    style Prod fill:#e8f5e9,stroke:#388e3c,stroke-width:3px
    style Git fill:#fce4ec,stroke:#c2185b,stroke-width:2px
    style CI fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px
    style CD fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px
    style Promote fill:#fff9c4,stroke:#f9a825,stroke-width:3px

Data Science Environments

Git promotion is the process of tagging a commit as ready for production. The tag creates an immutable reference point and triggers the production deployment pipeline. The CI/CD pipeline is the automated process that builds, tests, and deploys the code.

This ensures consistency across environments and reduces human error in deployment. CI/CD automates testing at each stage, deployment between environments, rollback on failures, and monitoring and alerting.
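The trigger rules described above can be sketched as a small routing function. This is only an illustration, not a real CI/CD API: the `GitEvent` type, the `route_event` function, the `main` branch name, and the `v1.0`-style tag convention are all assumptions made for this example.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed convention: release tags look like v1.0 or v1.4.3.
RELEASE_TAG = re.compile(r"^v\d+(\.\d+)+$")


@dataclass(frozen=True)
class GitEvent:
    """A simplified Git event, as a pipeline might receive it from a webhook."""
    kind: str  # "push" or "tag"
    ref: str   # branch name for a push, tag name for a tag


def route_event(event: GitEvent, integration_branch: str = "main") -> Optional[str]:
    """Return the environment this event should deploy to, or None."""
    if event.kind == "push" and event.ref == integration_branch:
        return "test"  # a push to the integration branch deploys to Test
    if event.kind == "tag" and RELEASE_TAG.match(event.ref):
        return "prod"  # Git Promote: a release tag triggers the Prod deploy
    return None        # any other event triggers no deployment
```

In this sketch the pipeline holds no promotion logic of its own: it only reacts to what happened in Git, so promoting to production means creating a tag rather than editing a script.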

Git & Data Science

Why is Git so important to a good code promotion strategy?

There is a canonical DevOps rule of thumb:

If a change isn’t represented as a Git commit, it didn’t happen.

In mature CI-CD systems, Git, not the pipeline, defines the promotion workflow. For example, suppose we have the following Git branching model:

feature: develop features and write unit tests
test: write integration tests
main: production deployment

Changes are committed to feature* → merged to test → merged to main
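One way to picture this promotion path is as a rule that a merge may only move code one stage forward. The sketch below is a hypothetical check, not a feature of Git itself: `PROMOTION_PATH`, `branch_stage`, and `is_valid_promotion` are names invented for this example, and in practice a rule like this would be enforced through branch protection settings on the Git host.

```python
# Assumed promotion order, taken from the branching model above.
PROMOTION_PATH = ["feature", "test", "main"]


def branch_stage(branch: str) -> str:
    """Map a branch name to its stage; feature branches may carry a suffix."""
    if branch.startswith("feature"):
        return "feature"
    if branch in ("test", "main"):
        return branch
    raise ValueError(f"unknown branch: {branch}")


def is_valid_promotion(source: str, target: str) -> bool:
    """A merge is a valid promotion only if it moves exactly one stage forward."""
    src = PROMOTION_PATH.index(branch_stage(source))
    dst = PROMOTION_PATH.index(branch_stage(target))
    return dst == src + 1
```

For example, `is_valid_promotion("feature/login", "test")` is `True`, while merging a feature branch straight into `main` is rejected because it skips the test stage.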

%%{init: {'theme': 'neutral', 'look': 'handDrawn', 'themeVariables': { 'fontFamily': 'monospace', "fontSize":"18px"}}}%%

gitGraph
    commit id: "Initial commit"

    branch test
    checkout test
    commit id: "Set up tests"

    branch feature
    checkout feature
    commit id: "Develop feature"
    commit id: "Unit tests"

    checkout test
    merge feature id: "Merge feature"
    commit id: "Integration tests"

    checkout main
    merge test id: "Release"

Git branches

The pipeline simply reacts to Git events. The code promotion logic lives in the branches, tags, and merge policies, not in brittle pipeline scripts.

A core CI-CD principle:

Build once, promote many times.

All commits are signed, so we know who made the changes, and the main branch requires approval to merge (i.e., it’s protected). Artifacts stay immutable because each build is tied to a specific commit SHA or a tagged release (like v1.4.3).

The same code is promoted across environments so we can be sure untested code never reaches production.
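A minimal sketch of "build once, promote many times", assuming an artifact can be represented by a commit SHA plus a content digest. The `Artifact`, `build`, and `Environments` names, the `dev`/`test`/`prod` ordering, and the example SHA are all hypothetical, and a real pipeline would store artifacts in a registry rather than in memory.

```python
import hashlib
from dataclasses import dataclass, field

# Assumed promotion order; the names mirror the environments described above.
ENV_ORDER = ["dev", "test", "prod"]


@dataclass(frozen=True)
class Artifact:
    """An immutable build result tied to one commit."""
    commit_sha: str
    digest: str


def build(commit_sha: str, source: bytes) -> Artifact:
    """Build once: the digest fingerprints exactly what was built."""
    return Artifact(commit_sha, hashlib.sha256(source).hexdigest())


@dataclass
class Environments:
    """Tracks which artifact is currently deployed in each environment."""
    deployed: dict = field(default_factory=dict)

    def promote(self, artifact: Artifact, env: str) -> None:
        """Deploy to env only if the same artifact already runs one stage earlier."""
        stage = ENV_ORDER.index(env)
        if stage > 0 and self.deployed.get(ENV_ORDER[stage - 1]) != artifact:
            raise RuntimeError(
                f"{artifact.commit_sha} has not passed {ENV_ORDER[stage - 1]}"
            )
        self.deployed[env] = artifact


# Usage: one build, three promotions of the very same object.
artifact = build("abc123", b"model code")
envs = Environments()
for env in ENV_ORDER:
    envs.promote(artifact, env)
```

Because `promote` never rebuilds anything, the digest running in prod is by construction the digest that was tested, which is the guarantee the principle is after.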

Without Git, the code would need to be rebuilt in each environment. The tested code wouldn’t necessarily be the deployed code, and environment drift would be inevitable.

Git is the gatekeeper for CI-CD policies and integrates directly with enforcement mechanisms.

If it’s not enforced in Git, it will be bypassed.

Git enables environment parity. In DevOps, environment drift is the enemy. Dev, test, and prod should differ only by data and credentials, and changes are promoted, not re-created.
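A parity rule like this can be checked mechanically. The sketch below compares two environment configurations and reports any setting that differs outside an allow-list. The key names (`database_url`, `api_token`, `python_version`, `model_version`) and the `drifted_keys` function are assumptions made for illustration only.

```python
# Assumed allow-list: only data locations and credentials may differ.
ALLOWED_DIFFERENCES = {"database_url", "api_token"}

_MISSING = object()  # sentinel so a missing key never equals a real value


def drifted_keys(env_a: dict, env_b: dict) -> set:
    """Return settings that differ between two environments but should not."""
    all_keys = env_a.keys() | env_b.keys()
    differing = {
        key for key in all_keys
        if env_a.get(key, _MISSING) != env_b.get(key, _MISSING)
    }
    return differing - ALLOWED_DIFFERENCES


test_cfg = {"python_version": "3.12", "model_version": "v1.4.3",
            "database_url": "postgres://test-db", "api_token": "test-token"}
prod_cfg = {"python_version": "3.12", "model_version": "v1.4.3",
            "database_url": "postgres://prod-db", "api_token": "prod-token"}

drift = drifted_keys(test_cfg, prod_cfg)  # empty set: only allowed keys differ
```

An empty result means the two environments are in parity. Any key that does show up, such as a mismatched `python_version`, is drift that was introduced outside the promotion process.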

From a DevOps audit perspective, Git provides the code change and approval history, along with the pipeline results tied to each commit. Each release is also tagged, which makes it easy to answer the inevitable question:

“Which commit introduced this behavior in production?”

When production inevitably breaks, we can use Git to roll back by redeploying a previous commit or tag. This avoids making ad-hoc changes on a production server.
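Choosing what to redeploy can be as simple as picking the release tag just before the current one. The sketch below assumes tags follow a `v<major>.<minor>.<patch>` convention like `v1.4.3`. The `parse_version` and `rollback_target` helpers are hypothetical, and the tag list would normally come from the repository (for instance, the output of `git tag`).

```python
def parse_version(tag: str) -> tuple:
    """Turn 'v1.4.3' into (1, 4, 3) so versions sort numerically."""
    number = tag[1:] if tag.startswith("v") else tag
    return tuple(int(part) for part in number.split("."))


def rollback_target(tags: list, current: str) -> str:
    """Return the release tag immediately before the current one."""
    ordered = sorted(tags, key=parse_version)
    position = ordered.index(current)
    if position == 0:
        raise ValueError("no earlier release to roll back to")
    return ordered[position - 1]


releases = ["v1.4.3", "v1.10.0", "v1.4.2"]
previous = rollback_target(releases, "v1.10.0")  # "v1.4.3"
```

Parsing the numbers matters here: a plain string sort would place `v1.10.0` before `v1.4.2` and pick the wrong rollback target.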

Git promotion mechanics are standardized, so pipelines become reusable templates. This turns CI-CD into a system, not a script.

%%{init: {'theme': 'neutral', 'look': 'handDrawn', 'themeVariables': { 'fontFamily': 'monospace', "fontSize":"18px"}}}%%

flowchart TB
  Feat("<code>feature/*</code>") -->|<em>Pull request</em>| Dev("<code>develop</code>")
  Dev -->|<em>Release pull request</em>| Main("<code>main</code>")
  Main -->|<em>Deploy</em>| Prod["<strong>production</strong>"]
  Main -->|<em>Promote</em>| St("<code>staging</code>")
  St -->|<em>Approve</em>| Prod

Typical CI-CD promotion pattern

Git is like a state machine, controlling the flow of changes.

With Git-centered promotion:

Git = intent
Pipeline = execution
Artifact = immutable
Promotion = merge or tag

Can you have a code promotion strategy without Git?

Git & CI/CD

What is the relationship between Git and CI/CD?

What’s the benefit of using Git and CI/CD together?

Git, GitHub, CI/CD, Actions, & Version Control

Write out a mental map of the relationship of the following terms: Git, GitHub, CI/CD, GitHub Actions, Version Control.