The lab covers two apps:

1. R Shiny app
2. Python Shiny app

The original files in do4ds/_labs/lab4 are below:

fs::dir_tree(path = "_labs/lab4/")
## _labs/lab4/
## ├── app-log.R
## └── app-log.py

Below are my answers to the comprehension questions in this chapter.
What is the difference between monitoring and logging? What are the two halves of the monitoring and logging process?
Logging is a record of “What happened”: logs capture discrete events and detailed narratives of what occurred in the system at specific points in time.
Monitoring is a display of “How is it performing”: it tracks quantitative metrics and system health indicators over time to understand trends and current state.
The two halves of the process are emitting (generating logs and metrics from our code) and aggregating/consuming them (collecting, storing, and monitoring, which is typically handled by the organization’s infrastructure).
%%{init: {'theme': 'neutral', 'look': 'handDrawn', 'themeVariables': { 'fontFamily': 'monospace'}}}%%
graph LR
subgraph "LOG: events"
L1(["Discrete<br>Events"])
L2(["Detailed<br>Context"])
L3(["Historical<br>Investigation"])
L4(["Text-Based<br>Messages"])
L5("<strong>When/What/Who/Why</strong>")
end
subgraph "MONITOR: metrics"
M1(["Quantitative<br>Metrics"])
M2(["Real-time<br>Status"])
M3(["Trend<br>Analysis"])
M4(["Numerical<br>Data"])
M5("<strong>How Much/How Fast/<br>How Often</strong>")
end
subgraph "Common Use Cases"
LC(["Root Cause<br>Analysis"])
LC2(["Debugging Specific<br>Issues"])
LC3(["Compliance<br>Auditing"])
LC4(["Security<br>Investigation"])
MC(["Performance<br>Optimization"])
MC2(["Capacity<br>Planning"])
MC3(["SLA Monitoring"])
MC4(["Real-time<br>Alerting"])
end
L1 --> LC
L2 --> LC2
L3 --> LC3
L4 --> LC4
M1 --> MC
M2 --> MC2
M3 --> MC3
M4 --> MC4
classDef logging fill:#d8e4ff,stroke:#333,stroke-width:2px,rx:6,ry:6,text-align:center,font-size:13px
classDef monitoring fill:#d8e4ff,stroke:#333,stroke-width:2px,rx:6,ry:6,text-align:center,font-size:13px
classDef usecase fill:#31e981,stroke:#333,stroke-width:1px,rx:6,ry:6,text-align:center,font-size:13px
class L1,L2,L3,L4 logging
class M1,M2,M3,M4 monitoring
class LC,LC2,LC3,LC4,MC,MC2,MC3,MC4 usecase
| Aspect | Logging | Monitoring |
|---|---|---|
| Data Type | Discrete events, text messages | Metrics, counters, gauges |
| When Used | After something happens | Continuously |
| Purpose | Debugging, investigation | Performance tracking, alerting |
| Format | Structured/unstructured text | Numerical time-series data |
| Storage | Log files, log databases | Time-series databases |
| Analysis | Search, filter, correlate | Aggregate, trend, threshold |
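To make the distinction concrete, here is a minimal Python sketch that emits both: a log line for each discrete event, and metrics that aggregate across events. This assumes the third-party prometheus_client package is available; the metric names and port are illustrative.

```python
import logging
import time

from prometheus_client import Counter, Histogram, start_http_server

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("app")

# Metrics aggregate across events: "how much / how fast / how often"
REQUESTS = Counter("app_requests_total", "Total requests handled")
LATENCY = Histogram("app_request_seconds", "Request latency in seconds")

def handle_request(user_id: str) -> None:
    start = time.perf_counter()
    # ... do the actual work here ...
    elapsed = time.perf_counter() - start

    # A log line records one discrete event: "when / what / who"
    logger.info("handled request for user=%s in %.3fs", user_id, elapsed)

    # Metrics record only numbers, to be graphed and alerted on later
    REQUESTS.inc()
    LATENCY.observe(elapsed)

if __name__ == "__main__":
    start_http_server(9100)  # exposes /metrics for a scraper to collect
    handle_request("alice")
```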
Logging is generally good, but what are some things you should be careful not to log?
Critical data we should make sure not to log includes:

- Credentials and secrets: passwords, API keys, tokens, connection strings
- Personally identifiable information (PII): names, email addresses, government IDs
- Regulated or protected data: health records, financial account details, payment card numbers

Industry-specific guidelines (e.g., HIPAA for health data, PCI DSS for payment cards, GDPR for personal data in the EU) spell out additional logging restrictions.
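One way to enforce this in code is to scrub sensitive fields before they ever reach the log. A minimal sketch using Python’s standard logging.Filter; the key names and regex are illustrative assumptions:

```python
import logging
import re

# Matches things like "password=hunter2" or "api_key: abc123"
SENSITIVE = re.compile(r"(password|api_key|token)\s*[=:]\s*\S+", re.IGNORECASE)

class RedactingFilter(logging.Filter):
    """Replace sensitive key=value pairs in messages with a placeholder."""
    def filter(self, record: logging.LogRecord) -> bool:
        record.msg = SENSITIVE.sub(r"\1=[REDACTED]", str(record.msg))
        return True  # keep the (now scrubbed) record

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("app")
logger.addFilter(RedactingFilter())

logger.info("login attempt with password=hunter2")
# -> INFO:app:login attempt with password=[REDACTED]
```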
At what level would you log each of the following events:
The answers below are provided for the R logger package and Python’s logging facility.
In R, the logger package has the following levels:
%%{init: {'theme': 'neutral', 'look': 'handDrawn', 'themeVariables': { 'fontFamily': 'monospace'}}}%%
graph TD
LogLevels(["Logging Levels<br/>Least -> Most Severe"]) --> TRACE("TRACE")
TRACE --> DEBUG("DEBUG")
DEBUG --> INFO("INFO")
INFO --> SUCCESS("SUCCESS")
SUCCESS --> WARN("WARN")
WARN --> ERROR("ERROR")
ERROR --> FATAL("FATAL")
TRACE -.Fine-grained details.-> TraceEx[Variable states<br/>Loop iterations]
DEBUG -.Diagnostic information.-> DebugEx[Function entry/exit<br/>Parameter values]
INFO -.Informational messages.-> InfoEx[Process started<br/>Configuration loaded]
SUCCESS -.Positive confirmations.-> SuccessEx[Task completed<br/>Data saved successfully]
WARN -.Potential issues.-> WarnEx[Deprecated functions<br/>Missing optional params]
ERROR -.Recoverable errors.-> ErrorEx[Failed validation<br/>File not found]
FATAL -.Critical failures.-> FatalEx[Cannot proceed<br/>System unrecoverable]
style LogLevels fill:#d8e4ff
In Python, the logging facility has the following log levels:
%%{init: {'theme': 'neutral', 'look': 'handDrawn', 'themeVariables': { 'fontFamily': 'monospace'}}}%%
graph TD
A(["Logging Levels<br/>Least -> Most Severe"]) --> B("DEBUG")
B --> C("INFO")
C --> D("WARNING")
D --> E("ERROR")
E --> F("CRITICAL")
B -.Verbose details.-> B1[Function calls<br/>Variable values]
C -.Normal events.-> C1[Operations succeeded<br/>State changes]
D -.Unexpected but handled.-> D1[Slow responses<br/>Retries needed]
E -.Failures.-> E1[Operations failed<br/>Connections refused]
F -.System failure.-> F1[Cannot continue<br/>Data corruption]
style A fill:#d8e4ff
Someone clicks on a particular tab in your Shiny app.
R logger: TRACE or DEBUG
Tab clicks are fine-grained UI interaction details, so we could use TRACE if we want extremely detailed user behavior tracking, or DEBUG if we only need this during development/troubleshooting. In production, we typically wouldn’t log every tab click unless tracking user flow.
Python logging: DEBUG
Similar reasoning: this diagnostic information about user interactions is primarily useful during development (or when debugging specific user behavior issues).
In most production apps, we’d disable TRACE/DEBUG levels and only enable them when investigating specific issues to avoid log bloat.
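For example, a tab-click handler could log at DEBUG so the message only appears when the level is lowered during an investigation. A sketch; the logger name and event are assumptions:

```python
import logging

logging.basicConfig(level=logging.INFO)  # production default: DEBUG is silent
logger = logging.getLogger("app.ui")

def on_tab_click(tab: str) -> None:
    logger.debug("tab selected: %s", tab)

on_tab_click("Data")             # nothing is emitted at INFO
logger.setLevel(logging.DEBUG)   # lowered while troubleshooting
on_tab_click("Data")             # now the click is logged
```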
Someone puts an invalid entry into a text entry box.
R logger: WARN
The user provided invalid input, but the app can handle it gracefully (show validation message). This is a potential issue worth noting but doesn’t break functionality.
Python logging: WARNING
Same reasoning - unexpected but handled situation. We’d want to track validation failures for UX improvements without treating them as errors.
Some developers use INFO for validation failures if they’re expected and very common. Use WARN if you want to monitor frequency or patterns of invalid inputs.
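If we do log validation failures, a good habit is to record which field failed but not the raw input (which could contain sensitive data). A sketch using the lab app’s bill length field; the validation rule is an assumption:

```python
import logging

logger = logging.getLogger("app.validation")

def validate_bill_length(raw: str) -> float | None:
    """Return the parsed value, or None (with a WARNING) if invalid."""
    try:
        value = float(raw)
    except ValueError:
        logger.warning("invalid entry in field=bill_length (non-numeric)")
        return None
    if not 30 <= value <= 60:
        logger.warning("out-of-range entry in field=bill_length")
        return None
    return value
```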
An HTTP call your app makes to an external API fails.
R logger: ERROR
The API call failed, which means our app cannot complete an intended operation. Even if we have fallback logic, the primary operation failed and needs attention.
Python logging: ERROR
Same reasoning - this is a genuine failure that prevents the app from functioning as designed.
Nuances:
- ERROR if this breaks core functionality
- WARN if we have robust fallback mechanisms and the app continues normally
- FATAL only if the API is so critical that the entire app becomes unusable without it

Include the HTTP status code, error message, and endpoint in the log:

log_error("API call failed: {endpoint} returned {status_code} - {error_msg}")
The numeric values that are going into your computational function.

R logger: TRACE
These are extremely granular details about data flowing through our functions, so we’d use TRACE for this level of variable inspection. We might use DEBUG if these values are critical decision points.
Python logging: DEBUG
Python doesn’t have TRACE, so DEBUG is appropriate for detailed variable state information used in troubleshooting computational logic.
When to log this: typically only during development or targeted troubleshooting. Keep these statements at TRACE/DEBUG so they stay silent under the production log level and don’t bloat the logs.
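One lightweight way to capture a function’s inputs without littering it with print statements is a small decorator that logs arguments at DEBUG. A sketch; the function below is a hypothetical stand-in for the lab’s prediction logic:

```python
import functools
import logging

logger = logging.getLogger("app.compute")

def log_inputs(func):
    """Log a function's arguments at DEBUG before calling it."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        logger.debug("%s called with args=%r kwargs=%r",
                     func.__name__, args, kwargs)
        return func(*args, **kwargs)
    return wrapper

@log_inputs
def predicted_mass(bill_length_mm: float, is_gentoo: bool) -> float:
    # stand-in for the real model call
    return 2000 + 50 * bill_length_mm + (1000 if is_gentoo else 0)
```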
In summary:

| Event | logger (R) | logging (Python) | Key Factor |
|---|---|---|---|
| Tab click | TRACE/DEBUG | DEBUG | Fine-grained UI interaction |
| Invalid input | WARN | WARNING | Expected, handled issue |
| API failure | ERROR | ERROR | Operation failed |
| Numeric values | TRACE | DEBUG | Variable state inspection |
General Rules of Thumb:
- TRACE/DEBUG = “I need to see what’s happening inside”
- INFO/SUCCESS = “Normal operations worth recording”
- WARN/WARNING = “Something unexpected but manageable”
- ERROR = “Something broke but app continues”
- FATAL/CRITICAL = “Cannot continue operating”
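These thresholds are exactly what the configured log level enforces. A quick Python sketch of which messages survive a WARNING threshold:

```python
import logging

logging.basicConfig(level=logging.WARNING)  # production-style threshold

logging.debug("I need to see what's happening inside")   # suppressed
logging.info("Normal operations worth recording")        # suppressed
logging.warning("Something unexpected but manageable")   # emitted
logging.error("Something broke but app continues")       # emitted
logging.critical("Cannot continue operating")            # emitted
```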
We can’t prevent code failures (or predict when they’ll happen), but we can create observable systems that 1) accept failure as inevitable (rather than trying to prevent it), 2) implement comprehensive monitoring and logging, and 3) enable rapid diagnosis (i.e., use logs and metrics to reconstruct what failed and why).
To do this, data scientists and developers need to generate meaningful logs and metrics from their code, and system admins need to ensure new data science work can be easily integrated into the organizational infrastructure (i.e., into the existing log aggregation and monitoring tools).
%%{init: {'theme': 'neutral', 'look': 'handDrawn', 'themeVariables': { 'fontFamily': 'monospace'}}}%%
graph LR
subgraph DatSci["Data Scientist"]
A(["Generate Logs"])
B(["Emit Metrics"])
end
subgraph SysAdmin["System Admins"]
C("Collect &<br>Aggregate")
D("Store &<br>Process")
E("Monitor &<br>Alert")
end
subgraph Out["Outcomes"]
F(["Detect<br>Issues"])
G(["Diagnose<br>Problems"])
H(["Rapid<br>Recovery"])
end
A ---> C
B ---> C
C --> D
D --> E
E --> F & G & H
classDef ds fill:#d8e4ff,stroke:#000,stroke-width:1px
class A,B ds
classDef outcome fill:#31e981,stroke:#000,stroke-width:1px
class F,G,H outcome
This will give us:
- Operational peace of mind (the ability to detect and resolve issues quickly)
- Value demonstration (showing decision makers who uses the work and how)
- Team validation (proving the team’s impact and justifying resources)
library(shiny)
api_url <- "http://127.0.0.1:8080/predict"
log <- log4r::logger()
ui <- fluidPage(
titlePanel("Penguin Mass Predictor"),
# Model input values
sidebarLayout(
sidebarPanel(
sliderInput(
"bill_length",
"Bill Length (mm)",
min = 30,
max = 60,
value = 45,
step = 0.1
),
selectInput(
"sex",
"Sex",
c("Male", "Female")
),
selectInput(
"species",
"Species",
c("Adelie", "Chinstrap", "Gentoo")
),
# Get model predictions
actionButton(
"predict",
"Predict"
)
),
mainPanel(
h2("Penguin Parameters"),
verbatimTextOutput("vals"),
h2("Predicted Penguin Mass (g)"),
textOutput("pred")
)
)
)
server <- function(input, output) {
log4r::info(log, "App Started")
# Input params
vals <- reactive(
list(
bill_length_mm = input$bill_length,
species_Chinstrap = input$species == "Chinstrap",
species_Gentoo = input$species == "Gentoo",
sex_male = input$sex == "Male"
)
)
# Fetch prediction from API
pred <- eventReactive(
input$predict,
{
log4r::info(log, "Prediction Requested")
r <- httr2::request(api_url) |>
httr2::req_body_json(vals()) |>
httr2::req_perform()
log4r::info(log, "Prediction Returned")
if (httr2::resp_is_error(r)) {
log4r::error(log, paste("HTTP Error"))
}
httr2::resp_body_json(r)
},
ignoreInit = TRUE
)
# Render to UI
output$pred <- renderText(pred()$predict[[1]])
output$vals <- renderPrint(vals())
}
# Run the application
shinyApp(ui = ui, server = server)
from shiny import App, render, ui, reactive
import requests
import logging
api_url = 'http://127.0.0.1:8080/predict'
logging.basicConfig(
format='%(asctime)s - %(message)s',
level=logging.INFO
)
app_ui = ui.page_fluid(
ui.panel_title("Penguin Mass Predictor"),
ui.layout_sidebar(
ui.panel_sidebar(
[ui.input_slider("bill_length", "Bill Length (mm)", 30, 60, 45, step = 0.1),
ui.input_select("sex", "Sex", ["Male", "Female"]),
ui.input_select("species", "Species", ["Adelie", "Chinstrap", "Gentoo"]),
ui.input_action_button("predict", "Predict")]
),
ui.panel_main(
ui.h2("Penguin Parameters"),
ui.output_text_verbatim("vals_out"),
ui.h2("Predicted Penguin Mass (g)"),
ui.output_text("pred_out")
)
)
)
def server(input, output, session):
logging.info("App start")
@reactive.Calc
def vals():
d = {
"bill_length_mm" : input.bill_length(),
"sex_Male" : input.sex() == "Male",
"species_Gentoo" : input.species() == "Gentoo",
"species_Chinstrap" : input.species() == "Chinstrap"
}
return d
@reactive.Calc
@reactive.event(input.predict)
def pred():
logging.info("Request Made")
r = requests.post(api_url, json = vals())
logging.info("Request Returned")
        if r.status_code != 200:
            # include the status code so the log is actionable
            logging.error("HTTP error returned: %s", r.status_code)
return r.json().get('predict')[0]
@output
@render.text
def vals_out():
return f"{vals()}"
@output
@render.text
def pred_out():
return f"{round(pred())}"
app = App(app_ui, server)
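The basicConfig above writes log records to the console. To persist them to a file instead (so a log aggregator can pick them up), basicConfig also accepts a filename; a minimal sketch, with app.log as an assumed location:

```python
import logging

logging.basicConfig(
    filename="app.log",  # append log records to this file instead of stderr
    format="%(asctime)s - %(message)s",
    level=logging.INFO,
)
logging.info("App start")  # now lands in app.log rather than the console
```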