Python processing layer

Published

2025-09-23

Caution

This section is being revised. Thank you for your patience.

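Below is the code I included in _labs/lab2/model-vetiver/model-vetiver.qmd. It starts with an R chunk that sets up the Python virtual environment through reticulate; everything after that is Python.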

# R chunk: point reticulate at the virtual environment and install the Python packages
library(reticulate)
use_virtualenv("myenv", required = TRUE)
virtualenv_install(
  envname = "myenv",
  packages = c("palmerpenguins", "pandas", "numpy",
               "scikit-learn", "duckdb", "vetiver",
               "pins"))

# Python chunk: imports
from palmerpenguins import load_penguins  # added this for the penguins data
from pandas import get_dummies
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn import preprocessing
import duckdb

Get Data

penguins_data = load_penguins()  # load the data that will go into DuckDB
con = duckdb.connect('my-db.duckdb')
# Create the penguins table in the database; DuckDB resolves penguins_data to the DataFrame above
con.execute("CREATE OR REPLACE TABLE penguins AS SELECT * FROM penguins_data")
df = con.execute("SELECT * FROM penguins").fetchdf().dropna()
con.close()

df.head(3)
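
To make the effect of the .dropna() call in the Get Data step concrete, here is a minimal sketch on a hypothetical three-row frame. The values are made up for illustration and are not taken from the penguins data. With default arguments, any row that has a missing value in any column is removed, so the model only sees complete rows.

```python
import numpy as np
import pandas as pd

# Hypothetical toy rows; the values are illustrative only.
toy = pd.DataFrame({
    "bill_length_mm": [39.1, np.nan, 46.5],
    "sex": ["male", "female", None],
    "body_mass_g": [3750.0, 3800.0, 5200.0],
})

# dropna() with default arguments removes every row that contains
# at least one missing value.
complete = toy.dropna()
print(len(toy), "rows before,", len(complete), "rows after")
```

Only the first toy row survives, because the other two each have one missing field.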

Define Model and Fit

X = get_dummies(df[['bill_length_mm', 'species', 'sex']], drop_first = True)
y = df['body_mass_g']

model = LinearRegression().fit(X, y)
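
As a rough illustration of what get_dummies with drop_first = True does to the predictors, the sketch below encodes a hypothetical three-row frame that mirrors the three columns used above. The rows are invented; only the column names come from the lab code.

```python
import pandas as pd
from pandas import get_dummies

# Hypothetical toy rows that mirror the three predictor columns used above.
toy = pd.DataFrame({
    "bill_length_mm": [39.1, 46.5, 49.0],
    "species": ["Adelie", "Gentoo", "Chinstrap"],
    "sex": ["male", "female", "male"],
})

# drop_first=True drops the first (alphabetical) level of each categorical
# column, so that level becomes the baseline absorbed by the intercept.
X_toy = get_dummies(toy, drop_first=True)
print(list(X_toy.columns))
```

In this toy frame, Adelie and female are the dropped baseline levels, so the remaining coefficients are read as differences from an Adelie female penguin.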

Get some information

print(f"R^2 {model.score(X,y)}")
print(f"Intercept {model.intercept_}")
print(f"Columns {X.columns}")
print(f"Coefficients {model.coef_}")
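
The intercept and coefficients printed above fully determine the model's predictions. A minimal sketch, using made-up toy data rather than the penguins data, shows that intercept plus the dot product of features and coefficients reproduces what predict returns:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical toy data: two predictors and a response.
X_toy = np.array([[1.0, 0.0], [2.0, 1.0], [3.0, 0.0], [4.0, 1.0]])
y_toy = np.array([3.0, 6.5, 7.0, 10.5])

toy_model = LinearRegression().fit(X_toy, y_toy)

# A linear model's prediction is the intercept plus the dot product
# of the feature row and the coefficient vector.
manual = toy_model.intercept_ + X_toy @ toy_model.coef_
print(np.allclose(manual, toy_model.predict(X_toy)))
```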

Turn into Vetiver Model

from vetiver import VetiverModel
v = VetiverModel(model, model_name='penguin_model', prototype_data=X)

Save to Board

import os  # for creating the folder
from pins import board_folder
from vetiver import vetiver_pin_write

# Store the board under the current working directory (./models)
model_board = board_folder("./models", allow_pickle_read=True)

vetiver_pin_write(model_board, v)

Turn model into API

from vetiver import VetiverAPI
app = VetiverAPI(v, check_prototype = True)
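
With check_prototype = True, incoming data is checked against the prototype built from X. As a rough sketch, a request body for a prediction could be assembled as below. Two things here are assumptions rather than facts from the lab: that the model's columns are the ones get_dummies typically produces for this data, and that the prediction endpoint accepts a JSON list of records. Check the column names against the Columns output printed earlier before relying on them.

```python
import json

# Assumed column names; confirm against X.columns from the fitted model.
columns = ["bill_length_mm", "species_Chinstrap", "species_Gentoo", "sex_male"]

# One hypothetical penguin: a male Gentoo with a 45.0 mm bill.
record = dict(zip(columns, [45.0, False, True, True]))

# Assumption: the endpoint takes a JSON array of records, one object per row.
body = json.dumps([record])
print(body)
```

Depending on the pandas version, the dummy columns may be boolean or integer, so the values sent for them may need to be 0/1 instead of false/true.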

The original code from lab 2 (model-vetiver.qmd) is below:

---
title: "Model"
format:
  html:
    code-fold: true
---

```{python}
from pandas import get_dummies
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn import preprocessing
```

## Get Data

```{python}
import duckdb
con = duckdb.connect('my-db.duckdb')
df = con.execute("SELECT * FROM penguins").fetchdf().dropna()
con.close()

df.head(3)
```

## Define Model and Fit

```{python}
X = get_dummies(df[['bill_length_mm', 'species', 'sex']], drop_first = True)
y = df['body_mass_g']

model = LinearRegression().fit(X, y)
```

## Get some information

```{python}
print(f"R^2 {model.score(X,y)}")
print(f"Intercept {model.intercept_}")
print(f"Columns {X.columns}")
print(f"Coefficients {model.coef_}")
```

## Turn into Vetiver Model

```{python}
from vetiver import VetiverModel
v = VetiverModel(model, model_name='penguin_model', prototype_data=X)
```

## Save to Board

```{python}
from pins import board_folder
from vetiver import vetiver_pin_write

model_board = board_folder("/data/model", allow_pickle_read = True)
vetiver_pin_write(model_board, v)
```

## Turn model into API

```{python}
from vetiver import VetiverAPI
app = VetiverAPI(v, check_prototype = True)
```