mlflow

MLflow logging utilities for PyMC models.

This module provides utilities for logging various aspects of PyMC models to MLflow. The same logging is extended to PyMC-Marketing models.

Autologging is supported for PyMC models and PyMC-Marketing models. This includes logging of sampler diagnostics, model information, data used in the model, and InferenceData objects.

Autologging is enabled by calling the autolog function. The following functions are patched, and each patched call performs the logging steps listed under it:

pymc.sample:
- log_versions(): Log the versions of PyMC-Marketing, PyMC, and ArviZ to MLflow.
- log_model_derived_info(): Log types of parameters, coords, model graph, etc.
- log_sample_diagnostics(): Log information derived from the InferenceData object.
- log_arviz_summary(): Log a table of summary statistics about the estimated parameters.
- log_metadata(): Log the metadata of the data used in the model.
- log_error(): Log the traceback and exception if an error occurs during sampling.
- Stamp the active MLflow run id on idata.attrs["mlflow_run_id"].
pymc.find_MAP:
- log_model_derived_info(): Log types of parameters, coords, model graph, etc.
MMM.fit:
- All parameters, metrics, and artifacts from pymc.sample
- log_mmm_configuration(): Log the configuration of the MMM model.
- Stamp the active MLflow run id on idata.attrs["mlflow_run_id"].
CLVModel.fit:
- Information that depends on the fit method used (MCMC or MAP).
- Model type and fit method.
- Stamp the active MLflow run id on idata.attrs["mlflow_run_id"].
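
The patching pattern described above can be sketched with the standard library alone. This is an illustration of the general technique, not the library's implementation: `fake_sample`, `patch_with_logging`, and the `log` list are hypothetical stand-ins for `pymc.sample`, the patch that `autolog` applies, and the MLflow logging calls. The sketch also assumes that an error is logged and then re-raised.

```python
import functools
import traceback


def patch_with_logging(func, log, run_id):
    """Wrap ``func`` so that results and errors are recorded in ``log``."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            result = func(*args, **kwargs)
        except Exception as exc:
            # Comparable to log_error(): record the exception and traceback,
            # then re-raise so the caller still sees the failure.
            log.append(("error", type(exc).__name__, traceback.format_exc()))
            raise
        # Comparable to stamping idata.attrs["mlflow_run_id"].
        result["attrs"]["mlflow_run_id"] = run_id
        # Comparable to logging diagnostics derived from the result.
        log.append(("diagnostics", result["draws"]))
        return result

    return wrapper


def fake_sample(draws=1000):
    """Hypothetical stand-in for pymc.sample; returns a dict, not InferenceData."""
    if draws <= 0:
        raise ValueError("draws must be positive")
    return {"draws": draws, "attrs": {}}


log = []
patched_sample = patch_with_logging(fake_sample, log, run_id="run-123")
result = patched_sample(draws=500)
```

Because the wrapper keeps the original call signature, existing code that calls the patched function does not need to change.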

Examples

Autologging for a PyMC model:

import mlflow

import pymc as pm

import pymc_marketing.mlflow

pymc_marketing.mlflow.autolog()

# Usual PyMC model code
with pm.Model() as model:
    mu = pm.Normal("mu", mu=0, sigma=1)
    obs = pm.Normal("obs", mu=mu, sigma=1, observed=[1, 2, 3])

# Incorporate into MLflow workflow
mlflow.set_experiment("PyMC Experiment")

with mlflow.start_run():
    idata = pm.sample(model=model)

Autologging for a PyMC-Marketing MMM:

import pandas as pd

import mlflow

from pymc_marketing.mmm import (
    GeometricAdstock,
    LogisticSaturation,
    MMM,
)
from pymc_marketing.paths import data_dir
import pymc_marketing.mlflow

pymc_marketing.mlflow.autolog(log_mmm=True)

# Usual PyMC-Marketing model code

file_path = data_dir / "mmm_example.csv"
data = pd.read_csv(file_path, parse_dates=["date_week"])

X = data.drop("y", axis=1)
y = data["y"]

mmm = MMM(
    adstock=GeometricAdstock(l_max=8),
    saturation=LogisticSaturation(),
    date_column="date_week",
    channel_columns=["x1", "x2"],
    control_columns=[
        "event_1",
        "event_2",
        "t",
    ],
    yearly_seasonality=2,
)

# Incorporate into MLflow workflow

mlflow.set_experiment("MMM Experiment")

with mlflow.start_run():
    idata = mmm.fit(X, y)

    # Additional specific logging
    fig, _ = mmm.plot.contributions_over_time(var=["channel_contribution"])
    mlflow.log_figure(fig, "components.png")

Autologging for a PyMC-Marketing CLV model:

import pandas as pd

import mlflow

from pymc_marketing.clv import BetaGeoModel
from pymc_marketing.paths import data_dir

import pymc_marketing.mlflow

pymc_marketing.mlflow.autolog(log_clv=True)

mlflow.set_experiment("CLV Experiment")

file_path = data_dir / "clv_quickstart.csv"
data = pd.read_csv(file_path)
data["customer_id"] = data.index

model = BetaGeoModel(data=data)

with mlflow.start_run():
    model.fit()

Functions

`autolog`([log_sampler_info, ...])	Autologging support for PyMC models and PyMC-Marketing models.
`create_log_callback`([stats, parameters, ...])	Create callback function to log sample stats and parameter values to MLflow during sampling.
`load_mmm`(run_id[, full_model, keep_idata, ...])	Load a PyMC-Marketing MMM model from MLflow.
`log_arviz_summary`(idata, path[, var_names])	Log the ArviZ summary as an artifact on MLflow.
`log_error`(func, file_name)	Log arbitrary caught error and traceback to MLflow.
`log_inference_data`(idata[, save_file])	Log the InferenceData to MLflow.
`log_likelihood_type`(model)	Save the likelihood type of the model to MLflow.
`log_metadata`(model, idata)	Log the metadata of the data used in the model to MLflow.
`log_mmm`(mmm[, artifact_path, ...])	Log a PyMC-Marketing MMM as a native MLflow model for the current run.
`log_mmm_configuration`(mmm)	Log the configuration of the MMM model to MLflow.
`log_mmm_evaluation_metrics`(y_true, y_pred[, ...])	Log evaluation metrics produced by `pymc_marketing.mmm.evaluation.compute_summary_metrics()` to MLflow.
`log_model_derived_info`(model)	Log various model derived information to MLflow.
`log_model_graph`(model, path)	Log the model graph PDF as artifact on MLflow.
`log_sample_diagnostics`(idata[, tune])	Log sample diagnostics to MLflow.
`log_types_of_parameters`(model)	Log the types of parameters in a PyMC model to MLflow.
`log_versions`()	Log the versions of PyMC-Marketing, PyMC, and ArviZ to MLflow.
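
As a rough illustration of the idea behind `create_log_callback`, the sketch below builds a throttled sampling callback using only the standard library. It is not the library's code. `make_log_callback`, `FakeDraw`, and `record` are hypothetical names, the `exclude_tuning` and `take_every` parameters are assumptions, and the draw object is assumed to expose `draw_idx`, `tuning`, `stats`, and `point` fields in the style of the arguments PyMC passes to a sampling callback.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDraw:
    """Hypothetical stand-in for the draw object given to a sampling callback."""

    draw_idx: int
    tuning: bool
    stats: list = field(default_factory=list)  # one dict per step method
    point: dict = field(default_factory=dict)  # current parameter values


def make_log_callback(log_metric, stats=(), parameters=(),
                      exclude_tuning=True, take_every=100):
    """Return a callback that logs selected values every ``take_every`` draws."""

    def callback(trace, draw):
        if exclude_tuning and draw.tuning:
            return  # skip tuning draws
        if draw.draw_idx % take_every != 0:
            return  # throttle: log only every ``take_every``-th draw
        for stat in stats:
            # Assumption: read sampler stats from the first step method.
            log_metric(stat, draw.stats[0][stat], step=draw.draw_idx)
        for name in parameters:
            log_metric(name, draw.point[name], step=draw.draw_idx)

    return callback


records = []


def record(key, value, step):
    """Stand-in for a metric logger; collects values in a list."""
    records.append((key, value, step))


callback = make_log_callback(record, stats=["energy"], parameters=["mu"])

# Simulate 250 posterior draws followed by one tuning draw.
for i in range(250):
    callback(trace=None, draw=FakeDraw(i, False, [{"energy": float(i)}], {"mu": i / 10}))
callback(trace=None, draw=FakeDraw(0, True, [{"energy": -1.0}], {"mu": -1.0}))
```

Throttling matters here because logging a metric on every draw would add a tracking call to each sampling step. In this sketch, `mlflow.log_metric` could be passed in place of `record`, since it accepts a key, a value, and a `step` keyword.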

Classes

`MMMWrapper`(model[, predict_method, ...])	A class to prepare a PyMC-Marketing Mix Model (MMM) for logging and registering in MLflow.