compute_summary_metrics

pymc_marketing.mmm.evaluation.compute_summary_metrics(y_true, y_pred, metrics_to_calculate=None, hdi_prob=0.94)

Evaluate the model by calculating metric distributions and summarizing them.

This function combines the functionality of calculate_metric_distributions and summarize_metric_distributions.
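For context, a minimal sketch of the equivalent two-step workflow. It assumes y_true and y_pred are prepared as in the Examples below, and that the two helper functions live in the same module with signatures mirroring compute_summary_metrics; both are assumptions, not guarantees.

from pymc_marketing.mmm.evaluation import (
    calculate_metric_distributions,
    summarize_metric_distributions,
)

# Step 1 (assumed signature): one value of each metric per posterior draw
metric_distributions = calculate_metric_distributions(
    y_true=y_true,
    y_pred=y_pred,
    metrics_to_calculate=["r_squared", "rmse"],
)

# Step 2 (assumed signature): collapse each distribution into summary statistics
results = summarize_metric_distributions(metric_distributions, hdi_prob=0.94)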

Parameters:
y_true : npt.NDArray | pd.Series

The true values of the target variable.

y_pred : npt.NDArray | xr.DataArray

The predicted values of the target variable.

metrics_to_calculate : list of str or None, optional

List of metrics to calculate (a rough sketch of these formulas follows the parameter list). Options include:
  • r_squared: Bayesian R-squared.

  • rmse: Root Mean Squared Error.

  • nrmse: Normalized Root Mean Squared Error.

  • mae: Mean Absolute Error.

  • nmae: Normalized Mean Absolute Error.

  • mape: Mean Absolute Percentage Error.

Defaults to all metrics if None.

hdi_prob : float, optional

The probability mass of the highest density interval. Defaults to 0.94.
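For orientation, a rough NumPy sketch of the point-wise error metrics for a single vector of predictions; compute_summary_metrics evaluates these per posterior draw to build a distribution for each metric. Normalizing nrmse and nmae by the range of y_true is an assumption here, not necessarily the library's convention, and the Bayesian r_squared is omitted because it is not a simple point-wise formula.

import numpy as np

def sketch_point_metrics(y_true, y_pred_draw):
    # Illustrative formulas only, applied to one draw of predictions
    residuals = y_true - y_pred_draw
    rmse = np.sqrt(np.mean(residuals**2))
    mae = np.mean(np.abs(residuals))
    mape = np.mean(np.abs(residuals / y_true))
    scale = y_true.max() - y_true.min()  # assumed normalizer for the "n" variants
    return {
        "rmse": rmse,
        "nrmse": rmse / scale,
        "mae": mae,
        "nmae": mae / scale,
        "mape": mape,
    }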

Returns:
dict of str to dict

A dictionary mapping each metric name to a dictionary of its summary statistics (a short access sketch follows this list). The summary statistics calculated for each metric are:

  • mean: Mean of the metric distribution.

  • median: Median of the metric distribution.

  • std: Standard deviation of the metric distribution.

  • min: Minimum value of the metric distribution.

  • max: Maximum value of the metric distribution.

  • hdi_lower: Lower bound of the Highest Density Interval.

  • hdi_upper: Upper bound of the Highest Density Interval.
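A short access sketch, assuming results comes from the call in the Examples below (hdi_prob=0.89) and that the HDI keys are prefixed with the chosen probability mass, as in the printed output there; the exact key naming is an assumption.

rmse_summary = results["rmse"]
rmse_mean = rmse_summary["mean"]
# Assumed key naming, taken from the example output below
rmse_hdi = (rmse_summary["89%_hdi_lower"], rmse_summary["89%_hdi_upper"])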

Examples

Evaluation (error and model metrics) for a PyMC-Marketing MMM.

import pandas as pd

from pymc_marketing.mmm import (
    GeometricAdstock,
    LogisticSaturation,
    MMM,
)
from pymc_marketing.mmm.evaluation import compute_summary_metrics

# Usual PyMC-Marketing demo model code
data_url = "https://raw.githubusercontent.com/pymc-labs/pymc-marketing/main/data/mmm_example.csv"
data = pd.read_csv(data_url, parse_dates=["date_week"])

X = data.drop("y", axis=1)
y = data["y"]
mmm = MMM(
    adstock=GeometricAdstock(l_max=8),
    saturation=LogisticSaturation(),
    date_column="date_week",
    channel_columns=["x1", "x2"],
    control_columns=[
        "event_1",
        "event_2",
        "t",
    ],
    yearly_seasonality=2,
)
mmm.fit(X, y)

# Generate posterior predictive samples
posterior_preds = mmm.sample_posterior_predictive(X)

# Evaluate the model
results = compute_summary_metrics(
    y_true=mmm.y,
    y_pred=posterior_preds.y,
    metrics_to_calculate=["r_squared", "rmse", "mae"],
    hdi_prob=0.89,
)

# Print the results neatly
for metric, stats in results.items():
    print(f"{metric}:")
    for stat, value in stats.items():
        print(f"  {stat}: {value:.4f}")
    print()

# r_squared:
#   mean: 0.9055
#   median: 0.9061
#   std: 0.0098
#   min: 0.8669
#   max: 0.9371
#   89%_hdi_lower: 0.8891
#   89%_hdi_upper: 0.9198
#
# rmse:
#   mean: 351.9120
#   median: 351.0219
#   std: 19.4732
#   min: 290.6544
#   max: 418.0821
#   89%_hdi_lower: 317.0673
#   89%_hdi_upper: 378.1048
#
# mae:
#   mean: 281.6953
#   median: 281.2757
#   std: 16.3375
#   min: 234.1462
#   max: 337.9461
#   89%_hdi_lower: 255.7273
#   89%_hdi_upper: 307.2391
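Continuing from the example above, one possible follow-up (purely illustrative, not part of the library API) is to tabulate the nested dictionary with pandas for reporting:

import pandas as pd

# Outer keys (metrics) become columns, so transpose to get one row per metric
summary_table = pd.DataFrame(results).T
print(summary_table[["mean", "std"]])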