Custom Models with MMM components#

The underlying components used in the MMM class provide flexibility to build other, custom models. With a little knowledge of PyMC and how to customize these PyMC-Marketing components, a lot of different use-cases can be covered.

This notebook is not an introduction but rather an advance example for those trying to understand the PyMC-Marketing internals for flexibility for custom use-cases.

Overview#

This notebook will cover the currently exposed model components from the PyMC-Marketing API. At the moment, this includes:

media transformations
- adstock: how today’s media has an effect in the future
- saturation: the diminishing returns for media

For each of these, the flexibility and customization will be showcased and combined together in a toy model.

Setup#

from functools import partial

import arviz as az
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import pymc as pm
import xarray as xr

from pymc_marketing import mmm

az.style.use("arviz-darkgrid")
plt.rcParams["figure.figsize"] = [12, 7]
plt.rcParams["figure.dpi"] = 100

%config InlineBackend.figure_format = "retina"

seed = sum(map(ord, "PyMC-Marketing provides flexible model components"))
rng = np.random.default_rng(seed)

draw = partial(pm.draw, random_seed=rng)

Media Transformations#

There are classes for each of the adstock and saturation transformations. They can be imported from the pymc_marketing.mmm module.

saturation = mmm.MichaelisMentenSaturation()

Saturation curves can take many different forms. In this example, we will use the Michaelis Menten curve which we provide in the MichaelisMentenSaturation class.

This curve has two parameters, alpha and lam.

A characteristic of these curves are diminishing returns in order to indicate saturation of a media variable. This can be seen in the plateauing as x increases.

../../_images/be123340f288f5ab0078122415c61fdba9fab378e3e0fe4c0cacd9dd94bd5bd2.png

Adding Parameter Dimensions#

In most cases, a separate saturation function will be estimated for each media channel. A dimension needs to be added to the prior of the function parameters to account for this.

Let’s create some example data to work toward this transformation.

For this example, we will have 2 years of media spend for 4 channels

n_dates = 52 * 2
dates = pd.date_range("2023-01-01", periods=n_dates, freq="W-MON")

channels = ["C1", "C2", "C3", "C4"]

coords = {
    "date": dates,
    "channel": channels,
}

df_spends = random_spends(coords=coords).to_pandas()
df_spends.head()

channel	C1	C2	C3	C4
date
2023-01-02	0.0	0.992756	0.000000	0.000000
2023-01-09	0.0	0.902103	0.203395	0.000000
2023-01-16	0.0	0.559487	0.000000	0.000000
2023-01-23	0.0	0.990124	0.000000	2.013755
2023-01-30	0.0	0.753384	0.000000	0.377570

../../_images/796cd11e2138edce933b131d3bede0700242c061f81b8fe02f5d3b078741177d.png

As mentioned, the default priors do not have a channel dimension. In order to use with the in our model with “channel” dim, we have to add the dims to each of the function priors.

for config in saturation.function_priors.values():
    config["dims"] = "channel"

saturation.function_priors

{'alpha': {'dist': 'Gamma',
  'kwargs': {'mu': 2, 'sigma': 1},
  'dims': 'channel'},
 'lam': {'dist': 'HalfNormal', 'kwargs': {'sigma': 1}, 'dims': 'channel'}}

The previous workflow can be used to understand our priors still. Just pass the coords to the sample_prior method in order to add dims to the appropriate variables.

Since each channel prior is the same, there will just be some noise between the HDI and curve samples.

curve = saturation.sample_curve(prior)
saturation.plot_curve(curve, subplot_kwargs={"sharey": True});

Sampling: []

../../_images/ce317ca284fa45f778b87a51d39153069bdcea01cfc6394ed22fa24033a39141.png

Using in PyMC Model#

When using the transformation in a larger PyMC model, the apply method will be used.

This method will:

create distributions based on prior specification of the instance
apply the transformation to the data

The dims parameter is the shape of parameters and not the data. The data has a different shape but will need to be broadcastable with the parameters!

with pm.Model(coords=coords) as model:
    saturated_spends = saturation.apply(df_spends, dims="channel")

Since independent alpha and lam were specified, we see that in the model graph below:

pm.model_to_graphviz(model)

../../_images/229a9584ac27e481679ad2232e175de3e0c836ed25eb22824e9204e42c2a258a.svg

Note

Neither the df_spends nor saturated_spends show in the model. If needed, use pm.Data and pm.Deterministic to save off.

Our variable will be (date, channel) dims.

saturated_spends.type.shape

(104, 4)

We can manipulate this in anyway we’d like to connect it in with the larger model.

Changing Assumptions#

As hinted above, the priors for the function parameters are customizable which can lead to many different models. Change the priors, change the model.

The prior distributions just need to follow the distribution API here.

Instead of the defaults, we can use:

hierarchical parameter for lam parameter
common alpha parameter

lam_distribution = {
    "dist": "HalfNormal",
    "kwargs": {
        "sigma": {
            "dist": "HalfNormal",
            "kwargs": {"sigma": 1},
        }
    },
    "dims": "channel",
}
priors = {
    "lam": lam_distribution,
}

saturation = mmm.MichaelisMentenSaturation(priors=priors)

saturation.function_priors

{'alpha': {'dist': 'Gamma', 'kwargs': {'mu': 2, 'sigma': 1}},
 'lam': {'dist': 'HalfNormal',
  'kwargs': {'sigma': {'dist': 'HalfNormal', 'kwargs': {'sigma': 1}}},
  'dims': 'channel'}}

Then this can be used in a new PyMC model which leads to a much different model graph than before!

with pm.Model(coords=coords) as model:
    saturated_spends = saturation.apply(df_spends, dims="channel")

pm.model_to_graphviz(model)

../../_images/a4f0ccf342e8c9ba825560a18df2243005a5fa138800e52926ebe1aa8e39a315.svg

The shape of the output will still be (date, channel) even though some of the parameter’s dims has changed.

saturated_spends.type.shape

(104, 4)

The previous workflow still helps us understand the produced curves:

sample_prior
sample_curve
plot_curve

prior = saturation.sample_prior(coords=coords, random_seed=rng)

Sampling: [saturation_alpha, saturation_lam, saturation_lam_sigma]

Though they all look the same in the prior, the data generation process is indeed different as seen in the model graph.

curve = saturation.sample_curve(prior)
saturation.plot_curve(curve, subplot_kwargs={"sharey": True});

Sampling: [saturation_lam, saturation_lam_sigma]

../../_images/f228defff8e8fa1f5c08be295ddc4a41ae6e3fc50e955d80edcdddc5bb296072.png

Geo Hierarchical Model#

The dimensions of the parameters are not limited to 1D so additional hierarchies can be defined.

Below defines:

alpha which is hierarchical across channels
lam which is common across all geos but different channels

# For reference
mmm.MichaelisMentenSaturation.default_priors

{'alpha': {'dist': 'Gamma', 'kwargs': {'mu': 2, 'sigma': 1}},
 'lam': {'dist': 'HalfNormal', 'kwargs': {'sigma': 1}}}

priors = {
    "alpha": {
        "dist": "Gamma",
        "kwargs": {
            "mu": {"dist": "HalfNormal", "kwargs": {"sigma": 1}, "dims": "geo"},
            "sigma": {"dist": "HalfNormal", "kwargs": {"sigma": 1}, "dims": "geo"},
        },
        "dims": ("channel", "geo"),
    },
    "lam": {
        "dist": "HalfNormal",
        "kwargs": {"sigma": 1},
        "dims": "channel",
    },
}
saturation = mmm.MichaelisMentenSaturation(priors=priors)

Our new data set needs to have information for geo now. This is channel spends by date and geo. This is stored in an xarray.DataArray which can be converted to a 3D numpy.ndarray.

Displaying the data is easy with pandas.

geo_coords = {
    **coords,
    "geo": ["Region1", "Region2", "Region3"],
}

geo_spends = random_spends(coords=geo_coords)

geo_spends.to_series().unstack("channel").head(6)

	channel	C1	C2	C3	C4
date	geo
2023-01-02	Region1	0.000000	0.000000	0.541483	0.0
	Region2	0.213863	0.000000	0.000000	0.0
	Region3	1.100906	0.692670	1.493263	0.0
2023-01-09	Region1	0.309415	0.614882	0.818824	0.0
	Region2	2.264686	0.930828	0.028902	0.0
	Region3	0.433401	0.660699	1.318996	0.0

As long as the dims argument of apply can broadcast with the data going in, then the media transformations can be used!

Here, the data is in the shape (date, channel, geo) so it can broadcast with the parameters in shape (channel, geo) to create the saturated spends.

with pm.Model(coords=geo_coords) as geo_model:
    geo_data = pm.Data(
        "geo_data",
        geo_spends.to_numpy(),
        dims=("date", "channel", "geo"),
    )
    saturated_geo_spends = pm.Deterministic(
        "saturated_geo_spends",
        saturation.apply(geo_data, dims=("channel", "geo")),
        dims=("date", "channel", "geo"),
    )

The saturation assumptions can be seen in the model graph:

pm.model_to_graphviz(geo_model)

../../_images/da65bc5428d897474dddfc9420c81824859f8065fc29f825bc834ce33e9563cd.svg

Tip

The PyMC model context will stay the same but changing model assumptions will happen with input data and prior configuration!

Example Custom MMM#

Lots of flexibility by combining them together. This will build off the example above and now include an adstock transformation of the data as well.

def create_media_transformation(adstock, saturation, adstock_first: bool = True):
    """Flexible media transformation which allows for order to transformations."""
    first, second = (adstock, saturation) if adstock_first else (saturation, adstock)

    def media_transformation(x, dims):
        return second.apply(first.apply(x, dims=dims), dims=dims)

    return media_transformation

Our adstock function will have a hierarchical parameter for each geo. Our configuration is flexible enough to change this as long as the final dims will broadcast with the data!

# For reference
mmm.GeometricAdstock.default_priors

{'alpha': {'dist': 'Beta', 'kwargs': {'alpha': 1, 'beta': 3}}}

These are also hierarchical parameters but across a different dimensions than the saturation transformation.

priors = {
    "alpha": {
        "dist": "Beta",
        "kwargs": {
            "alpha": {"dist": "HalfNormal", "kwargs": {"sigma": 1}, "dims": "channel"},
            "beta": {"dist": "HalfNormal", "kwargs": {"sigma": 1}, "dims": "channel"},
        },
        "dims": ("channel", "geo"),
    },
}
adstock = mmm.GeometricAdstock(l_max=10, priors=priors)

media_transformation = create_media_transformation(
    adstock, saturation, adstock_first=True
)

with pm.Model(coords=geo_coords) as geo_model:
    geo_data = pm.Data(
        "geo_data",
        geo_spends.to_numpy(),
        dims=("date", "channel", "geo"),
    )
    saturated_geo_spends = pm.Deterministic(
        "channel_contributions",
        media_transformation(geo_data, dims=("channel", "geo")),
        dims=("date", "channel", "geo"),
    )

The combined data generation process can be seen below:

pm.model_to_graphviz(geo_model)

../../_images/047b1eae44872c6b5c436d90da3140c6730c7b175b61f1a8aa48712b421e9e2b.svg

Summary#

Custom models are possible using the components that build up the MMM class. With some prior distribution configuration and the components that PyMC-Marketing provides, novel models can be built up to fit various use-cases and various model assumptions.

Much of the flexibility will come from the prior distribution configuration rather then the transformation themselves. This is meant to keep a standard interface while working with them regardless what their role is.

If there is any suggestions or feedback on how to make better custom models with the package, create a GitHub Issue or chime into the various discussions.

Though models can be built up like this, the prebuilt structures provide many benefits as well. For instance, the MMM class provides:

scaling of input and output data
plotting methods for parameters, predictive data, contributions, etc
customized adstock and saturation transformations
out of sample predictions
lift test integration
budget optimization

Our recommendation is to start with the prebuilt models and work up from there.

%load_ext watermark
%watermark -n -u -v -iv -w -p pytensor

Last updated: Tue Jun 18 2024

Python implementation: CPython
Python version       : 3.10.13
IPython version      : 8.22.2

pytensor: 2.20.0

matplotlib    : 3.8.3
numpy         : 1.26.4
xarray        : 2024.2.0
pymc          : 5.14.0
pandas        : 2.2.1
pymc_marketing: 0.6.0
arviz         : 0.17.0

Watermark: 2.4.3