Lift Test Calibration#
Introduction#
You may have heard of the phrase “all models are wrong; some models are useful.” This is true in many areas, and it’s likely that after the first attempt, you haven’t yet created a model that can accurately determine your true attribution. Even if you have, how can you be sure? We create models to understand the attribution of our marketing channels, but it appears that even then, we can’t always rely on what the models tell us.
In order to ensure that our models are decomposing correctly, we can use various testing methods to gather real-world data and compare it with our models. This will help us to identify any discrepancies and improve the decomposition accuracy of our models.
Today, we will explore a new way to integrate experiments into pymc-marketing. This will bring us closer to accurate representations of real-world values and improve the estimates generated by our models.
Requirements#
Today, we won’t be discussing how to conduct various tests, but instead, we will focus on their utilization. If you wish to acquire knowledge on how to generate results that are compatible with your MMM models, you can check out CausalPy for conducting experiments.
Goal#
After reading this notebook, you will have gained the necessary expertise to incorporate the results (detected uplift from your experiments) into our regressive model.
This notebook will display using the add_lift_test_measurements
method of DelayedSaturatedMMM
and its workflow:
Build model:
mmm.build_model(X, y)
Add lift measurements:
mmm.add_lift_test_measurements(df_lift_test)
Sample posterior:
mmm.fit(X, y)
This is a case study of two correlated channels to see how lift tests help distinguish the channel effects.
from functools import partial
import arviz as az
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import pymc as pm
from pymc_marketing.mmm import DelayedSaturatedMMM
from pymc_marketing.mmm.transformers import logistic_saturation
az.style.use("arviz-darkgrid")
plt.rcParams["figure.figsize"] = [12, 7]
plt.rcParams["figure.dpi"] = 100
%config InlineBacken.figure_format = "retina"
Indistinguishable Parameter Estimates#
In order to show the trouble that completely correlated channels provides, let’s fit the model
fit_kwargs = {"nuts_sampler": "numpyro", "random_seed": rng}
idata_without = mmm.fit(X, y, **fit_kwargs)
Compiling...
An NVIDIA GPU may be present on this machine, but a CUDA-enabled jaxlib is not installed. Falling back to cpu.
Compilation time = 0:00:04.523585
Sampling...
Sampling time = 0:00:09.100238
Transforming variables...
Transformation time = 0:00:01.207358
Since the spends are completely correlated, there is no way to distinguish the parameters. Not only that, but the parameter estimates are not close to actuals.
Show code cell content
def plot_true_value(value, channel: str, ax: plt.Axes, split: float = 0.42) -> plt.Axes:
top = 2 * split
ymin, ymax = (0, split) if channel == "channel 2" else (split, top)
ax.axvline(value, ymin=ymin, ymax=ymax, color="black", linestyle="dashed")
return ax
ax = az.plot_forest(idata_without, var_names=["lam"], combined=True)[0]
plot_true_value(true_lam_c1, "channel 1", ax=ax, split=0.4)
plot_true_value(true_lam_c2, "channel 2", ax=ax, split=0.4);
ax = az.plot_forest(idata_without, var_names=["beta_channel"], combined=True)[0]
plot_true_value(true_beta_c1, "channel 1", ax=ax, split=0.4)
plot_true_value(true_beta_c2, "channel 2", ax=ax, split=0.4);
And that the direct response curve created from these parameters are completely overlapping.
Unobserved Saturation Curves#
These are the true saturation curve from our model internals. We don’t observe them.
There is a difference between channels. However, it is not possible to tell with just the MMM alone.
def saturation_function(x, lam, beta):
return (beta * logistic_saturation(x, lam)).eval()
step_size = 0.05
xx = np.arange(0, spend.max() * 1.1, step_size)
c1_curve_fn = partial(saturation_function, lam=true_lam_c1, beta=true_beta_c1)
c2_curve_fn = partial(saturation_function, lam=true_lam_c2, beta=true_beta_c2)
c1_curve = c1_curve_fn(xx)
c2_curve = c2_curve_fn(xx)
def plot_actual_curves(ax: plt.Axes, linestyle: str | None = None) -> plt.Axes:
ax.plot(xx, c1_curve, label="channel 1", color="C0", linestyle=linestyle)
ax.plot(xx, c2_curve, label="channel 2", color="C1", linestyle=linestyle)
return ax
def plot_reference(ax: plt.Axes) -> plt.Axes:
ax.plot(xx, xx, label="y=x", color="black", linestyle="--")
return ax
ax = plt.gca()
plot_actual_curves(ax)
plot_reference(ax)
ax.set(
xlabel="channel spend",
ylabel="channel contribution",
title="Actual Saturation Curves (Unobserved)",
)
ax.legend();
About Lift Tests#
In a lift study, one temporarily changes the budget of a channel for a fixed period of time, and then uses some method (for example CausalPy) to make inference about the change in sales directly caused by the adjustment.
A lift test is characterized by:
channel
: the channel that was testedx
: pre-test channel spenddelta_x
: change made to xdelta_y
: inferred change in sales due to delta_xsigma
: standard deviation of delta_y
An experiment characterized in this way can be viewed as two points on the saturation curve for the channel. Accordingly, lift test calibration is implemented by adding a term to the model likelihood, that makes the channel saturation curve (the contribution as a function of spend) align with the test observation.
def create_lift_test_from_actual_curve(
channel: str, x: float, delta_x: float, sigma: float
) -> dict[str, float]:
curve_fn = c1_curve_fn if channel == "channel 1" else c2_curve_fn
delta_y = curve_fn(x + delta_x) - curve_fn(x)
return {
"channel": channel,
"x": x,
"delta_x": delta_x,
"delta_y": delta_y,
"sigma": sigma,
}
df_lift_test = pd.DataFrame(
[
# Channel x1
create_lift_test_from_actual_curve("channel 1", 0.0, 0.05, 0.05),
create_lift_test_from_actual_curve("channel 1", 0.15, 0.05, 0.05),
create_lift_test_from_actual_curve("channel 1", 0.3, 0.05, 0.05),
# Channel x2
create_lift_test_from_actual_curve("channel 2", 0.5, 0.05, 0.10),
]
)
df_lift_test
channel | x | delta_x | delta_y | sigma | |
---|---|---|---|---|---|
0 | channel 1 | 0.00 | 0.05 | 0.134705 | 0.05 |
1 | channel 1 | 0.15 | 0.05 | 0.069545 | 0.05 |
2 | channel 1 | 0.30 | 0.05 | 0.019925 | 0.05 |
3 | channel 2 | 0.50 | 0.05 | 0.032236 | 0.10 |
Show code cell content
def plot_triangle(
x,
delta_x,
delta_y,
color: str,
ax: plt.Axes,
offset: float = 0,
label: str | None = None,
) -> plt.Axes:
x_after = x + delta_x
y = offset
y_after = y + delta_y
ax.plot([x, x_after], [y, y], color=color, label=label)
ax.plot([x_after, x_after], [y, y_after], color=color)
ax.plot([x, x_after], [y, y_after], color=color, linestyle="dashed")
return ax
def plot_channel_triangles(
df: pd.DataFrame, color: str, ax: plt.Axes, label: str
) -> plt.Axes:
kwargs = {"label": label}
for _, row in df.iterrows():
plot_triangle(
row["x"], row["delta_x"], row["delta_y"], ax=ax, color=color, **kwargs
)
if "label" in kwargs:
kwargs.pop("label")
return ax
def plot_lift_test_triangles(df: pd.DataFrame, ax: plt.Axes) -> plt.Axes:
idx = df["channel"] == "channel 1"
plot_channel_triangles(df.loc[idx], color="C0", ax=ax, label="channel 1")
plot_channel_triangles(df.loc[~idx], color="C1", ax=ax, label="channel 2")
return ax
Below we are visualizing each lift test observation as a triangle. The base of the triangle shows the change in the spend and height of triangle is change in contribution. Each base is at y=0 since only the change in contribution is observed.
Even though these are not the saturation curve, they can be used to better estimate the saturation curve. We can see that channel 2 is slower to saturation since the change in y is larger for higher x values.
Add Lift Tests to Model#
Having created a DelayedSaturatedMMM
model instance, mmm
and built it using the build_model
method or fit with fit
method, we can add lift test results to the model using the add_lift_test_measurments
method.
mmm.add_lift_test_measurements(df_lift_test)
We can see the model graph is modified with new observation for our lift measurements.
The observation distribution is assumed to be Gamma
as each saturation curve is monotonically increasing given a set of parameters.
We can refit the model but with the lift tests included
idata_with = mmm.fit(X, y, **fit_kwargs)
Compiling...
Compilation time = 0:00:05.530146
Sampling...
Sampling time = 0:00:08.329839
Transforming variables...
Transformation time = 0:00:00.701632
The model gets shaped by the lift test measurements and the response curves begin to separate!
Show code cell content
def plot_channel_rug(
df: pd.DataFrame, color: str, ax: plt.Axes, height: float
) -> plt.Axes:
for x in df["x"].to_numpy():
ax.axvline(x, ymin=0, ymax=height, color=color)
return ax
def plot_lift_test_rug(df, ax, height: float = 0.05) -> plt.Axes:
idx = df["channel"] == "channel 1"
plot_channel_rug(df.loc[idx], color="C0", ax=ax, height=height)
plot_channel_rug(df.loc[~idx], color="C1", ax=ax, height=height)
return ax
fig = mmm.plot_direct_contribution_curves(same_axes=True)
ax = fig.axes[0]
plot_actual_curves(ax=ax, linestyle="dashed")
plot_lift_test_rug(df_lift_test, ax);
This can be seen in the parameter estimates that make up this response curve
Show code cell content
def plot_comparison(data, model_names, var_name: str) -> plt.Axes:
return az.plot_forest(
data,
model_names=model_names,
var_names=[var_name],
combined=True,
)[0]
data = [idata_without, idata_with]
model_names = ["without lift tests", "with lift tests"]
ax = plot_comparison(data, model_names, "lam")
plot_true_value(true_lam_c1, "channel 1", ax)
plot_true_value(true_lam_c2, "channel 2", ax);
Add Additional Lift Tests#
We can add even more lift tests.
They can either all be added at one time or separately. Use the name
parameter in order to separate the two sets of observations in the model graph.
df_additional_lift_test = pd.DataFrame(
[
# More for Channel x1
create_lift_test_from_actual_curve("channel 1", 0.1, 0.05, sigma=0.01),
create_lift_test_from_actual_curve("channel 1", 0.5, 0.05, sigma=0.01),
# More for channel x2
create_lift_test_from_actual_curve("channel 2", 0.3, 0.05, sigma=0.01),
]
)
df_additional_lift_test
channel | x | delta_x | delta_y | sigma | |
---|---|---|---|---|---|
0 | channel 1 | 0.1 | 0.05 | 0.095167 | 0.01 |
1 | channel 1 | 0.5 | 0.05 | 0.002885 | 0.01 |
2 | channel 2 | 0.3 | 0.05 | 0.035354 | 0.01 |
mmm.add_lift_test_measurements(df_additional_lift_test, name="more_lift_measurements")
pm.model_to_graphviz(mmm.model)
idata_with_more = mmm.fit(X, y, **fit_kwargs)
Compiling...
Compilation time = 0:00:06.384169
Sampling...
Sampling time = 0:00:09.624948
Transforming variables...
Transformation time = 0:00:00.746753
The response curve is shifting more and more
df_all_lift_tests = pd.concat([df_lift_test, df_additional_lift_test])
fig = mmm.plot_direct_contribution_curves(same_axes=True)
ax = fig.axes[0]
plot_actual_curves(ax=ax, linestyle="dashed")
plot_lift_test_rug(df_all_lift_tests, ax);
data = [idata_without, idata_with, idata_with_more]
model_names = ["without lift tests", "with lift tests", "with more lift tests"]
ax = plot_comparison(data, model_names, "lam")
plot_true_value(true_lam_c1, "channel 1", ax, split=0.435)
plot_true_value(true_lam_c2, "channel 2", ax, split=0.435);
Conclusion#
The add_lift_test_measurements
method can be used in order to incorporate experiments into our model likelihood and nudge the model parameters closer to the actuals in this example.
Conducting various experiments for each channel at various spends will bring the best results.