ShiftedBetaGeoModel

class pymc_marketing.clv.models.shifted_beta_geo.ShiftedBetaGeoModel(data=None, *, model_config=None, sampler_config=None)

Shifted Beta Geometric (sBG) model for customers renewing contracts over discrete time periods.

The sBG model has the following assumptions:

  • Dropout probabilities for each cohort are Beta-distributed with hyperparameters alpha and beta.

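  • Cohort retention rates increase over time due to customer heterogeneity: customers with high dropout probabilities tend to leave first, so the remaining customers are on average more loyal.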

  • Customers in the same cohort began their contract in the same time period.
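
To make these assumptions concrete, the sketch below evaluates the closed-form sBG survival function and retention rate given by Fader & Hardie [1] for a single cohort. It is a standalone illustration using only the standard library, not the library's internal implementation; the function names and the values chosen for alpha and beta are hypothetical.

```python
from math import exp, lgamma


def log_beta(a: float, b: float) -> float:
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def sbg_survival(t: int, alpha: float, beta: float) -> float:
    """Probability of surviving t renewal opportunities: B(alpha, beta + t) / B(alpha, beta)."""
    return exp(log_beta(alpha, beta + t) - log_beta(alpha, beta))


def sbg_retention_rate(t: int, alpha: float, beta: float) -> float:
    """Cohort retention rate in period t: S(t) / S(t - 1) = (beta + t - 1) / (alpha + beta + t - 1)."""
    return (beta + t - 1) / (alpha + beta + t - 1)


alpha, beta = 1.0, 2.0  # illustrative values only, not fitted estimates
survival = [sbg_survival(t, alpha, beta) for t in range(0, 6)]
rates = [sbg_retention_rate(t, alpha, beta) for t in range(1, 6)]
```

Each customer's own dropout probability is constant in this model, yet `rates` rises with `t` because the mix of surviving customers shifts toward those with low dropout probabilities.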

This model requires data to be summarized by recency, T, and cohort for each customer. Modeling assumptions require 1 <= recency <= T, and T >= 2.
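
The constraints above can be checked before fitting. The following is a minimal sketch that works on plain dictionaries so that it runs without extra dependencies; the helper name `check_sbg_rows` is hypothetical and is not part of pymc-marketing. The same checks carry over directly to DataFrame columns.

```python
def check_sbg_rows(rows):
    """Return a list of human-readable problems; an empty list means the rows pass."""
    problems = []
    cohort_T = {}  # first T seen for each cohort
    for row in rows:
        cid, recency, T, cohort = row["customer_id"], row["recency"], row["T"], row["cohort"]
        if T < 2:
            problems.append(f"customer {cid}: T={T} is below 2")
        if not 1 <= recency <= T:
            problems.append(f"customer {cid}: recency={recency} is outside [1, {T}]")
        # All customers in a cohort are expected to share the same T.
        if cohort_T.setdefault(cohort, T) != T:
            problems.append(f"customer {cid}: T={T} differs from other rows in cohort {cohort!r}")
    return problems


rows = [
    {"customer_id": 1, "recency": 8, "T": 8, "cohort": "2025-02"},
    {"customer_id": 2, "recency": 1, "T": 5, "cohort": "2025-04"},
    {"customer_id": 3, "recency": 4, "T": 5, "cohort": "2025-04"},
]
problems = check_sbg_rows(rows)  # empty for these rows
```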

First introduced by Fader & Hardie in [1], with additional expressions and enhancements described in [2] and [3].

Parameters:
data : DataFrame

DataFrame containing the following columns:

  • customer_id: Unique customer identifier.

  • recency: Time period of last contract renewal. It should equal T for active customers.

  • T: Max observed time period in the cohort. All customers in a given cohort share the same value for T.

  • cohort: Customer cohort label.

  • Any columns listed in dropout_covariate_cols when using covariates.

model_config : dict, optional

Dictionary of model prior parameters:

  • alpha: Prior or None (cohort-level). Shape parameter of dropout process. Default is phi * kappa when alpha is not provided directly.

  • beta: Prior or None (cohort-level). Shape parameter of dropout process. Default is (1 - phi) * kappa when beta is not provided directly.

  • phi: Prior for pooling if alpha and beta are not provided directly; default Prior("Uniform", lower=0, upper=1, dims="cohort").

  • kappa: Prior for pooling if alpha and beta are not provided directly; default Prior("Pareto", alpha=1, m=1, dims="cohort").

  • dropout_coefficient: Prior for covariate coefficients; default Prior("Normal", mu=0, sigma=1).

  • dropout_covariate_cols: Sequence[str]. Column names for customer-level, time-invariant covariates; default [].
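
The relationship between the pooled parameters and the Beta shape parameters stated above (alpha = phi * kappa, beta = (1 - phi) * kappa) can be sketched in a few lines. The helper names are hypothetical and the numbers are illustrative; this shows only the arithmetic of the reparameterization, not how the library builds its model graph.

```python
def to_alpha_beta(phi: float, kappa: float) -> tuple[float, float]:
    """Map (phi, kappa) to the Beta shape parameters (alpha, beta)."""
    return phi * kappa, (1.0 - phi) * kappa


def to_phi_kappa(alpha: float, beta: float) -> tuple[float, float]:
    """Inverse map: phi is the mean of Beta(alpha, beta), kappa = alpha + beta."""
    kappa = alpha + beta
    return alpha / kappa, kappa


alpha, beta = to_alpha_beta(phi=0.25, kappa=4.0)
phi, kappa = to_phi_kappa(alpha, beta)
```

Under this parameterization phi is the mean dropout probability of the cohort, and kappa controls how tightly individual dropout probabilities concentrate around that mean.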

sampler_config : dict, optional

Dictionary of sampler parameters. Defaults to None.

Notes

Example:

Required data format:

customer_id  recency  T  cohort   discrete_covariate  continuous_covariate
1            8        8  2025-02  1                   2.172
2            1        5  2025-04  0                   1.234
3            4        5  2025-04  1                   2.345

Example usage:

from pymc_extras.prior import Prior
from pymc_marketing.clv import ShiftedBetaGeoModel

model = ShiftedBetaGeoModel(
    model_config={
        "alpha": Prior("HalfNormal", sigma=10),
        "beta": Prior("HalfStudentT", nu=4, sigma=10),
    },
    sampler_config={
        "draws": 1000,
        "tune": 1000,
        "chains": 4,
        "cores": 4,
        "nuts_kwargs": {"target_accept": 0.95},
    },
)

# `data` is a DataFrame in the required format described above.
# Fit model quickly to large datasets via Maximum a Posteriori
model.fit(data=data, method="map")
model.fit_summary()

# Use 'mcmc' for more informative predictions and reliable performance on smaller datasets
model.fit(data=data, method="mcmc")
model.fit_summary()

# Predict probability customers are still active.
# `active_customers` is assumed to be a DataFrame of customers with recency == T.
expected_alive_probability = model.expected_probability_alive(
    active_customers,
    future_t=0,
)

# Predict retention rate for a specific cohort.
# The label must match a value in the cohort column of the fitted data.
cohort_name = "2025-02"

expected_retention_rate = model.expected_retention_rate(
    future_t=0,
).sel(cohort=cohort_name)

# Predict expected remaining lifetime for all customers with a 5% discount rate
expected_residual_lifetime = model.expected_residual_lifetime(
    discount_rate=0.05,
)

# Predict expected retention elasticity for all customers in a specific cohort
expected_retention_elasticity = model.expected_retention_elasticity(
    discount_rate=0.05,
).sel(cohort=cohort_name)

# Example with customer-level covariates.
# `covariate_data` must contain every column named in dropout_covariate_cols.
model_with_covariates = ShiftedBetaGeoModel(
    model_config={
        "dropout_coefficient": Prior("Normal", mu=0, sigma=2),
        "dropout_covariate_cols": ["covariate1", "covariate2"],
    },
)
model_with_covariates.fit(data=covariate_data, method="demz")

References

[1] Fader, P. S., & Hardie, B. G. (2007). “How to project customer retention.” Journal of Interactive Marketing, 21(1), 76-90.

[2] Fader, P. S., & Hardie, B. G. (2010). “Customer-Base Valuation in a Contractual Setting: The Perils of Ignoring Heterogeneity.” Marketing Science, 29(1), 85-93.

[3] Fader, P., & Hardie, B. (2007). “Incorporating Time-Invariant Covariates into the Pareto/NBD and BG/NBD Models.” Note 019

Methods

ShiftedBetaGeoModel.__init__([data, ...])

Initialize model configuration and sampler configuration for the model.

ShiftedBetaGeoModel.attrs_to_init_kwargs(attrs)

Convert the model configuration and sampler configuration from the attributes to keyword arguments.

ShiftedBetaGeoModel.build_from_idata(idata)

Build the model from the InferenceData object.

ShiftedBetaGeoModel.build_model([data])

Build the model.

ShiftedBetaGeoModel.create_idata_attrs()

Create attributes for the inference data.

ShiftedBetaGeoModel.expected_probability_alive([...])

Compute expected probability of contract renewal for each customer.

ShiftedBetaGeoModel.expected_residual_lifetime([...])

Compute expected residual lifetime of each customer.

ShiftedBetaGeoModel.expected_retention_elasticity([...])

Compute expected retention elasticity for each customer.

ShiftedBetaGeoModel.expected_retention_rate([...])

Compute expected retention rate for each customer.

ShiftedBetaGeoModel.fit([data, method, ...])

Infer model posterior.

ShiftedBetaGeoModel.fit_summary(**kwargs)

Compute the summary of the fit result.

ShiftedBetaGeoModel.graphviz(**kwargs)

Get the graphviz representation of the model.

ShiftedBetaGeoModel.idata_to_init_kwargs(idata)

Create the initialization kwargs from an InferenceData object.

ShiftedBetaGeoModel.load(fname[, check])

Create a ModelBuilder instance from a file.

ShiftedBetaGeoModel.load_from_idata(idata[, ...])

Create a ModelBuilder instance from an InferenceData object.

ShiftedBetaGeoModel.save(fname, **kwargs)

Save the model's inference data to a file.

ShiftedBetaGeoModel.set_idata_attrs([idata])

Set attributes on an InferenceData object.

ShiftedBetaGeoModel.table(**model_table_kwargs)

Get the summary table of the model.

ShiftedBetaGeoModel.thin_fit_result(keep_every)

Return a copy of the model with a thinned fit result.

Attributes

cohort_idx

Cohort indices for each customer.

cohorts

Unique cohort values from data.

default_model_config

Default model configuration.

default_sampler_config

Default sampler configuration.

dropout_covariate_cols

Dropout covariate column names from model_config.

fit_result

Get the posterior fit_result.

id

Generate a unique hash value for the model.

posterior

Access the 'posterior' attribute of the InferenceData object.

posterior_predictive

Access the 'posterior_predictive' attribute of the InferenceData object.

predictions

Access the 'predictions' attribute of the InferenceData object.

prior

Access the 'prior' attribute of the InferenceData object.

prior_predictive

Access the 'prior_predictive' attribute of the InferenceData object.

version

idata

sampler_config

model_config