ShiftedBetaGeoModel
- class pymc_marketing.clv.models.shifted_beta_geo.ShiftedBetaGeoModel(data=None, *, model_config=None, sampler_config=None)
Shifted Beta Geometric (sBG) model for customers renewing contracts over discrete time periods.
The sBG model has the following assumptions:
- Dropout probabilities for each cohort are Beta-distributed with hyperparameters alpha and beta.
- Cohort retention rates change over time due to customer heterogeneity.
- Customers in the same cohort began their contract in the same time period.
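The assumptions above lead to the closed-form survival and retention expressions given by Fader & Hardie in [1]. The sketch below is a minimal illustration using only the standard library. The function names are hypothetical and are not part of pymc_marketing, and alpha and beta are treated as fixed numbers rather than posterior draws.

```python
import math


def log_beta(a, b):
    """Natural log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def sbg_survival(t, alpha, beta):
    """S(t) = B(alpha, beta + t) / B(alpha, beta): probability of still being active after t periods."""
    return math.exp(log_beta(alpha, beta + t) - log_beta(alpha, beta))


def sbg_retention_rate(t, alpha, beta):
    """r(t) = S(t) / S(t - 1) = (beta + t - 1) / (alpha + beta + t - 1)."""
    return (beta + t - 1) / (alpha + beta + t - 1)


# Illustrative hyperparameters (assumed values, not fitted estimates).
alpha, beta = 1.0, 2.0
for t in range(1, 6):
    print(t, round(sbg_survival(t, alpha, beta), 4), round(sbg_retention_rate(t, alpha, beta), 4))
```

The retention rate rises with t even though each customer's dropout probability is constant: customers with high dropout probability leave early, so the surviving cohort is increasingly made up of loyal customers.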
This model requires data to be summarized by recency, T, and cohort for each customer. Modeling assumptions require 1 <= recency <= T, and T >= 2.
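The constraints on recency, T, and cohort can be checked before fitting. The helper below is a hypothetical sketch using pandas. It is not part of pymc_marketing, and the checks the library performs internally may differ.

```python
import pandas as pd


def check_sbg_data(df: pd.DataFrame) -> None:
    """Raise ValueError if df violates the sBG data requirements described above."""
    required = {"customer_id", "recency", "T", "cohort"}
    missing = required - set(df.columns)
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    if not df["customer_id"].is_unique:
        raise ValueError("customer_id must be unique")
    if (df["T"] < 2).any():
        raise ValueError("T must be >= 2")
    if ((df["recency"] < 1) | (df["recency"] > df["T"])).any():
        raise ValueError("recency must satisfy 1 <= recency <= T")
    # All customers in a cohort share the same T.
    if (df.groupby("cohort")["T"].nunique() > 1).any():
        raise ValueError("all customers in a cohort must share the same T")


# Small illustrative frame (made-up values).
example = pd.DataFrame(
    {"customer_id": [10, 11], "recency": [3, 1], "T": [3, 3], "cohort": ["A", "A"]}
)
check_sbg_data(example)  # passes silently
```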
First introduced by Fader & Hardie in [1], with additional expressions and enhancements described in [2] and [3].
- Parameters:
- data
DataFrame. Contains the following columns:
  - customer_id: Unique customer identifier.
  - recency: Time period of last contract renewal. It should equal T for active customers.
  - T: Max observed time period in the cohort. All customers in a given cohort share the same value for T.
  - cohort: Customer cohort label.
  - Any columns listed in dropout_covariate_cols when using covariates.
- model_config
dict, optional. Dictionary of model prior parameters:
  - alpha: Prior or None (cohort-level). Shape parameter of dropout process. Default is phi * kappa when alpha is not provided directly.
  - beta: Prior or None (cohort-level). Shape parameter of dropout process. Default is (1 - phi) * kappa when beta is not provided directly.
  - phi: Prior for pooling if alpha and beta are not provided directly; default Prior("Uniform", lower=0, upper=1, dims="cohort").
  - kappa: Prior for pooling if alpha and beta are not provided directly; default Prior("Pareto", alpha=1, m=1, dims="cohort").
  - dropout_coefficient: Prior for covariate coefficients; default Prior("Normal", mu=0, sigma=1).
  - dropout_covariate_cols: Sequence[str]. Column names for customer-level, time-invariant covariates; default [].
- sampler_config
dict, optional. Dictionary of sampler parameters. Defaults to None.
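The default parameterization maps phi and kappa to alpha = phi * kappa and beta = (1 - phi) * kappa. Under that mapping, phi is the mean of the Beta dropout distribution, alpha / (alpha + beta), and kappa is the sum alpha + beta. The sketch below draws from the default priors listed above using only the standard library. It is an illustration of the mapping, not the library's sampling code, and the function name is hypothetical.

```python
import math
import random


def draw_alpha_beta(rng):
    """Draw (phi, kappa) from the default priors and map them to (alpha, beta)."""
    phi = rng.uniform(0.0, 1.0)     # Uniform(lower=0, upper=1)
    kappa = rng.paretovariate(1.0)  # Pareto with shape 1 and minimum value 1
    alpha = phi * kappa
    beta = (1.0 - phi) * kappa
    return phi, kappa, alpha, beta


rng = random.Random(0)  # fixed seed so the draws are reproducible
for _ in range(3):
    phi, kappa, alpha, beta = draw_alpha_beta(rng)
    # The Beta mean alpha / (alpha + beta) recovers phi, and alpha + beta recovers kappa.
    print(round(phi, 3), round(kappa, 3), round(alpha, 3), round(beta, 3))
```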
Notes
Example:
Required data format:

customer_id  recency  T  cohort   discrete_covariate  continuous_covariate
1            8        8  2025-02  1                   2.172
2            1        5  2025-04  0                   1.234
3            4        5  2025-04  1                   2.345
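As a minimal sketch, the rows shown in the required data format can be built with pandas as follows. The variable names data and active_customers are illustrative. Selecting active customers by recency == T follows the column description for recency.

```python
import pandas as pd

# The three example rows from the required data format.
data = pd.DataFrame(
    {
        "customer_id": [1, 2, 3],
        "recency": [8, 1, 4],
        "T": [8, 5, 5],
        "cohort": ["2025-02", "2025-04", "2025-04"],
        "discrete_covariate": [1, 0, 1],
        "continuous_covariate": [2.172, 1.234, 2.345],
    }
)

# Active customers are those whose last renewal is the latest observed period.
active_customers = data[data["recency"] == data["T"]]

# Each cohort should map to exactly one value of T.
t_per_cohort = data.groupby("cohort")["T"].nunique()
```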
Example usage:

    from pymc_extras.prior import Prior
    from pymc_marketing.clv import ShiftedBetaGeoModel

    # `data`, `active_customers`, and `covariate_data` are DataFrames in the
    # required data format; `active_customers` holds rows with recency == T.
    model = ShiftedBetaGeoModel(
        model_config={
            "alpha": Prior("HalfNormal", sigma=10),
            "beta": Prior("HalfStudentT", nu=4, sigma=10),
        },
        sampler_config={
            "draws": 1000,
            "tune": 1000,
            "chains": 4,
            "cores": 4,
            "nuts_kwargs": {"target_accept": 0.95},
        },
    )

    # Fit model quickly to large datasets via Maximum a Posteriori
    model.fit(data=data, method="map")
    model.fit_summary()

    # Use 'mcmc' for more informative predictions and reliable performance on smaller datasets
    model.fit(data=data, method="mcmc")
    model.fit_summary()

    # Predict probability customers are still active
    expected_alive_probability = model.expected_probability_alive(
        active_customers,
        future_t=0,
    )

    # Predict retention rate for a specific cohort
    # (the name must match a label in the cohort column)
    cohort_name = "2025-02"
    expected_retention_rate = model.expected_retention_rate(
        future_t=0,
    ).sel(cohort=cohort_name)

    # Predict expected remaining lifetime for all customers with a 5% discount rate
    expected_residual_lifetime = model.expected_residual_lifetime(
        discount_rate=0.05,
    )

    # Predict expected retention elasticity for all customers in a specific cohort
    expected_retention_elasticity = model.expected_retention_elasticity(
        discount_rate=0.05,
    ).sel(cohort=cohort_name)

    # Example with customer-level covariates
    model_with_covariates = ShiftedBetaGeoModel(
        model_config={
            "dropout_coefficient": Prior("Normal", mu=0, sigma=2),
            "dropout_covariate_cols": ["covariate1", "covariate2"],
        },
    )
    model_with_covariates.fit(data=covariate_data, method="demz")
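To give a sense of what a discounted residual lifetime measures, the sketch below evaluates it as a truncated sum over sBG survival probabilities for fixed alpha and beta. It assumes the definition DERL = sum over t >= n of S(t) / S(n - 1) * (1 / (1 + d))^(t - n) for a customer in contract period n. The function names, the n argument, and the truncation horizon are hypothetical. The library's expected_residual_lifetime may use different indexing and a closed-form expression.

```python
import math


def log_beta(a, b):
    """Natural log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def sbg_survival(t, alpha, beta):
    """S(t) = B(alpha, beta + t) / B(alpha, beta)."""
    return math.exp(log_beta(alpha, beta + t) - log_beta(alpha, beta))


def discounted_residual_lifetime(alpha, beta, discount_rate, n=1, horizon=2000):
    """Truncated sum of S(t) / S(n - 1) * v**(t - n) for t = n..horizon, with v = 1 / (1 + d).

    Assumes discount_rate > 0 so that the truncated tail is negligible.
    """
    v = 1.0 / (1.0 + discount_rate)  # per-period discount factor
    base = sbg_survival(n - 1, alpha, beta)
    total = 0.0
    for t in range(n, horizon + 1):
        total += sbg_survival(t, alpha, beta) / base * v ** (t - n)
    return total


# Illustrative hyperparameters (assumed values, not fitted estimates).
print(discounted_residual_lifetime(alpha=1.0, beta=2.0, discount_rate=0.05))
```

A higher discount rate shrinks the result, because renewals far in the future contribute less to the sum.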
References
[1] Fader, P. S., & Hardie, B. G. (2007). “How to project customer retention.” Journal of Interactive Marketing, 21(1), 76-90.
Methods
ShiftedBetaGeoModel.__init__([data, ...]): Initialize model configuration and sampler configuration for the model.
Convert the model configuration and sampler configuration from the attributes to keyword arguments.
Build the model from the InferenceData object.
ShiftedBetaGeoModel.build_model([data]): Build the model.
Create attributes for the inference data.
Compute expected probability of contract renewal for each customer.
Compute expected residual lifetime of each customer.
Compute expected retention elasticity for each customer.
Compute expected retention rate for each customer.
ShiftedBetaGeoModel.fit([data, method, ...]): Infer model posterior.
ShiftedBetaGeoModel.fit_summary(**kwargs): Compute the summary of the fit result.
ShiftedBetaGeoModel.graphviz(**kwargs): Get the graphviz representation of the model.
Create the initialization kwargs from an InferenceData object.
ShiftedBetaGeoModel.load(fname[, check]): Create a ModelBuilder instance from a file.
ShiftedBetaGeoModel.load_from_idata(idata[, ...]): Create a ModelBuilder instance from an InferenceData object.
ShiftedBetaGeoModel.save(fname, **kwargs): Save the model's inference data to a file.
ShiftedBetaGeoModel.set_idata_attrs([idata]): Set attributes on an InferenceData object.
ShiftedBetaGeoModel.table(**model_table_kwargs): Get the summary table of the model.
ShiftedBetaGeoModel.thin_fit_result(keep_every): Return a copy of the model with a thinned fit result.
Attributes
cohort_idx: Cohort indices for each customer.
cohorts: Unique cohort values from data.
default_model_config: Default model configuration.
default_sampler_config: Default sampler configuration.
dropout_covariate_cols: Dropout covariate column names from model_config.
fit_result: Get the posterior fit_result.
id: Generate a unique hash value for the model.
posterior: Access the 'posterior' attribute of the InferenceData object.
posterior_predictive: Access the 'posterior_predictive' attribute of the InferenceData object.
predictions: Access the 'predictions' attribute of the InferenceData object.
prior: Access the 'prior' attribute of the InferenceData object.
prior_predictive: Access the 'prior_predictive' attribute of the InferenceData object.
version
idata
sampler_config
model_config