ModifiedBetaGeoModel
- class pymc_marketing.clv.models.modified_beta_geo.ModifiedBetaGeoModel(data=None, *, model_config=None, sampler_config=None)
Modified Beta-Geometric Negative Binomial Distribution (MBG/NBD) model for a non-contractual customer population across continuous time.
Based on modifications to the BG/NBD model proposed by Batislam et al. in [1] and by Wagner & Hoppe in [2], which remove the BG/NBD assumption that all non-repeat customers are still active.
The MBG/NBD model assumes that dropout probabilities are Beta distributed across the customer population and that, while a customer is still active, purchases occur at a customer-specific rate that is Gamma distributed across the population.
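The assumptions above can be illustrated with a small simulation. This is a minimal sketch, not the library's implementation: it assumes the usual MBG/NBD formulation in which inter-purchase times are exponential at a customer-specific rate drawn from a Gamma distribution with shape `r` and rate `alpha`, the dropout probability is drawn from Beta(`a`, `b`), and a dropout check happens at time zero and again after every purchase. The names `simulate_customer` and `summarize` are hypothetical helpers.

```python
import random


def simulate_customer(r, alpha, a, b, T, rng):
    """Simulate one customer's repeat-purchase times on (0, T]."""
    # Customer-specific purchase rate. random.gammavariate takes a scale,
    # so a Gamma with rate alpha uses scale 1 / alpha (an assumption here).
    lam = rng.gammavariate(r, 1.0 / alpha)
    # Customer-specific dropout probability.
    p = rng.betavariate(a, b)

    purchases = []
    t = 0.0
    # The "modified" part: the dropout check also happens at time zero,
    # so a customer may become inactive before any repeat purchase.
    while rng.random() >= p:
        t += rng.expovariate(lam)
        if t > T:
            break
        purchases.append(t)
    return purchases


def summarize(purchases, T):
    """Reduce purchase times to the frequency / recency / T summary."""
    recency = purchases[-1] if purchases else 0.0
    return {"frequency": len(purchases), "recency": recency, "T": T}


rng = random.Random(42)
summaries = [
    summarize(simulate_customer(2.0, 4.0, 1.0, 3.0, 52.0, rng), 52.0)
    for _ in range(1000)
]
```

Every simulated summary satisfies T >= recency by construction, and a share of customers end up with a frequency of zero because they dropped out at time zero or made no purchase before T.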
This model requires data to be summarized by recency, frequency, and T for each customer, using clv.utils.rfm_summary() or equivalent. Modeling assumptions require T >= recency.
Predictive methods have been adapted from the ModifiedBetaGeoFitter class in the legacy lifetimes library (see CamDavidsonPilon/lifetimes).
- Parameters:
- data
DataFrame DataFrame containing the following columns:
  - customer_id: Unique customer identifier
  - frequency: Number of repeat purchases
  - recency: Time between the first and the last purchase
  - T: Time between the first purchase and the end of the observation period
- model_config
dict, optional Dictionary of model prior parameters:
  - alpha: Scale parameter for time between purchases; defaults to Prior("HalfFlat")
  - r: Shape parameter for time between purchases; defaults to Prior("HalfFlat")
  - a: Shape parameter of dropout process; defaults to phi_dropout * kappa_dropout
  - b: Shape parameter of dropout process; defaults to (1 - phi_dropout) * kappa_dropout
  - phi_dropout: Nested prior for a and b priors; defaults to Prior("Uniform", lower=0, upper=1)
  - kappa_dropout: Nested prior for a and b priors; defaults to Prior("Pareto", alpha=1, m=1)
  - purchase_covariates: Coefficients for purchase rate covariates; defaults to Normal(0, 1)
  - dropout_covariates: Coefficients for dropout covariates; defaults to Normal.dist(0, 1)
  - purchase_covariate_cols: List containing column names of covariates for customer purchase rates.
  - dropout_covariate_cols: List containing column names of covariates for customer dropouts.
- sampler_config
dict, optional Dictionary of sampler parameters. Defaults to None.
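The nested `phi_dropout` and `kappa_dropout` priors can be read as a mean and concentration parameterization of the Beta dropout distribution. The sketch below shows the arithmetic only. It assumes `a = phi_dropout * kappa_dropout`, mirroring the documented default for `b`, and `dropout_shapes` is a hypothetical helper rather than part of pymc_marketing.

```python
def dropout_shapes(phi_dropout, kappa_dropout):
    """Map a (mean, concentration) pair to Beta shape parameters (a, b)."""
    if not 0.0 < phi_dropout < 1.0:
        raise ValueError("phi_dropout must lie strictly between 0 and 1")
    if kappa_dropout <= 0.0:
        raise ValueError("kappa_dropout must be positive")
    a = phi_dropout * kappa_dropout          # assumed default for a
    b = (1.0 - phi_dropout) * kappa_dropout  # documented default for b
    return a, b


a, b = dropout_shapes(0.25, 8.0)
mean_dropout = a / (a + b)  # recovers phi_dropout
concentration = a + b       # recovers kappa_dropout
```

Under this reading, `phi_dropout` sets the average dropout probability in the population, while `kappa_dropout` controls how tightly individual customers cluster around that average.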
References
[1] Batislam, E.P., M. Denizel, A. Filiztekin (2007), “Empirical validation and comparison of models for customer base analysis.” International Journal of Research in Marketing, 24 (3), 201-209. https://works.bepress.com/meltem-denizel/2/download/
[2] Wagner, U. and Hoppe D. (2008), “Erratum on the MBG/NBD Model,” International Journal of Research in Marketing, 25 (3), 225-226. https://www.researchgate.net/profile/Udo-Wagner/publication/274894157_Customer_Base_Analysis_The_Case_for_a_Central_Variant_of_the_BetageometricBND_Model/links/55c3728608aeca747d5f6658/Customer-Base-Analysis-The-Case-for-a-Central-Variant-of-the-Betageometric-BND-Model.pdf
Examples
    import pandas as pd
    from pymc_extras.prior import Prior

    from pymc_marketing.clv import ModifiedBetaGeoModel, rfm_summary

    # customer identifiers and purchase datetimes
    # are all that's needed to start modeling
    data = [
        [1, "2024-01-01"],
        [1, "2024-02-06"],
        [2, "2024-01-01"],
        [3, "2024-01-02"],
        [3, "2024-01-05"],
        [4, "2024-01-16"],
        [4, "2024-02-05"],
        [5, "2024-01-17"],
        [5, "2024-01-18"],
        [5, "2024-01-19"],
    ]
    raw_data = pd.DataFrame(data, columns=["id", "date"])

    # preprocess data
    rfm_df = rfm_summary(raw_data, "id", "date")

    # model_config and sampler_config are optional
    model = ModifiedBetaGeoModel(
        model_config={
            "r": Prior("HalfFlat"),
            "alpha": Prior("HalfFlat"),
            "a": Prior("HalfFlat"),
            "b": Prior("HalfFlat"),
        },
        sampler_config={
            "draws": 1000,
            "tune": 1000,
            "chains": 2,
            "cores": 2,
        },
    )

    # The default 'mcmc' method provides informative predictions
    # and reliable performance on small datasets
    model.fit(data=rfm_df)
    print(model.fit_summary())

    # Maximum a Posteriori can quickly fit a model to large datasets,
    # but will give limited insights into predictive uncertainty.
    model.fit(data=rfm_df, method="map")
    print(model.fit_summary())

    # Predict number of purchases for current customers
    # over the next 10 time periods
    expected_purchases = model.expected_purchases(future_t=10)

    # Predict probability customers are still active
    probability_alive = model.expected_probability_alive()

    # Predict number of purchases for a new customer over 't' time periods
    expected_purchases_new_customer = model.expected_purchases_new_customer(t=10)
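To make the summary columns concrete, the sketch below computes frequency, recency, and T by hand for the same transactions. It is a hand-rolled approximation and not the library's `rfm_summary` implementation: it assumes daily time units, counts several purchases on one day as a single purchase, and takes the last observed transaction date as the end of the observation period. The function name `manual_rfm` is hypothetical.

```python
import pandas as pd


def manual_rfm(transactions, id_col, date_col):
    """Summarize transactions into frequency, recency, and T (in days)."""
    df = transactions[[id_col, date_col]].copy()
    # Normalize to calendar days so repeated purchases on one day count once.
    df[date_col] = pd.to_datetime(df[date_col]).dt.normalize()
    df = df.drop_duplicates()
    period_end = df[date_col].max()  # assumed end of the observation period

    grouped = df.groupby(id_col)[date_col].agg(first="min", last="max", n="count")
    return pd.DataFrame(
        {
            "customer_id": grouped.index.to_numpy(),
            # Repeat purchases only: the first purchase is not counted.
            "frequency": (grouped["n"] - 1).to_numpy(),
            "recency": (grouped["last"] - grouped["first"]).dt.days.to_numpy(),
            "T": (period_end - grouped["first"]).dt.days.to_numpy(),
        }
    )


raw = pd.DataFrame(
    [
        [1, "2024-01-01"], [1, "2024-02-06"], [2, "2024-01-01"],
        [3, "2024-01-02"], [3, "2024-01-05"], [4, "2024-01-16"],
        [4, "2024-02-05"], [5, "2024-01-17"], [5, "2024-01-18"],
        [5, "2024-01-19"],
    ],
    columns=["id", "date"],
)
rfm = manual_rfm(raw, "id", "date")

# The model requires T >= recency for every customer.
assert (rfm["T"] >= rfm["recency"]).all()
```

Customer 2 bought only once, so both frequency and recency are zero. This is exactly the kind of non-repeat customer that the MBG/NBD model no longer assumes to be active.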
Methods
ModifiedBetaGeoModel.__init__([data, ...]): Initialize model configuration and sampler configuration for the model.
ModifiedBetaGeoModel.attrs_to_init_kwargs(...): Convert the model configuration and sampler configuration from the attributes to keyword arguments.
ModifiedBetaGeoModel.build_from_idata(...): Build the model from the InferenceData object.
ModifiedBetaGeoModel.build_model([data]): Build the model.
ModifiedBetaGeoModel.create_idata_attrs(...): Create attributes for the inference data.
ModifiedBetaGeoModel.distribution_new_customer(...): Compute posterior predictive samples of dropout, purchase rate and frequency/recency of new customers.
ModifiedBetaGeoModel.distribution_new_customer_dropout([...]): Sample the Beta distribution for the population-level dropout rate.
ModifiedBetaGeoModel.distribution_new_customer_purchase_rate([...]): Sample the Gamma distribution for the population-level purchase rate.
ModifiedBetaGeoModel.distribution_new_customer_recency_frequency([...]): BG/NBD process representing purchases across the customer population.
ModifiedBetaGeoModel.expected_probability_alive(...): Compute the probability a customer with history frequency, recency, and T is currently active.
ModifiedBetaGeoModel.expected_probability_no_purchase(...): Compute the probability a customer with frequency, recency, and T will have 0 purchases in the period (T, T+t].
ModifiedBetaGeoModel.expected_purchases(...): Compute the expected number of future purchases across future_t time periods given recency, frequency, and T for each customer.
ModifiedBetaGeoModel.expected_purchases_new_customer(...): Compute the expected number of purchases for a new customer across t time periods.
ModifiedBetaGeoModel.fit([data, method, ...]): Infer model posterior.
ModifiedBetaGeoModel.fit_summary(**kwargs): Compute the summary of the fit result.
ModifiedBetaGeoModel.graphviz(**kwargs): Get the graphviz representation of the model.
ModifiedBetaGeoModel.idata_to_init_kwargs(...): Create the initialization kwargs from an InferenceData object.
ModifiedBetaGeoModel.load(fname[, check]): Create a ModelBuilder instance from a file.
ModifiedBetaGeoModel.load_from_idata(...): Create a ModelBuilder instance from an InferenceData object.
ModifiedBetaGeoModel.save(fname, **kwargs): Save the model's inference data to a file.
ModifiedBetaGeoModel.set_idata_attrs([idata]): Set attributes on an InferenceData object.
ModifiedBetaGeoModel.table(**model_table_kwargs): Get the summary table of the model.
ModifiedBetaGeoModel.thin_fit_result(keep_every): Return a copy of the model with a thinned fit result.
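As an illustration of what the probability-alive method estimates, the sketch below evaluates the closed form that follows from the MBG/NBD likelihood for a single set of point estimates. It is not the library's code: the model itself works with posterior draws and may differ in detail, and `probability_alive` is a hypothetical standalone function taking `r`, `alpha`, `a`, and `b` as plain floats.

```python
def probability_alive(frequency, recency, T, r, alpha, a, b):
    """Closed-form P(alive | frequency, recency, T) for point estimates."""
    if recency > T:
        raise ValueError("T must be greater than or equal to recency")
    # How much more likely the observed gap since the last purchase is
    # under "dropped out" than under "still active".
    ratio = (alpha + T) / (alpha + recency)
    return 1.0 / (1.0 + (a / (b + frequency)) * ratio ** (r + frequency))


params = {"r": 0.5, "alpha": 10.0, "a": 1.0, "b": 3.0}  # illustrative values
p_one_time_buyer = probability_alive(0, 0.0, 30.0, **params)
p_recent_repeat_buyer = probability_alive(4, 28.0, 30.0, **params)
```

Unlike BG/NBD, the result for a customer with no repeat purchases is strictly below one, which reflects the removal of the assumption that all non-repeat customers are still active.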
Attributes
covariate_cols: All covariate column names.
default_model_config: Default model configuration.
default_sampler_config: Default sampler configuration.
dropout_covariate_cols: Dropout covariate column names from model_config.
fit_result: Get the posterior fit_result.
id: Generate a unique hash value for the model.
posterior: Access the 'posterior' attribute of the InferenceData object.
posterior_predictive: Access the 'posterior_predictive' attribute of the InferenceData object.
predictions: Access the 'predictions' attribute of the InferenceData object.
prior: Access the 'prior' attribute of the InferenceData object.
prior_predictive: Access the 'prior_predictive' attribute of the InferenceData object.
purchase_covariate_cols: Purchase covariate column names from model_config.
version
idata
sampler_config
model_config