ModifiedBetaGeoModel
- class pymc_marketing.clv.models.modified_beta_geo.ModifiedBetaGeoModel(data, model_config=None, sampler_config=None)
Modified Beta-Geometric Negative Binomial Distribution (MBG/NBD) model for a non-contractual customer population across continuous time.
Based on modifications to the BG/NBD model proposed by Batislam et al. in [1] and Wagner & Hoppe in [2], which remove the BG/NBD assumption that all non-repeat customers are still active.
The MBG/NBD model assumes dropout probabilities for the customer population are Beta distributed, and time between transactions follows a Gamma distribution while the customer is still active.
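The assumptions above can be illustrated with a small simulation. This is a minimal sketch, not the library's implementation: it assumes the usual BG/NBD-style setup in which each customer's purchase rate is drawn from a Gamma distribution (shape r, rate alpha), inter-purchase times are exponential given that rate, and each customer's dropout probability is drawn from a Beta(a, b) distribution. The function name `simulate_customer` and the parameter values are hypothetical.

```python
import random


def simulate_customer(r, alpha, a, b, T, rng):
    """Simulate repeat purchases for one customer over a window of length T.

    Returns (frequency, recency, T). Hypothetical helper for illustration only.
    """
    # Assumption: purchase rate ~ Gamma(shape=r, rate=alpha).
    # random.gammavariate takes a scale, so pass 1 / alpha.
    purchase_rate = rng.gammavariate(r, 1.0 / alpha)
    # Dropout probability ~ Beta(a, b).
    dropout_prob = rng.betavariate(a, b)

    frequency, recency, t = 0, 0.0, 0.0
    # MBG/NBD modification: the customer may already drop out at time zero,
    # right after the first purchase, so non-repeat customers need not be active.
    alive = rng.random() >= dropout_prob
    while alive:
        t += rng.expovariate(purchase_rate)
        if t > T:
            break  # next purchase would fall outside the observation window
        frequency += 1
        recency = t
        # After every repeat purchase the customer may drop out again.
        alive = rng.random() >= dropout_prob
    return frequency, recency, T


rng = random.Random(42)
customers = [
    simulate_customer(r=1.0, alpha=10.0, a=1.0, b=3.0, T=52.0, rng=rng)
    for _ in range(1000)
]
```

Each simulated tuple has the same shape as one row of the recency/frequency/T summary the model expects, which makes a simulation like this a convenient way to sanity-check a fitted model against known parameters.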
This model requires data to be summarized by recency, frequency, and T for each customer, using clv.utils.rfm_summary() or equivalent. Modeling assumptions require T >= recency.
Predictive methods have been adapted from the ModifiedBetaGeoFitter class in the legacy lifetimes library (see CamDavidsonPilon/lifetimes).
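To make the recency, frequency, and T summary concrete, here is a minimal standard-library sketch of an "equivalent" summarization. It is not the code behind rfm_summary(); the helper name `summarize_rfm` is hypothetical, time is assumed to be measured in days, and multiple purchases on the same day are assumed to count once.

```python
from datetime import date


def summarize_rfm(transactions, observation_end):
    """Summarize (customer_id, purchase_date) pairs into frequency, recency, T (days).

    Hypothetical helper for illustration only.
    """
    purchase_days = {}
    for customer_id, day in transactions:
        # Assumption: same-day purchases collapse into a single purchase day.
        purchase_days.setdefault(customer_id, set()).add(day)

    summary = {}
    for customer_id, days in purchase_days.items():
        first, last = min(days), max(days)
        summary[customer_id] = {
            "frequency": len(days) - 1,  # repeat purchases only
            "recency": (last - first).days,  # first to last purchase
            "T": (observation_end - first).days,  # first purchase to period end
        }
    return summary


transactions = [
    (1, date(2024, 1, 1)), (1, date(2024, 2, 6)),
    (2, date(2024, 1, 1)),
    (3, date(2024, 1, 2)), (3, date(2024, 1, 5)),
]
rfm = summarize_rfm(transactions, observation_end=date(2024, 2, 6))

# The modeling assumption T >= recency holds by construction whenever the
# observation period ends on or after the last recorded purchase.
assert all(row["T"] >= row["recency"] for row in rfm.values())
```

A customer with a single purchase ends up with frequency 0 and recency 0, which is exactly the non-repeat case the MBG/NBD modification targets.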
- Parameters:
- data : DataFrame
  DataFrame containing the following columns:
  customer_id: Unique customer identifier
  frequency: Number of repeat purchases
  recency: Time between the first and the last purchase
  T: Time between the first purchase and the end of the observation period
- model_config : dict, optional
  Dictionary of model prior parameters:
  alpha: Scale parameter for time between purchases; defaults to Prior("HalfFlat")
  r: Shape parameter for time between purchases; defaults to Prior("HalfFlat")
  a: Shape parameter of dropout process; defaults to phi_dropout * kappa_dropout
  b: Shape parameter of dropout process; defaults to (1 - phi_dropout) * kappa_dropout
  phi_dropout: Nested prior for a and b priors; defaults to Prior("Uniform", lower=0, upper=1)
  kappa_dropout: Nested prior for a and b priors; defaults to Prior("Pareto", alpha=1, m=1)
- sampler_config : dict, optional
  Dictionary of sampler parameters. Defaults to None.
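The nested dropout priors can be read as a mean/concentration parameterization of the Beta distribution. The sketch below assumes the defaults combine as a = phi_dropout * kappa_dropout and b = (1 - phi_dropout) * kappa_dropout; the helper name `dropout_shapes` is hypothetical and the numbers are arbitrary.

```python
def dropout_shapes(phi_dropout, kappa_dropout):
    """Map the nested (phi, kappa) values to Beta shape parameters (a, b).

    Hypothetical helper; assumes a = phi * kappa and b = (1 - phi) * kappa.
    """
    a = phi_dropout * kappa_dropout
    b = (1.0 - phi_dropout) * kappa_dropout
    return a, b


a, b = dropout_shapes(phi_dropout=0.25, kappa_dropout=8.0)

# The mean of Beta(a, b) is a / (a + b), which recovers phi_dropout,
# while a + b recovers kappa_dropout (the concentration).
mean_dropout = a / (a + b)
concentration = a + b
```

Under this reading, phi_dropout sets the population-average dropout probability and kappa_dropout controls how tightly individual customers cluster around it.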
References
[1] Batislam, E.P., M. Denizel, A. Filiztekin (2007), “Empirical validation and comparison of models for customer base analysis.” International Journal of Research in Marketing, 24 (3), 201-209. https://works.bepress.com/meltem-denizel/2/download/
[2] Wagner, U. and Hoppe, D. (2008), “Erratum on the MBG/NBD Model,” International Journal of Research in Marketing, 25 (3), 225-226.
Examples
import pandas as pd

from pymc_marketing.prior import Prior
from pymc_marketing.clv import ModifiedBetaGeoModel, rfm_summary

# customer identifiers and purchase datetimes
# are all that's needed to start modeling
data = [
    [1, "2024-01-01"],
    [1, "2024-02-06"],
    [2, "2024-01-01"],
    [3, "2024-01-02"],
    [3, "2024-01-05"],
    [4, "2024-01-16"],
    [4, "2024-02-05"],
    [5, "2024-01-17"],
    [5, "2024-01-18"],
    [5, "2024-01-19"],
]
raw_data = pd.DataFrame(data, columns=["id", "date"])

# preprocess data into recency, frequency, and T per customer
rfm_df = rfm_summary(raw_data, "id", "date")

# model_config and sampler_config are optional
model = ModifiedBetaGeoModel(
    data=rfm_df,
    model_config={
        "r": Prior("HalfFlat"),
        "alpha": Prior("HalfFlat"),
        "a": Prior("HalfFlat"),
        "b": Prior("HalfFlat"),
    },
    sampler_config={
        "draws": 1000,
        "tune": 1000,
        "chains": 2,
        "cores": 2,
    },
)

# The default 'mcmc' fit_method provides informative predictions
# and reliable performance on small datasets
model.fit()
print(model.fit_summary())

# Maximum a Posteriori can quickly fit a model to large datasets,
# but will give limited insights into predictive uncertainty.
model.fit(fit_method="map")
print(model.fit_summary())

# Predict number of purchases for current customers
# over the next 10 time periods
expected_purchases = model.expected_purchases(future_t=10)

# Predict probability customers are still active
probability_alive = model.expected_probability_alive()

# Predict number of purchases for a new customer over 't' time periods
expected_purchases_new_customer = model.expected_purchases_new_customer(t=10)
Methods
ModifiedBetaGeoModel.__init__(data[, ...]): Initialize model configuration and sampler configuration for the model.
Convert the model configuration and sampler configuration from the attributes to keyword arguments.
Build model from the InferenceData object.
Build the model.
Create the fit_data group based on the input data.
Create attributes for the inference data.
Compute posterior predictive samples of dropout, purchase rate and frequency/recency of new customers.
ModifiedBetaGeoModel.distribution_new_customer_dropout([...]): Sample the Beta distribution for the population-level dropout rate.
ModifiedBetaGeoModel.distribution_new_customer_purchase_rate([...]): Sample the Gamma distribution for the population-level purchase rate.
ModifiedBetaGeoModel.distribution_new_customer_recency_frequency([...]): BG/NBD process representing purchases across the customer population.
Compute the expected number of purchases for a customer.
ModifiedBetaGeoModel.expected_num_purchases_new_customer(...): Compute the expected number of purchases for a new customer.
ModifiedBetaGeoModel.expected_probability_alive(...): Compute the probability a customer with history frequency, recency, and T is currently active.
Probability a customer with frequency, recency, and T will have 0 purchases in the period (T, T+t].
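ModifiedBetaGeoModel.expected_purchases(...): Compute the expected number of future purchases across future_t time periods given recency, frequency, and T for each customer.
ModifiedBetaGeoModel.expected_purchases_new_customer(...): Compute the expected number of purchases for a new customer across t time periods.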
ModifiedBetaGeoModel.fit([method, fit_method]): Infer model posterior.
ModifiedBetaGeoModel.fit_summary(**kwargs): Compute the summary of the fit result.
ModifiedBetaGeoModel.graphviz(**kwargs): Get the graphviz representation of the model.
ModifiedBetaGeoModel.load(fname): Create a ModelBuilder instance from a file.
Create a ModelBuilder instance from an InferenceData object.
Perform transformation on the model after sampling.
ModifiedBetaGeoModel.predict([X, extend_idata]): Use a model to predict on unseen data and return point prediction of all the samples.
ModifiedBetaGeoModel.predict_posterior([X, ...]): Generate posterior predictive samples on unseen data.
ModifiedBetaGeoModel.predict_proba([X, ...]): Alias for predict_posterior, for consistency with scikit-learn probabilistic estimators.
Sample from the model's posterior predictive distribution.
Sample from the model's prior predictive distribution.
ModifiedBetaGeoModel.save(fname): Save the model's inference data to a file.
ModifiedBetaGeoModel.set_idata_attrs([idata]): Set attributes on an InferenceData object.
ModifiedBetaGeoModel.thin_fit_result(keep_every): Return a copy of the model with a thinned fit result.
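As an illustration of the probability-alive computation listed among the methods above, the sketch below evaluates a closed-form expression for point estimates of r, alpha, a, and b. It assumes the formula used by the legacy lifetimes ModifiedBetaGeoFitter, from which the predictive methods are said to be adapted; the model's own method operates on fitted posteriors and may differ in detail. The function name `probability_alive` and the parameter values are hypothetical.

```python
def probability_alive(frequency, recency, T, r, alpha, a, b):
    """Probability that a customer is still active, given point parameter values.

    Assumed MBG/NBD closed form (as in the legacy lifetimes library):
        1 / (1 + a / (b + x) * ((alpha + T) / (alpha + t_x)) ** (r + x))
    where x is frequency and t_x is recency.
    """
    ratio = (alpha + T) / (alpha + recency)
    return 1.0 / (1.0 + (a / (b + frequency)) * ratio ** (r + frequency))


# Hypothetical parameter values, for illustration only.
params = dict(r=1.0, alpha=1.0, a=1.0, b=1.0)

# A non-repeat customer observed right after the first purchase: the
# probability equals b / (a + b), not 1 as under the BG/NBD assumption.
p_new = probability_alive(frequency=0, recency=0.0, T=0.0, **params)

# The same customer after 10 inactive periods is less likely to be active.
p_idle = probability_alive(frequency=0, recency=0.0, T=10.0, **params)
```

The non-repeat case is where the MBG/NBD modification matters most: under this expression, a customer with no repeat purchases has a probability of being active that falls as T grows.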
Attributes
X
default_model_config: Default model configuration.
default_sampler_config: Default sampler configuration.
fit_result: Get the posterior fit_result.
id: Generate a unique hash value for the model.
output_var: Output variable of the model.
posterior
posterior_predictive
predictions
prior
prior_predictive
version
y