BetaGeoBetaBinomModel
- class pymc_marketing.clv.models.beta_geo_beta_binom.BetaGeoBetaBinomModel(data, *, model_config=None, sampler_config=None)
Beta-Geometric/Beta-Binomial Model (BG/BB).
CLV model for non-contractual settings with discrete purchase opportunities, introduced by Fader et al. [1].
The BG/BB model assumes the probability a customer will become inactive follows a Beta distribution, and the probability of making a purchase is also Beta-distributed while customers are still active.
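The generative story described above can be sketched as a small simulation. This is an illustrative sketch, not the library's implementation: the function name `simulate_bgbb_customer`, the parameter values, and the choice to check for dropout at the start of each opportunity are all assumptions made for the example.

```python
import random


def simulate_bgbb_customer(alpha, beta, gamma, delta, n_opportunities, rng):
    """Simulate 0/1 purchase indicators for one customer (hypothetical helper)."""
    # Each customer gets their own purchase and dropout probabilities,
    # drawn from the two population-level Beta distributions.
    p = rng.betavariate(alpha, beta)       # purchase probability while active
    theta = rng.betavariate(gamma, delta)  # dropout probability per opportunity

    history = []
    alive = True
    for _ in range(n_opportunities):
        # Assumption: dropout is checked before each purchase opportunity.
        if alive and rng.random() < theta:
            alive = False
        # An inactive customer never purchases again.
        history.append(1 if alive and rng.random() < p else 0)
    return history


rng = random.Random(42)
# Illustrative shape parameters, not estimates from any dataset.
histories = [simulate_bgbb_customer(1.2, 0.75, 0.65, 2.8, 6, rng) for _ in range(5)]
for history in histories:
    print(history)
```

Because dropout is never observed directly, a run of trailing zeros is consistent with either an inactive customer or an active one who simply did not buy.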
This model requires data to be summarized by recency, frequency, and T for each customer. T should be the same value across all customers.
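To make the recency, frequency, and T summary concrete, the sketch below derives the three values from a list of 0/1 purchase indicators, one entry per purchase opportunity after the first purchase. The helper `summarize_history` and the sample histories are hypothetical, and counting recency as the position of the last purchase (0 when there are no repeat purchases) is an assumption about the convention.

```python
def summarize_history(history):
    """Summarize 0/1 purchase indicators into frequency, recency, and T."""
    T = len(history)          # total purchase opportunities
    frequency = sum(history)  # number of repeat purchases
    # Position (1-based) of the last opportunity with a purchase, 0 if none.
    recency = max((i + 1 for i, bought in enumerate(history) if bought), default=0)
    return {"frequency": frequency, "recency": recency, "T": T}


# Hypothetical histories; every customer has the same number of opportunities.
histories = {
    "a": [1, 0, 1, 0, 0, 0],
    "b": [0, 0, 0, 0, 0, 0],
    "c": [1, 1, 1, 1, 0, 1],
}
rows = [{"customer_id": cid, **summarize_history(h)} for cid, h in histories.items()]

# The two constraints stated above: a shared T, and T >= recency.
assert len({row["T"] for row in rows}) == 1
assert all(row["T"] >= row["recency"] for row in rows)

for row in rows:
    print(row)
```

Rows in this shape map directly onto the customer_id, frequency, recency, and T columns the model expects.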
- Parameters:
- data : DataFrame
  DataFrame containing the following columns:
  - customer_id: Unique customer identifier
  - frequency: Number of repeat purchases
  - recency: Purchase opportunities between the first and the last purchase
  - T: Total purchase opportunities. Model assumptions require T >= recency, and all customers must share the same value for T.
- model_config : dict, optional
  Dictionary containing model parameters:
  - alpha_prior: Shape parameter of purchase process; defaults to phi_purchase_prior * kappa_purchase_prior
  - beta_prior: Shape parameter of purchase process; defaults to (1 - phi_purchase_prior) * kappa_purchase_prior
  - gamma_prior: Shape parameter of dropout process; defaults to phi_dropout_prior * kappa_dropout_prior
  - delta_prior: Shape parameter of dropout process; defaults to (1 - phi_dropout_prior) * kappa_dropout_prior
  - phi_purchase_prior: Nested prior for alpha and beta priors; defaults to Prior("Uniform", lower=0, upper=1)
  - kappa_purchase_prior: Nested prior for alpha and beta priors; defaults to Prior("Pareto", alpha=1, m=1)
  - phi_dropout_prior: Nested prior for gamma and delta priors; defaults to Prior("Uniform", lower=0, upper=1)
  - kappa_dropout_prior: Nested prior for gamma and delta priors; defaults to Prior("Pareto", alpha=1, m=1)
  If not provided, the model will use default priors specified in the default_model_config class attribute.
- sampler_config : dict, optional
  Dictionary of sampler parameters. Defaults to None.
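The nested phi and kappa priors can be read as a mean and concentration parameterization of a Beta distribution. The sketch below illustrates this for the purchase process, assuming the default for beta_prior is meant as (1 - phi) * kappa; the helper `beta_shapes` and the numeric values are hypothetical.

```python
def beta_shapes(phi, kappa):
    """Map a mean-like phi in (0, 1) and a concentration kappa to Beta shapes."""
    first_shape = phi * kappa            # plays the role of alpha (or gamma)
    second_shape = (1.0 - phi) * kappa   # plays the role of beta (or delta)
    return first_shape, second_shape


# Illustrative values only.
alpha, beta = beta_shapes(phi=0.3, kappa=10.0)

# The Beta mean recovers phi, and the shapes sum to kappa.
mean = alpha / (alpha + beta)
concentration = alpha + beta
print(alpha, beta, mean, concentration)
```

Under this reading, phi controls the average purchase (or dropout) probability across customers, while kappa controls how tightly individual customers cluster around that average.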
References
[1] Peter Fader, Bruce Hardie, and Jen Shang. “Customer-Base Analysis in a Discrete-Time Noncontractual Setting”. Marketing Science, Vol. 29, No. 6 (Nov-Dec, 2010), pp. 1086-1108. https://www.brucehardie.com/papers/020/fader_et_al_mksc_10.pdf
Examples
import pandas as pd
import pymc as pm

from pymc_marketing.clv import BetaGeoBetaBinomModel
from pymc_marketing.clv.utils import rfm_summary
from pymc_marketing.prior import Prior

# raw_data is a transaction-level DataFrame (not defined here);
# summarize it into recency, frequency, and T per customer
rfm_df = rfm_summary(raw_data, "id_col_name", "date_col_name")

# Initialize model with customer data; `model_config` parameter is optional
model = BetaGeoBetaBinomModel(
    data=rfm_df,
    model_config={
        "alpha_prior": Prior("HalfFlat"),
        "beta_prior": Prior("HalfFlat"),
        "gamma_prior": Prior("HalfFlat"),
        "delta_prior": Prior("HalfFlat"),
    },
)

# Fit model quickly to large datasets via Maximum a Posteriori
model.fit(fit_method="map")
print(model.fit_summary())

# Fit with the default 'mcmc' for more informative predictions
# and reliable performance on smaller datasets
model.fit(fit_method="mcmc")
print(model.fit_summary())

# Predict number of purchases for customers over the next 10 time periods
expected_purchases = model.expected_purchases(
    data=rfm_df,
    future_t=10,
)

# Predict probability of customer making 'n' purchases over 't' time periods
# Data parameter is omitted here because predictions are run on the original dataset
expected_num_purchases = model.expected_purchase_probability(
    n=[0, 1, 2, 3],
    future_t=[10, 20, 30, 40],
)

new_data = pd.DataFrame(
    data={
        "customer_id": [0, 1, 2, 3],
        "frequency": [5, 2, 1, 8],
        "recency": [7, 4, 2.5, 11],
        "T": [10, 8, 10, 22],
    }
)

# Predict probability customers will still be active in 'future_t' time periods
probability_alive = model.expected_probability_alive(
    data=new_data,
    future_t=[0, 3, 6, 9],
)

# Predict number of purchases for a new customer over 't' time periods
expected_purchases_new_customer = model.expected_purchases_new_customer(
    t=[2, 5, 7, 10],
)
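For intuition about what a new-customer purchase prediction represents, the sketch below computes the population-average expected number of purchases over n opportunities for fixed shape parameters. It follows from the BG/BB assumptions in [1] (independent Beta-distributed purchase and dropout probabilities, with dropout possible before each opportunity), is not taken from the library's source, and uses a hypothetical function name and illustrative parameter values.

```python
from math import exp, lgamma


def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def bgbb_expected_purchases_sketch(alpha, beta, gamma, delta, n):
    """Expected purchases of a randomly chosen new customer over n opportunities."""
    # Average purchase probability across the population.
    mean_p = alpha / (alpha + beta)
    # For theta ~ Beta(gamma, delta), the probability of still being active at
    # opportunity t is E[(1 - theta)^t] = B(gamma, delta + t) / B(gamma, delta).
    p_alive = [
        exp(log_beta(gamma, delta + t) - log_beta(gamma, delta))
        for t in range(1, n + 1)
    ]
    # A purchase at opportunity t requires being active and choosing to buy.
    return mean_p * sum(p_alive)


# Illustrative shape parameters, not fitted values.
for n in (1, 5, 10):
    print(n, bgbb_expected_purchases_sketch(1.0, 1.0, 2.0, 2.0, n))
```

The expected count grows with n but at a decreasing rate, because fewer customers remain active at each successive opportunity.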
Methods
BetaGeoBetaBinomModel.__init__(data, *[, ...]): Initialize model configuration and sampler configuration for the model.
Convert the model configuration and sampler configuration from the attributes to keyword arguments.
Build model from the InferenceData object.
Build the model.
Create the fit_data group based on the input data.
Create attributes for the inference data.
BetaGeoBetaBinomModel.distribution_new_customer_dropout([...]): Sample from the Beta distribution representing dropout probabilities for new customers.
BetaGeoBetaBinomModel.distribution_new_customer_purchase_rate([...]): Sample from the Beta distribution representing purchase probabilities for new customers.
BetaGeoBetaBinomModel.distribution_new_customer_recency_frequency([...]): BG/BB process representing purchases across the customer population.
BetaGeoBetaBinomModel.expected_probability_alive: Predict expected probability of being alive.
BetaGeoBetaBinomModel.expected_purchases: Predict expected number of future purchases.
BetaGeoBetaBinomModel.expected_purchases_new_customer([...]): Predict the expected number of purchases for a new customer across t time periods.
BetaGeoBetaBinomModel.fit([fit_method]): Infer model posterior.
BetaGeoBetaBinomModel.fit_summary(**kwargs): Compute the summary of the fit result.
BetaGeoBetaBinomModel.graphviz(**kwargs): Get the graphviz representation of the model.
BetaGeoBetaBinomModel.load(fname): Create a ModelBuilder instance from a file.
Create a ModelBuilder instance from an InferenceData object.
BetaGeoBetaBinomModel.predict([X, extend_idata]): Use a model to predict on unseen data and return point prediction of all the samples.
BetaGeoBetaBinomModel.predict_posterior([X, ...]): Generate posterior predictive samples on unseen data.
BetaGeoBetaBinomModel.predict_proba([X, ...]): Alias for predict_posterior, for consistency with scikit-learn probabilistic estimators.
Sample from the model's posterior predictive distribution.
Sample from the model's prior predictive distribution.
BetaGeoBetaBinomModel.save(fname): Save the model's inference data to a file.
Set attributes on an InferenceData object.
BetaGeoBetaBinomModel.thin_fit_result(keep_every): Return a copy of the model with a thinned fit result.
Attributes
X
default_model_config: Default model configuration.
default_sampler_config: Default sampler configuration.
fit_result: Get the posterior fit_result.
id: Generate a unique hash value for the model.
output_var: Output variable of the model.
posterior
posterior_predictive
predictions
prior
prior_predictive
version
y