BetaGeoBetaBinomModel

class pymc_marketing.clv.models.beta_geo_beta_binom.BetaGeoBetaBinomModel(data, *, model_config=None, sampler_config=None)

Beta-Geometric/Beta-Binomial Model (BG/BB).

CLV model for non-contractual, discrete purchase opportunities, introduced by Fader et al. [1].

The BG/BB model assumes that each customer's probability of becoming inactive follows a Beta distribution across the customer population, and that, while a customer is still active, their probability of making a purchase at each opportunity is also Beta-distributed.
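The assumptions above can be read as a generative process. The following is a minimal simulation sketch using only the standard library, assuming (as in the formulation of [1]) that dropout is decided at the start of each purchase opportunity and is permanent. The function name `simulate_customer` and the parameter values are illustrative and not part of pymc_marketing.

```python
import random


def simulate_customer(alpha, beta, gamma, delta, n_opportunities, rng):
    """Simulate one customer under BG/BB and return (frequency, recency, T)."""
    p = rng.betavariate(alpha, beta)       # purchase probability while active
    theta = rng.betavariate(gamma, delta)  # dropout probability per opportunity

    frequency, recency = 0, 0
    for t in range(1, n_opportunities + 1):
        # Assumption: dropout is checked before each opportunity and is permanent.
        if rng.random() < theta:
            break
        if rng.random() < p:
            frequency += 1
            recency = t  # index of the most recent purchase opportunity used
    return frequency, recency, n_opportunities


rng = random.Random(42)
# Illustrative shape parameters; every customer shares the same T (here 6).
customers = [simulate_customer(1.2, 0.75, 0.66, 2.78, 6, rng) for _ in range(1000)]
```

Each simulated tuple satisfies frequency <= recency <= T, which matches the shape of the summary data the model expects.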

This model requires data to be summarized by recency, frequency, and T for each customer. T should be the same value across all customers.
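The three summaries are sufficient because, in the BG/BB formulation of [1], the likelihood of a purchase history depends on it only through frequency (x), recency (t_x), and the number of opportunities (n). The sketch below writes out that individual-level likelihood with the standard library. It follows the paper's formula as reconstructed here and is not taken from the pymc_marketing source; `bgbb_likelihood` and `log_beta` are hypothetical names.

```python
from math import exp, lgamma


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def bgbb_likelihood(x, t_x, n, alpha, beta, gamma, delta):
    """Likelihood of frequency x and recency t_x over n opportunities."""
    norm = log_beta(alpha, beta) + log_beta(gamma, delta)
    # Case 1: the customer stayed active through all n opportunities.
    total = exp(log_beta(alpha + x, beta + n - x) + log_beta(gamma, delta + n) - norm)
    # Case 2: the customer dropped out right after opportunity t_x + i.
    for i in range(n - t_x):
        total += exp(
            log_beta(alpha + x, beta + t_x - x + i)
            + log_beta(gamma + 1, delta + t_x + i)
            - norm
        )
    return total


# Illustrative call: 3 repeat purchases, the last at opportunity 4 of 6.
likelihood = bgbb_likelihood(x=3, t_x=4, n=6, alpha=1.2, beta=0.75, gamma=0.66, delta=2.78)
```

Because the loop runs over the opportunities after t_x, customers who purchased at the final opportunity contribute only the "still active" term.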

Parameters:
data : DataFrame

DataFrame containing the following columns:

  • customer_id: Unique customer identifier

  • frequency: Number of repeat purchases

  • recency: Purchase opportunities between the first and the last purchase

  • T: Total purchase opportunities. Model assumptions require T >= recency and that all customers share the same value for T.
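As an illustration of these columns, the sketch below derives them from per-customer 0/1 purchase histories and checks the two constraints on T. It assumes each history lists only the repeat-purchase opportunities that follow the customer's first purchase; `summarize_history` and the sample histories are hypothetical.

```python
def summarize_history(purchases):
    """Summarize a 0/1 sequence of repeat-purchase opportunities.

    purchases[i] is 1 if the customer bought at opportunity i + 1.
    """
    frequency = sum(purchases)
    # Recency is the index of the last opportunity with a purchase (0 if none).
    recency = max((i + 1 for i, bought in enumerate(purchases) if bought), default=0)
    return {"frequency": frequency, "recency": recency, "T": len(purchases)}


histories = {
    "a": [1, 0, 1, 1, 0, 0],
    "b": [0, 0, 0, 0, 0, 0],
    "c": [1, 1, 1, 1, 1, 1],
}
rows = [{"customer_id": cid, **summarize_history(h)} for cid, h in histories.items()]

# Check the constraints on T before handing the rows to the model.
assert all(row["T"] >= row["recency"] for row in rows)
assert len({row["T"] for row in rows}) == 1, "all customers must share the same T"
```

Passing `rows` to `pandas.DataFrame(rows)` would give a DataFrame with the four required columns.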

model_config : dict, optional

Dictionary containing model parameters:

  • alpha_prior: Shape parameter of purchase process; defaults to phi_purchase_prior * kappa_purchase_prior

  • beta_prior: Shape parameter of purchase process; defaults to (1 - phi_purchase_prior) * kappa_purchase_prior

  • gamma_prior: Shape parameter of dropout process; defaults to phi_dropout_prior * kappa_dropout_prior

  • delta_prior: Shape parameter of dropout process; defaults to (1 - phi_dropout_prior) * kappa_dropout_prior

  • phi_purchase_prior: Nested prior for alpha and beta priors; defaults to Prior("Uniform", lower=0, upper=1)

  • kappa_purchase_prior: Nested prior for alpha and beta priors; defaults to Prior("Pareto", alpha=1, m=1)

  • phi_dropout_prior: Nested prior for gamma and delta priors; defaults to Prior("Uniform", lower=0, upper=1)

  • kappa_dropout_prior: Nested prior for gamma and delta priors; defaults to Prior("Pareto", alpha=1, m=1)

If not provided, the model will use default priors specified in the default_model_config class attribute.
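The default shape parameters can be read as a mean/concentration parameterization of a Beta distribution: phi acts as the mean and kappa as the concentration (the sum of the two shape parameters). The sketch below shows the arithmetic with plain floats rather than priors; `beta_shapes` is a hypothetical helper, not a pymc_marketing function.

```python
def beta_shapes(phi, kappa):
    """Map a mean (phi) and concentration (kappa) to Beta shape parameters."""
    return phi * kappa, (1.0 - phi) * kappa


alpha, beta = beta_shapes(phi=0.3, kappa=5.0)
mean = alpha / (alpha + beta)  # recovers phi
concentration = alpha + beta   # recovers kappa
```

Under this reading, the Uniform(0, 1) prior on phi covers every possible mean, while the Pareto prior on kappa keeps the concentration at or above its lower bound m=1.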

sampler_config : dict, optional

Dictionary of sampler parameters. Defaults to None.

References

[1]

Peter Fader, Bruce Hardie, and Jen Shang. “Customer-Base Analysis in a Discrete-Time Noncontractual Setting”. Marketing Science, Vol. 29, No. 6 (Nov-Dec, 2010), pp. 1086-1108. https://www.brucehardie.com/papers/020/fader_et_al_mksc_10.pdf

Examples

import pandas as pd

from pymc_marketing.prior import Prior
from pymc_marketing.clv import BetaGeoBetaBinomModel, rfm_summary

# `raw_data` is a placeholder for a DataFrame of raw transactions
rfm_df = rfm_summary(raw_data, "id_col_name", "date_col_name")

# Initialize model with customer data; `model_config` parameter is optional
model = BetaGeoBetaBinomModel(
    data=rfm_df,
    model_config={
        "alpha_prior": Prior("HalfFlat"),
        "beta_prior": Prior("HalfFlat"),
        "gamma_prior": Prior("HalfFlat"),
        "delta_prior": Prior("HalfFlat"),
    },
)

# Fit model quickly to large datasets via Maximum a Posteriori
model.fit(fit_method='map')
print(model.fit_summary())

# Fit with the default 'mcmc' for more informative predictions and reliable performance on smaller datasets
model.fit(fit_method='mcmc')
print(model.fit_summary())

# Predict number of purchases for customers over the next 10 time periods
expected_purchases = model.expected_purchases(
    data=rfm_df,
    future_t=10,
)

# Predict probability of customer making 'n' purchases over 't' time periods
# Data parameter is omitted here because predictions are run on the original dataset
expected_num_purchases = model.expected_purchase_probability(
    n=[0, 1, 2, 3],
    future_t=[10, 20, 30, 40],
)

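# Illustrative new customers; recency counts whole purchase opportunities
# and T is the same for every customer, as the model assumes
new_data = pd.DataFrame(
    data={
        "customer_id": [0, 1, 2, 3],
        "frequency": [5, 2, 1, 8],
        "recency": [7, 4, 2, 11],
        "T": [12, 12, 12, 12],
    }
)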

# Predict probability customers will still be active in 'future_t' time periods
probability_alive = model.expected_probability_alive(
    data=new_data,
    future_t=[0, 3, 6, 9],
)

# Predict number of purchases for a new customer over 't' time periods.
expected_purchases_new_customer = model.expected_purchases_new_customer(
    t=[2, 5, 7, 10],
)
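For intuition about what the new-customer prediction represents, the expected number of purchases over n opportunities under BG/BB has a closed form for fixed shape parameters, given in [1]. The sketch below computes it in two ways: as the mean purchase probability times the summed survival probabilities, and via the closed form. It uses only the standard library; the function names are hypothetical, and whether pymc_marketing evaluates exactly this expression internally is not confirmed here.

```python
from math import exp, lgamma


def expected_purchases_sum(n, alpha, beta, gamma, delta):
    """E[X(n)] as E[p] times the sum of survival probabilities E[(1 - theta)^t]."""
    log_b0 = lgamma(gamma) + lgamma(delta) - lgamma(gamma + delta)
    survival = 0.0
    for t in range(1, n + 1):
        log_bt = lgamma(gamma) + lgamma(delta + t) - lgamma(gamma + delta + t)
        survival += exp(log_bt - log_b0)
    return alpha / (alpha + beta) * survival


def expected_purchases_closed_form(n, alpha, beta, gamma, delta):
    """Closed-form E[X(n)]; undefined when gamma == 1."""
    ratio = exp(
        lgamma(gamma + delta) - lgamma(gamma + delta + n)
        + lgamma(1 + delta + n) - lgamma(1 + delta)
    )
    return alpha / (alpha + beta) * delta / (gamma - 1) * (1 - ratio)


# Illustrative shape parameters; a fitted model would supply posterior draws instead.
params = dict(alpha=1.2, beta=0.75, gamma=0.66, delta=2.78)
by_sum = [expected_purchases_sum(n, **params) for n in (2, 5, 7, 10)]
by_formula = [expected_purchases_closed_form(n, **params) for n in (2, 5, 7, 10)]
```

The two lists agree up to floating-point error. The summed form also works when gamma equals 1, where the closed form divides by zero.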

Methods

BetaGeoBetaBinomModel.__init__(data, *[, ...])

Initialize model configuration and sampler configuration for the model.

BetaGeoBetaBinomModel.attrs_to_init_kwargs(attrs)

Convert the model configuration and sampler configuration from the attributes to keyword arguments.

BetaGeoBetaBinomModel.build_from_idata(idata)

Build model from the InferenceData object.

BetaGeoBetaBinomModel.build_model()

Build the model.

BetaGeoBetaBinomModel.create_fit_data(X, y)

Create the fit_data group based on the input data.

BetaGeoBetaBinomModel.create_idata_attrs()

Create attributes for the inference data.

BetaGeoBetaBinomModel.distribution_new_customer_dropout([...])

Sample from the Beta distribution representing dropout probabilities for new customers.

BetaGeoBetaBinomModel.distribution_new_customer_purchase_rate([...])

Sample from the Beta distribution representing purchase probabilities for new customers.

BetaGeoBetaBinomModel.distribution_new_customer_recency_frequency([...])

BG/BB process representing purchases across the customer population.

BetaGeoBetaBinomModel.expected_probability_alive([...])

Predict expected probability of being alive.

BetaGeoBetaBinomModel.expected_purchases([...])

Predict expected number of future purchases.

BetaGeoBetaBinomModel.expected_purchases_new_customer([...])

Predict the expected number of purchases for a new customer across t time periods.

BetaGeoBetaBinomModel.fit([fit_method])

Infer model posterior.

BetaGeoBetaBinomModel.fit_summary(**kwargs)

Compute the summary of the fit result.

BetaGeoBetaBinomModel.graphviz(**kwargs)

Get the graphviz representation of the model.

BetaGeoBetaBinomModel.load(fname)

Create a ModelBuilder instance from a file.

BetaGeoBetaBinomModel.load_from_idata(idata)

Create a ModelBuilder instance from an InferenceData object.

BetaGeoBetaBinomModel.predict([X, extend_idata])

Use a model to predict on unseen data and return point prediction of all the samples.

BetaGeoBetaBinomModel.predict_posterior([X, ...])

Generate posterior predictive samples on unseen data.

BetaGeoBetaBinomModel.predict_proba([X, ...])

Alias for predict_posterior, for consistency with scikit-learn probabilistic estimators.

BetaGeoBetaBinomModel.sample_posterior_predictive([...])

Sample from the model's posterior predictive distribution.

BetaGeoBetaBinomModel.sample_prior_predictive([...])

Sample from the model's prior predictive distribution.

BetaGeoBetaBinomModel.save(fname)

Save the model's inference data to a file.

BetaGeoBetaBinomModel.set_idata_attrs([idata])

Set attributes on an InferenceData object.

BetaGeoBetaBinomModel.thin_fit_result(keep_every)

Return a copy of the model with a thinned fit result.

Attributes

X

default_model_config

Default model configuration.

default_sampler_config

Default sampler configuration.

fit_result

Get the posterior fit_result.

id

Generate a unique hash value for the model.

output_var

Output variable of the model.

posterior

posterior_predictive

predictions

prior

prior_predictive

version

y