ParetoNBDModel.distribution_new_customer

ParetoNBDModel.distribution_new_customer(data=None, *, T=None, random_seed=None, var_names=('dropout', 'purchase_rate', 'recency_frequency'))

Utility function that draws posterior predictive samples of dropout, purchase rate, and recency/frequency for new customers.

In a model with covariates, if data is not specified, the dataset used for fitting is used, and one prediction is computed for a new customer with each set of covariates. This is not a prediction conditioned on the purchase history of the observed customers!
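To make concrete what a "new customer" draw means, the following is a minimal sketch of the standard Pareto/NBD generative process for a single fixed set of parameter values, using only the Python standard library. It is not the library's implementation: the function name `simulate_new_customer` and the parameter names `r`, `alpha`, `s`, `beta` (the usual Pareto/NBD notation, with `alpha` and `beta` as gamma rate parameters) are assumptions made for illustration, and the parameter values are made up. The method documented here would presumably repeat an equivalent draw for each posterior sample.

```python
import random


def simulate_new_customer(r, alpha, s, beta, T, rng):
    """Draw (dropout, purchase_rate, recency, frequency) for one new customer.

    Hypothetical helper for illustration; not part of any library API.
    """
    # Latent traits: gamma-distributed purchase rate and dropout rate.
    # random.gammavariate takes (shape, scale), so scale = 1 / rate parameter.
    purchase_rate = rng.gammavariate(r, 1.0 / alpha)
    dropout = rng.gammavariate(s, 1.0 / beta)

    # The customer stays active for an exponential lifetime, capped at T.
    lifetime = rng.expovariate(dropout)
    active_until = min(lifetime, T)

    # Repeat purchases arrive as a Poisson process while the customer is active.
    t, recency, frequency = 0.0, 0.0, 0
    while True:
        t += rng.expovariate(purchase_rate)
        if t > active_until:
            break
        recency, frequency = t, frequency + 1
    return dropout, purchase_rate, recency, frequency


# Illustrative parameter values only; a fitted model would supply posterior draws.
rng = random.Random(42)
T = 40.0
draws = [simulate_new_customer(0.5, 10.0, 0.6, 12.0, T, rng) for _ in range(1000)]
```

Each draw starts from freshly sampled latent traits rather than from any observed purchase history, which is why these predictions are not conditional on the observed customers.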

Parameters:
  • data (pd.DataFrame, optional) –

    DataFrame containing the following columns:
    • customer_id: unique customer identifier

    • T: time between the first purchase and the end of the observation period.

    • covariates: purchase and dropout covariate columns, if the original model had any.

    If not provided, the method uses the dataset the model was fit on.

  • T (array_like, optional) – time between the first purchase and the end of the observation period. Not needed if data is provided and contains a T column.

  • random_seed (RandomState, optional) – Random state to use for sampling.

  • var_names (Sequence[str]) – Names of the variables to sample from. Defaults to ("dropout", "purchase_rate", "recency_frequency").

Return type:

Dataset