Conditional Density Estimators

Base Density Estimation Interface

class cde.density_estimator.BaseDensityEstimator[source]

Interface for conditional density estimation models

conditional_value_at_risk(x_cond, alpha=0.01, n_samples=1000000)[source]

Computes the Conditional Value-at-Risk (CVaR) / Expected Shortfall of the fitted distribution. Only supported if ndim_y = 1.

Parameters
  • x_cond – different x values to condition on - numpy array of shape (n_values, ndim_x)

  • alpha – quantile percentage of the distribution

Returns

CVaR values for each x to condition on - numpy array of shape (n_values)

covariance(x_cond, n_samples=1000000)[source]

Covariance of the fitted distribution conditioned on x_cond

Parameters

x_cond – different x values to condition on - numpy array of shape (n_values, ndim_x)

Returns

Covariances Cov[y|x] corresponding to x_cond - numpy array of shape (n_values, ndim_y, ndim_y)

eval_by_cv(X, Y, n_splits=5, verbose=True)[source]

Fits the conditional density model with cross-validation by using the score function of the BaseDensityEstimator for scoring the various splits.

Parameters
  • X – numpy array to be conditioned on - shape: (n_samples, n_dim_x)

  • Y – numpy array of y targets - shape: (n_samples, n_dim_y)

  • n_splits – number of cross-validation folds (positive integer)

  • verbose – the verbosity level
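The cross-validation scoring described above can be illustrated without the library. The following is a minimal sketch, not the library's implementation: `ToyGaussianModel` and `eval_by_cv_sketch` are hypothetical names, and the toy model assumes one-dimensional x and y with a linear-Gaussian conditional density. Each fold is held out once, the model is fitted on the remaining folds, and the held-out fold is scored by its mean conditional log-likelihood.

```python
import math
import random


class ToyGaussianModel:
    """Hypothetical stand-in estimator: y | x ~ N(a + b*x, sigma^2)."""

    def fit(self, X, Y):
        n = len(X)
        mx, my = sum(X) / n, sum(Y) / n
        sxx = sum((x - mx) ** 2 for x in X)
        sxy = sum((x - mx) * (y - my) for x, y in zip(X, Y))
        self.b = sxy / sxx
        self.a = my - self.b * mx
        resid = [y - (self.a + self.b * x) for x, y in zip(X, Y)]
        self.sigma = math.sqrt(sum(r * r for r in resid) / n)
        return self

    def score(self, X, Y):
        # Mean conditional log-likelihood of the pairs (x, y).
        ll = [
            -0.5 * math.log(2 * math.pi * self.sigma ** 2)
            - (y - self.a - self.b * x) ** 2 / (2 * self.sigma ** 2)
            for x, y in zip(X, Y)
        ]
        return sum(ll) / len(ll)


def eval_by_cv_sketch(make_model, X, Y, n_splits=5, seed=0):
    """Return one held-out score per fold."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::n_splits] for i in range(n_splits)]
    scores = []
    for test in folds:
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        model = make_model().fit([X[i] for i in train], [Y[i] for i in train])
        scores.append(model.score([X[i] for i in test], [Y[i] for i in test]))
    return scores


# Synthetic data: y = 2x + Gaussian noise with standard deviation 0.5.
rng = random.Random(1)
X = [rng.uniform(-1, 1) for _ in range(200)]
Y = [2.0 * x + rng.gauss(0, 0.5) for x in X]

cv_scores = eval_by_cv_sketch(ToyGaussianModel, X, Y, n_splits=5)
mean_cv_score = sum(cv_scores) / len(cv_scores)
```

Higher (less negative) scores indicate a better fit on the held-out data.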

fit(X, Y, verbose=False)[source]

Fits the conditional density model with provided data

Parameters
  • X – numpy array to be conditioned on - shape: (n_samples, n_dim_x)

  • Y – numpy array of y targets - shape: (n_samples, n_dim_y)

fit_by_cv(X, Y, n_folds=3, param_grid=None, verbose=True, n_jobs=-1)[source]

Fits the conditional density model with hyperparameter search and cross-validation.

  • Determines the best hyperparameter configuration from a pre-defined set using cross-validation. The conditional log-likelihood is used for evaluation.

  • Fits the model with the previously selected hyperparameter configuration

Parameters
  • X – numpy array to be conditioned on - shape: (n_samples, n_dim_x)

  • Y – numpy array of y targets - shape: (n_samples, n_dim_y)

  • n_folds – number of cross-validation folds (positive integer)

  • param_grid

    (optional) a dictionary with the hyperparameters of the model as keys and a list of respective parametrizations as values. The hyperparameter search is performed over the Cartesian product of the provided lists. Example: {"n_centers": [20, 50, 100, 200], "center_sampling_method": ["agglomerative", "k_means", "random"], "keep_edges": [True, False]}
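To make the size of the search space concrete, the Cartesian product of the example grid can be enumerated with the standard library. This is only an illustration of what "Cartesian product of the provided lists" means. It does not call the library, and the variable names `names` and `configurations` are chosen here for the sketch.

```python
from itertools import product

# The example grid from the parameter description.
param_grid = {
    "n_centers": [20, 50, 100, 200],
    "center_sampling_method": ["agglomerative", "k_means", "random"],
    "keep_edges": [True, False],
}

# One dict per candidate configuration: every combination of one value per key.
names = list(param_grid)
configurations = [
    dict(zip(names, values))
    for values in product(*(param_grid[name] for name in names))
]

print(len(configurations))  # 4 * 3 * 2 = 24 candidate configurations
```

Because every candidate is fitted once per fold, the number of model fits grows as the product of the list lengths times n_folds, so it helps to keep each list short.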

get_configuration(deep=True)[source]

Get parameter configuration for this estimator.

Parameters

deep – boolean, optional. If True, returns the parameters for this estimator and contained subobjects that are estimators.

Returns

params – mapping of string to any. Parameter names mapped to their values.

kurtosis(x_cond, n_samples=1000000)[source]

Kurtosis of the fitted distribution conditioned on x_cond

Parameters

x_cond – different x values to condition on - numpy array of shape (n_values, ndim_x)

Returns

Kurtosis Kurt[y|x] corresponding to x_cond - numpy array of shape (n_values, ndim_y, ndim_y)

log_pdf(X, Y)[source]

Predicts the conditional log-probability log p(y|x). Requires the model to be fitted.

Parameters
  • X – numpy array to be conditioned on - shape: (n_samples, n_dim_x)

  • Y – numpy array of y targets - shape: (n_samples, n_dim_y)

Returns

conditional log-probability log p(y|x) - numpy array of shape (n_query_samples, )
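The log-probability and the density are related by p(y|x) = exp(log p(y|x)). A minimal sketch of why working in log space can be preferable, using a plain Gaussian as a stand-in for a fitted conditional density (the helper functions below are hypothetical and not part of the library):

```python
import math


def gaussian_log_pdf(y, mean, std):
    """Log-density of N(mean, std^2) at y."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (y - mean) ** 2 / (2 * std ** 2)


def gaussian_pdf(y, mean, std):
    """Density obtained by exponentiating the log-density."""
    return math.exp(gaussian_log_pdf(y, mean, std))


# Near the mean, both representations are well behaved.
near = gaussian_pdf(0.5, 0.0, 1.0)

# Far in the tail, the density underflows to 0.0 in double precision,
# while the log-density is still a finite, usable number.
far_log = gaussian_log_pdf(60.0, 0.0, 1.0)
far = gaussian_pdf(60.0, 0.0, 1.0)
```

Averaging log-densities therefore stays numerically stable even when individual densities are too small to represent.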

mean_(x_cond, n_samples=1000000)[source]

Mean of the fitted distribution conditioned on x_cond

Parameters

x_cond – different x values to condition on - numpy array of shape (n_values, ndim_x)

Returns

Means E[y|x] corresponding to x_cond - numpy array of shape (n_values, ndim_y)

mean_std(x_cond, n_samples=1000000)[source]

Computes Mean and Covariance of the fitted distribution conditioned on x_cond.

Computationally more efficient than calling the mean and covariance computations separately

Parameters

x_cond – different x values to condition on - numpy array of shape (n_values, ndim_x)

Returns

Means E[y|x] and Covariances Cov[y|x]
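One way to read the efficiency remark, assuming the moments are estimated from Monte Carlo samples (an assumption suggested by the n_samples argument, not confirmed here): drawing the samples is the expensive step, so a joint computation can draw them once and reuse them for both moments. The sketch below uses a hypothetical helper and a Gaussian in place of a fitted model, with one-dimensional y.

```python
import math
import random


def mean_and_std_from_samples(samples):
    """Estimate mean and standard deviation from one shared set of samples."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / n
    return mean, math.sqrt(var)


# Stand-in for sampling y ~ p(y | x) at a single conditioning value x.
rng = random.Random(0)
samples = [rng.gauss(1.5, 2.0) for _ in range(100_000)]

# The samples are drawn once and reused for both moments.
est_mean, est_std = mean_and_std_from_samples(samples)
```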

pdf(X, Y)[source]

Predicts the conditional likelihood p(y|x). Requires the model to be fitted.

Parameters
  • X – numpy array to be conditioned on - shape: (n_samples, n_dim_x)

  • Y – numpy array of y targets - shape: (n_samples, n_dim_y)

Returns

conditional likelihood p(y|x) - numpy array of shape (n_query_samples, )

predict_density(X, Y=None, resolution=50)[source]

Computes conditional density p(y|x) over a predefined grid of y target values

Parameters
  • X – values/vectors to be conditioned on - shape: (n_instances, n_dim_x)

  • Y – (optional) y values to be evaluated from p(y|x) - if not set, Y will be a grid with the specified resolution

  • resolution – integer specifying the resolution of the evaluation grid

Returns: tuple (P, Y)
  • P - density p(y|x) - shape (n_instances, resolution**n_dim_y)

  • Y - grid with the specified resolution - shape (resolution**n_dim_y, n_dim_y) or a copy of Y in case it was provided as argument
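The shape (resolution**n_dim_y, n_dim_y) follows from taking every combination of `resolution` points along each y dimension. A minimal sketch of such a grid is given below. The helper `make_y_grid` and its `y_min` / `y_max` bounds are hypothetical, since how the library chooses the grid range is not stated here.

```python
from itertools import product


def make_y_grid(y_min, y_max, resolution):
    """Build an evenly spaced grid over a box; requires resolution >= 2.

    y_min / y_max are per-dimension bounds (hypothetical arguments).
    Returns a list of resolution**n_dim_y points, each of length n_dim_y.
    """
    axes = []
    for lo, hi in zip(y_min, y_max):
        step = (hi - lo) / (resolution - 1)
        axes.append([lo + i * step for i in range(resolution)])
    # Cartesian product of the per-dimension axes.
    return [list(point) for point in product(*axes)]


grid = make_y_grid([-1.0, 0.0], [1.0, 2.0], resolution=5)
print(len(grid), len(grid[0]))  # 25 points (5**2), each with 2 coordinates
```

The number of grid points grows exponentially with n_dim_y, so a resolution that is fine for one-dimensional targets can become costly for higher-dimensional ones.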

score(X, Y)[source]

Computes the mean conditional log-likelihood of the provided data (X, Y)

Parameters
  • X – numpy array to be conditioned on - shape: (n_query_samples, n_dim_x)

  • Y – numpy array of y targets - shape: (n_query_samples, n_dim_y)

Returns

average log likelihood of data
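Written out, with the n query pairs denoted (x_i, y_i) and the fitted conditional density denoted \hat{p} (notation introduced here for illustration), the mean conditional log-likelihood reads:

```latex
\mathrm{score}(X, Y) = \frac{1}{n} \sum_{i=1}^{n} \log \hat{p}(y_i \mid x_i)
```

Larger values indicate that the model assigns higher density to the observed targets.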

skewness(x_cond, n_samples=1000000)[source]

Skewness of the fitted distribution conditioned on x_cond

Parameters

x_cond – different x values to condition on - numpy array of shape (n_values, ndim_x)

Returns

Skewness Skew[y|x] corresponding to x_cond - numpy array of shape (n_values, ndim_y, ndim_y)
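For intuition, skewness is the third central moment divided by the standard deviation cubed. The sketch below estimates it from samples for one-dimensional y, using an exponential distribution (whose skewness is 2) in place of a fitted conditional density. `sample_skewness` is a hypothetical helper, and whether the library estimates skewness this way is an assumption.

```python
import random


def sample_skewness(samples):
    """Third central moment divided by the standard deviation cubed."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((s - mean) ** 2 for s in samples) / n
    m3 = sum((s - mean) ** 3 for s in samples) / n
    return m3 / m2 ** 1.5


# Stand-in for sampling y ~ p(y | x) at a single conditioning value x.
rng = random.Random(0)
draws = [rng.expovariate(1.0) for _ in range(200_000)]
skew = sample_skewness(draws)  # close to 2 for an exponential distribution
```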

std_(x_cond, n_samples=1000000)[source]

Standard deviation of the fitted distribution conditioned on x_cond

Parameters

x_cond – different x values to condition on - numpy array of shape (n_values, ndim_x)

Returns

Standard deviations sqrt(Var[y|x]) corresponding to x_cond - numpy array of shape (n_values, ndim_y)

tail_risk_measures(x_cond, alpha=0.01, n_samples=1000000)[source]

Computes the Value-at-Risk (VaR) and Conditional Value-at-Risk (CVaR)

Parameters
  • x_cond – different x values to condition on - numpy array of shape (n_values, ndim_x)

  • alpha – quantile percentage of the distribution

  • n_samples – number of samples for the Monte Carlo estimate

Returns

  • VaR values for each x to condition on - numpy array of shape (n_values)

  • CVaR values for each x to condition on - numpy array of shape (n_values)
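As an illustration of the two quantities, the sketch below estimates them from samples under a lower-tail convention: VaR is taken as the alpha-quantile of y, and CVaR as the mean of the values at or below it. Both the convention and the sample-based estimator are assumptions made for this example. `tail_risk_from_samples` is a hypothetical helper, and a standard normal stands in for the fitted conditional distribution.

```python
import random


def tail_risk_from_samples(samples, alpha=0.01):
    """Empirical alpha-quantile (VaR) and mean of the tail below it (CVaR)."""
    ordered = sorted(samples)
    k = max(1, int(alpha * len(ordered)))  # number of samples in the lower tail
    var = ordered[k - 1]
    cvar = sum(ordered[:k]) / k
    return var, cvar


# Stand-in for sampling y ~ p(y | x) at a single conditioning value x.
rng = random.Random(0)
samples = [rng.gauss(0.0, 1.0) for _ in range(200_000)]

var, cvar = tail_risk_from_samples(samples, alpha=0.01)
# For a standard normal, the 1% quantile is about -2.33 and the tail mean about -2.67.
```

Since CVaR averages over the tail beyond VaR, it is never larger than VaR under this convention. Small alpha values leave few samples in the tail, which is one reason a large n_samples is useful.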

value_at_risk(x_cond, alpha=0.01, n_samples=1000000)[source]

Computes the Value-at-Risk (VaR) of the fitted distribution. Only supported if ndim_y = 1.

Parameters
  • x_cond – different x values to condition on - numpy array of shape (n_values, ndim_x)

  • alpha – quantile percentage of the distribution

Returns

VaR values for each x to condition on - numpy array of shape (n_values)