Gaussian Mixture
Fit and sample from a uni-, bi- or multivariate Gaussian mixture model with diagonal covariance matrices. For the multivariate case with \(n\) dimensions, the distribution of a single component is given by
\[ \mathcal{N}(x \mid \mu, \Sigma) = \frac{1}{\sqrt{(2\pi)^n |\Sigma|}} \exp\left(-\frac{1}{2}(x-\mu)^T \Sigma^{-1} (x-\mu)\right) \]
The mixture model is then a linear combination of an arbitrary number of components \(K\):
\[ p(x) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k) \]
where \(\pi_k\) is the mixing coefficient of the \(k\)-th component. \(\mu\), \(\Sigma\) and \(\pi\) are estimated by maximum likelihood for each \(k\). The number of kernels, which determines the modality of the distribution, and the dimensionality of both \(x\) and \(y\) can be specified. The component means are initialized randomly according to the given standard deviation, and the weights are also initialized randomly.
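As a rough illustration of the mixture density described above, the following NumPy sketch evaluates a diagonal-covariance mixture at a batch of points. The function name `gmm_pdf` and all parameter values are assumptions made for this example; they are not part of the `cde` API.

```python
import numpy as np

def gmm_pdf(Z, weights, means, stds):
    """Density of a Gaussian mixture with diagonal covariances.

    Z: (n_samples, ndim), weights: (K,), means and stds: (K, ndim).
    """
    Z = np.atleast_2d(Z)
    diff = (Z[:, None, :] - means[None, :, :]) / stds[None, :, :]   # (n, K, ndim)
    # log N(z | mu_k, diag(std_k^2)): the dimensions are independent, so sum them
    log_comp = -0.5 * np.sum(diff ** 2 + np.log(2 * np.pi * stds[None] ** 2), axis=2)
    # weighted sum over the K components
    return np.exp(log_comp) @ weights

# Hypothetical two-component mixture in two dimensions
weights = np.array([0.3, 0.7])
means = np.array([[-1.0, 0.0], [2.0, 1.0]])
stds = np.array([[0.5, 1.0], [1.0, 0.8]])

density = gmm_pdf(np.array([[0.0, 0.0], [2.0, 1.0]]), weights, means, stds)
```

With diagonal covariances the determinant and inverse reduce to products and reciprocals of the per-dimension variances, which is why no matrix inversion appears in the sketch.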
-
class
cde.density_simulation.
GaussianMixture
(n_kernels=5, ndim_x=1, ndim_y=1, means_std=1.5, random_seed=None) Fits and samples from a uni-, bi- or multivariate Gaussian mixture model with diagonal covariance matrices. The mixture model is a linear combination of n_kernels components. Means, covariances and weights are estimated by maximum likelihood for each component. The number of kernels, which determines the modality of the distribution, and the dimensionality of both x and y can be specified. The component means are initialized randomly according to the given standard deviation, and the weights are also initialized randomly.
- Parameters
n_kernels – number of mixture components
ndim_x – dimensionality of X / number of random variables in X
ndim_y – dimensionality of Y / number of random variables in Y
means_std – std. dev. when sampling the kernel means
random_seed – seed for the random number generator
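The random initialization controlled by these parameters might look roughly like the sketch below. The helper `init_mixture` is hypothetical, and the specific choices (zero-centred normal means, normalised uniform weights) are assumptions; the text above only states that means are drawn according to means_std and that weights are random.

```python
import numpy as np

def init_mixture(n_kernels=5, ndim_x=1, ndim_y=1, means_std=1.5, random_seed=None):
    """Hypothetical helper: draw random component means and mixing weights."""
    rng = np.random.default_rng(random_seed)
    # Assumption: means are drawn from a zero-centred normal with std means_std
    means = rng.normal(loc=0.0, scale=means_std, size=(n_kernels, ndim_x + ndim_y))
    # Assumption: weights are uniform draws normalised to sum to one
    raw = rng.uniform(size=n_kernels)
    weights = raw / raw.sum()
    # Split the joint means into their x and y parts
    return means[:, :ndim_x], means[:, ndim_x:], weights

means_x, means_y, weights = init_mixture(n_kernels=3, ndim_x=2, ndim_y=1, random_seed=22)
```

Passing the same random_seed reproduces the same means and weights, which is the usual reason for exposing a seed parameter.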
-
can_sample
= None Set parameters, calculate weights, means and covariances
-
cdf
(X, Y) Conditional cumulative distribution function P(Y<y|X=x).
See “Conditional Gaussian Mixture Models for Environmental Risk Mapping” [Gilardi, Bengio] for the math.
- Parameters
X – the conditioning variable for the distribution P(Y<y|X=x), array_like, shape:(n_samples, ndim_x)
Y – the variable Y conditioned on X, array_like, shape:(n_samples, ndim_y)
- Returns
the conditional cumulative distribution of Y given X for the given realizations of X, with shape (n_samples,)
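A minimal sketch of how such a conditional CDF can be computed, assuming the diagonal covariance applies to the joint (x, y) vector. In that case x and y are independent within each component, so conditioning on x only reweights the components. All function names and parameter values here are illustrative assumptions, not the library's implementation.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)

def std_normal_cdf(z):
    return 0.5 * (1.0 + _erf(z / np.sqrt(2.0)))

def responsibilities(X, weights, means_x, stds_x):
    """Weights w_k(x) proportional to pi_k * N(x | mu_x_k, diag(std_x_k^2))."""
    diff = (X[:, None, :] - means_x[None]) / stds_x[None]
    log_w = np.log(weights) - 0.5 * np.sum(diff ** 2 + np.log(2 * np.pi * stds_x[None] ** 2), axis=2)
    log_w -= log_w.max(axis=1, keepdims=True)      # stabilise before exponentiating
    w = np.exp(log_w)
    return w / w.sum(axis=1, keepdims=True)

def conditional_cdf(X, Y, weights, means_x, stds_x, means_y, stds_y):
    """P(Y < y | X = x) for X of shape (n, ndim_x) and Y of shape (n, ndim_y)."""
    w = responsibilities(X, weights, means_x, stds_x)      # (n, K)
    z = (Y[:, None, :] - means_y[None]) / stds_y[None]     # (n, K, ndim_y)
    comp_cdf = np.prod(std_normal_cdf(z), axis=2)          # product over independent y dims
    return np.sum(w * comp_cdf, axis=1)

# Hypothetical symmetric mixture with ndim_x = ndim_y = 1
weights = np.array([0.5, 0.5])
means_x, stds_x = np.array([[-1.0], [1.0]]), np.array([[1.0], [1.0]])
means_y, stds_y = np.array([[-2.0], [2.0]]), np.array([[1.0], [1.0]])

X = np.zeros((3, 1))
Y = np.array([[-1.0], [0.0], [1.0]])
cdf_values = conditional_cdf(X, Y, weights, means_x, stds_x, means_y, stds_y)
```

Because the example mixture is symmetric around the origin, the value at x = 0, y = 0 is 0.5, which makes a convenient sanity check.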
-
conditional_value_at_risk
(x_cond, alpha=0.01, n_samples=1000000) Computes the Conditional Value-at-Risk (CVaR) / Expected Shortfall of the fitted distribution. Only available if ndim_y = 1.
- Parameters
x_cond – different x values to condition on - numpy array of shape (n_values, ndim_x)
alpha – quantile level of the distribution, given as a fraction
n_samples – number of samples for the Monte Carlo estimate
- Returns
CVaR values for each x to condition on - numpy array of shape (n_values)
-
covariance
(x_cond, n_samples=None) Covariance of the distribution conditioned on x_cond
- Parameters
x_cond – different x values to condition on - numpy array of shape (n_values, ndim_x)
- Returns
Covariances Cov[y|x] corresponding to x_cond - numpy array of shape (n_values, ndim_y, ndim_y)
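For a mixture, Cov[y|x] follows from the first and second moments of the components. The sketch below assumes the conditional mixing weights at each x are already available as an array `w` and that the component covariances are diagonal; the helper name `mixture_mean_cov` and the numbers are illustrative assumptions.

```python
import numpy as np

def mixture_mean_cov(w, means_y, stds_y):
    """w: (n_values, K) conditional weights; means_y, stds_y: (K, ndim_y)."""
    mean = w @ means_y                                              # E[y|x], (n_values, ndim_y)
    second = np.einsum('kd,ke->kde', means_y, means_y)              # mu_k mu_k^T
    second = second + np.stack([np.diag(s ** 2) for s in stds_y])   # + diag(sigma_k^2)
    # Cov[y|x] = sum_k w_k (Sigma_k + mu_k mu_k^T) - E[y|x] E[y|x]^T
    cov = np.einsum('nk,kde->nde', w, second) - np.einsum('nd,ne->nde', mean, mean)
    return mean, cov

# Hypothetical weights for two conditioning values and two components
w = np.array([[0.5, 0.5], [1.0, 0.0]])
means_y = np.array([[-2.0, 0.0], [2.0, 1.0]])
stds_y = np.array([[1.0, 1.0], [1.0, 2.0]])
mean, cov = mixture_mean_cov(w, means_y, stds_y)
```

Note that the result has off-diagonal entries whenever more than one component carries weight, even though each component covariance is diagonal.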
-
covariances_y
= None Some eigenvalues of the sampled covariance matrices can be exactly zero, so they are mapped to the positive semi-definite subspace
-
get_params
(deep=True) Get parameters for this estimator.
- Parameters
deep (boolean, optional) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns
params – Parameter names mapped to their values.
- Return type
mapping of string to any
-
joint_pdf
(X, Y) Joint probability density function P(X, Y)
- Parameters
X – variable X for the distribution P(X, Y), array_like, shape:(n_samples, ndim_x)
Y – variable Y for the distribution P(X, Y), array_like, shape:(n_samples, ndim_y)
- Returns
the joint distribution of X and Y, with shape (n_samples,)
-
kurtosis
(x_cond, n_samples=1000000) Kurtosis of the fitted distribution conditioned on x_cond
- Parameters
x_cond – different x values to condition on - numpy array of shape (n_values, ndim_x)
- Returns
Kurtosis Kurt[y|x] corresponding to x_cond - numpy array of shape (n_values, ndim_y, ndim_y)
-
log_pdf
(X, Y) Conditional log-probability log p(y|x). Requires the model to be fitted.
- Parameters
X – numpy array to be conditioned on - shape: (n_samples, ndim_x)
Y – numpy array of y targets - shape: (n_samples, ndim_y)
- Returns
conditional log-probability log p(y|x) - numpy array of shape (n_samples,)
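One way to evaluate log p(y|x) for a diagonal-covariance mixture is as log p(x, y) - log p(x), using a log-sum-exp over the components so that points far from every component do not underflow. This is a sketch under that assumption; `conditional_log_pdf` and its arguments are hypothetical names, not the library's signature.

```python
import numpy as np

def conditional_log_pdf(X, Y, weights, means_x, stds_x, means_y, stds_y):
    def log_normal(Z, means, stds):
        # log N(z | mu_k, diag(std_k^2)) for every sample and component: (n, K)
        diff = (Z[:, None, :] - means[None]) / stds[None]
        return -0.5 * np.sum(diff ** 2 + np.log(2 * np.pi * stds[None] ** 2), axis=2)

    def logsumexp(a):
        # numerically stable log(sum(exp(a))) along the component axis
        m = a.max(axis=1, keepdims=True)
        return (m + np.log(np.sum(np.exp(a - m), axis=1, keepdims=True)))[:, 0]

    log_marginal = np.log(weights) + log_normal(X, means_x, stds_x)   # terms of log p(x)
    log_joint = log_marginal + log_normal(Y, means_y, stds_y)         # terms of log p(x, y)
    return logsumexp(log_joint) - logsumexp(log_marginal)

# Hypothetical parameters; the second x lies far from both components
weights = np.array([0.4, 0.6])
means_x, stds_x = np.array([[-1.0], [1.0]]), np.array([[1.0], [1.0]])
means_y, stds_y = np.array([[-2.0], [2.0]]), np.array([[1.0], [0.5]])

X = np.array([[0.0], [50.0]])
Y = np.array([[0.5], [0.5]])
log_p = conditional_log_pdf(X, Y, weights, means_x, stds_x, means_y, stds_y)
```

Taking the logarithm of a directly computed density would return -inf or nan for the second point, since both p(x, y) and p(x) underflow to zero there.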
-
mean_
(x_cond, n_samples=None) Conditional mean of the distribution
- Parameters
x_cond – different x values to condition on - numpy array of shape (n_values, ndim_x)
- Returns
Means E[y|x] corresponding to x_cond - numpy array of shape (n_values, ndim_y)
-
means
= None Sample covariance matrices and ensure that each covariance matrix is positive definite
-
pdf
(X, Y) Conditional probability density function P(Y|X).
See “Conditional Gaussian Mixture Models for Environmental Risk Mapping” [Gilardi, Bengio] for the math.
- Parameters
X – the conditioning variable for the distribution P(Y|X), array_like, shape:(n_samples, ndim_x)
Y – the variable Y conditioned on X, array_like, shape:(n_samples, ndim_y)
- Returns
the conditional distribution of Y given X for the given realizations of X, with shape (n_samples,)
-
plot
(xlim=(-5, 5), ylim=(-5, 5), resolution=100, mode='pdf', show=False, numpyfig=False) Plots the distribution specified in mode if x and y are 1-dimensional each
- Parameters
xlim – 2-tuple specifying the x axis limits
ylim – 2-tuple specifying the y axis limits
resolution – integer specifying the resolution of plot
mode – specify which distribution to plot [“pdf”, “cdf”, “joint_pdf”]
-
plot2d
(x_cond=[0, 1, 2], ylim=(-8, 8), resolution=100, mode='pdf', show=True, prefix='', numpyfig=False) Generates a 3d surface plot of the fitted conditional distribution if x and y are 1-dimensional each
- Parameters
xlim – 2-tuple specifying the x axis limits
ylim – 2-tuple specifying the y axis limits
resolution – integer specifying the resolution of plot
-
plot3d
(xlim=(-5, 5), ylim=(-8, 8), resolution=100, show=False, numpyfig=False) Generates a 3d surface plot of the fitted conditional distribution if x and y are 1-dimensional each
- Parameters
xlim – 2-tuple specifying the x axis limits
ylim – 2-tuple specifying the y axis limits
resolution – integer specifying the resolution of plot
-
set_params
(**params) Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form
<component>__<parameter>
so that it’s possible to update each component of a nested object.
- Returns
self
-
simulate
(n_samples=1000) Draws random samples from the unconditional distribution p(x,y)
- Parameters
n_samples – (int) number of samples to be drawn from the joint distribution p(x,y)
- Returns
(X,Y) - random samples drawn from p(x,y) - numpy arrays of shape (n_samples, ndim_x) and (n_samples, ndim_y)
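Sampling from a mixture is typically done in two steps: pick a component according to the mixing weights, then draw from that component's Gaussian. The following is a sketch of that idea for diagonal covariances; `simulate_joint` and the parameter values are assumptions for illustration only.

```python
import numpy as np

def simulate_joint(n_samples, weights, means_x, stds_x, means_y, stds_y, seed=None):
    rng = np.random.default_rng(seed)
    # Step 1: choose a component index for every sample
    k = rng.choice(len(weights), size=n_samples, p=weights)
    # Step 2: draw x and y from the chosen component (independent dimensions)
    X = means_x[k] + stds_x[k] * rng.standard_normal((n_samples, means_x.shape[1]))
    Y = means_y[k] + stds_y[k] * rng.standard_normal((n_samples, means_y.shape[1]))
    return X, Y

# Hypothetical mixture with ndim_x = 1 and ndim_y = 2
weights = np.array([0.3, 0.7])
means_x, stds_x = np.array([[-1.0], [1.0]]), np.array([[0.5], [1.0]])
means_y = np.array([[-2.0, 0.0], [2.0, 1.0]])
stds_y = np.array([[1.0, 1.0], [0.5, 2.0]])

X, Y = simulate_joint(20000, weights, means_x, stds_x, means_y, stds_y, seed=0)
```

Both returned arrays share the same component index per row, so each (X[i], Y[i]) pair is a draw from the joint distribution rather than two independent marginal draws.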
-
simulate_conditional
(X) Draws random samples from the conditional distribution p(y|x)
- Parameters
X – x to be conditioned on when drawing a sample from y ~ p(y|x) - numpy array of shape (n_samples, ndim_x)
- Returns
Conditional random samples y drawn from p(y|x) - numpy array of shape (n_samples, ndim_y)
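A possible way to draw y ~ p(y|x), assuming diagonal covariances over the joint (x, y) vector: compute the x-dependent component weights, pick one component per row, and sample y from it. The function below is a hypothetical sketch and does not reproduce the library's code.

```python
import numpy as np

def simulate_conditional(X, weights, means_x, stds_x, means_y, stds_y, seed=None):
    rng = np.random.default_rng(seed)
    # Unnormalised log weights; the 2*pi constant cancels in the normalisation
    diff = (X[:, None, :] - means_x[None]) / stds_x[None]
    log_w = np.log(weights) - 0.5 * np.sum(diff ** 2 + 2 * np.log(stds_x[None]), axis=2)
    w = np.exp(log_w - log_w.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)
    # Inverse-CDF sampling of one component index per row
    u = rng.random(len(X))
    k = np.minimum((u[:, None] > np.cumsum(w, axis=1)).sum(axis=1), len(weights) - 1)
    return means_y[k] + stds_y[k] * rng.standard_normal((len(X), means_y.shape[1]))

# Hypothetical parameters: x = -10 lies far on the side of the first component
weights = np.array([0.5, 0.5])
means_x, stds_x = np.array([[-1.0], [1.0]]), np.array([[1.0], [1.0]])
means_y, stds_y = np.array([[-2.0], [2.0]]), np.array([[1.0], [1.0]])

Y = simulate_conditional(np.full((5000, 1), -10.0), weights, means_x, stds_x, means_y, stds_y, seed=1)
```

In this example almost all weight falls on the first component, so the samples concentrate around its y mean of -2.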
-
skewness
(x_cond, n_samples=1000000) Skewness of the fitted distribution conditioned on x_cond
- Parameters
x_cond – different x values to condition on - numpy array of shape (n_values, ndim_x)
- Returns
Skewness Skew[y|x] corresponding to x_cond - numpy array of shape (n_values, ndim_y, ndim_y)
-
std_
(x_cond, n_samples=1000000) Standard deviation of the fitted distribution conditioned on x_cond
- Parameters
x_cond – different x values to condition on - numpy array of shape (n_values, ndim_x)
- Returns
Standard deviations sqrt(Var[y|x]) corresponding to x_cond - numpy array of shape (n_values, ndim_y)
-
tail_risk_measures
(x_cond, alpha=0.01, n_samples=1000000) Computes the Value-at-Risk (VaR) and Conditional Value-at-Risk (CVaR)
- Parameters
x_cond – different x values to condition on - numpy array of shape (n_values, ndim_x)
alpha – quantile level of the distribution, given as a fraction
n_samples – number of samples for the Monte Carlo estimate
- Returns
VaR values for each x to condition on - numpy array of shape (n_values)
CVaR values for each x to condition on - numpy array of shape (n_values)
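Given Monte Carlo draws of y for each conditioning value, both risk measures can be estimated from the empirical lower tail. The sketch below assumes the convention that VaR is the alpha-quantile of y and CVaR is the mean of the draws at or below it; that convention and the helper name `tail_risk_from_samples` are assumptions, and standard normal draws stand in for samples from the fitted model.

```python
import numpy as np

def tail_risk_from_samples(y_samples, alpha=0.01):
    """y_samples: (n_values, n_samples) draws of y for each conditioning x."""
    # VaR: empirical alpha-quantile per conditioning value
    var = np.quantile(y_samples, alpha, axis=1)
    # CVaR: average of the draws that fall at or below the VaR
    below = y_samples <= var[:, None]
    cvar = np.sum(y_samples * below, axis=1) / np.sum(below, axis=1)
    return var, cvar

# Stand-in samples: standard deviation 1 in the first row, 2 in the second
rng = np.random.default_rng(0)
y_samples = rng.standard_normal((2, 200_000)) * np.array([[1.0], [2.0]])
var, cvar = tail_risk_from_samples(y_samples, alpha=0.01)
```

For a standard normal the 1% quantile is about -2.33 and the corresponding expected shortfall about -2.67, so the first row gives a quick plausibility check of the estimator.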
-
value_at_risk
(x_cond, alpha=0.01, n_samples=1000000) Computes the Value-at-Risk (VaR) of the fitted distribution. Only available if ndim_y = 1.
- Parameters
x_cond – different x values to condition on - numpy array of shape (n_values, ndim_x)
alpha – quantile level of the distribution, given as a fraction
n_samples – number of samples for the Monte Carlo estimate
- Returns
VaR values for each x to condition on - numpy array of shape (n_values)