Basic Examples
Calculate the KullbackLeiblerDivergence between two NormalDistribution objects:
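A sketch of what this input might look like, assuming the function is obtained from the Wolfram Function Repository via ResourceFunction (the closed form for two normals is Log[σ2/σ1] + (σ1^2 + (μ1 - μ2)^2)/(2 σ2^2) - 1/2):

```mathematica
ResourceFunction["KullbackLeiblerDivergence"][
 NormalDistribution[0, 1], NormalDistribution[1, 2]]
```

For these parameters the closed form evaluates to Log[2] - 1/4.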
KullbackLeiblerDivergence is not symmetric:
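Swapping the arguments gives a different value; for the normals above, the reversed divergence is analytically 2 - Log[2] (a sketch, assuming Function Repository access):

```mathematica
ResourceFunction["KullbackLeiblerDivergence"][
 NormalDistribution[1, 2], NormalDistribution[0, 1]]
```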
KullbackLeiblerDivergence works with symbolic distributions:
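With symbolic parameters, the result should reduce to the closed form Log[σ2/σ1] + (σ1^2 + (μ1 - μ2)^2)/(2 σ2^2) - 1/2; a sketch (the Assumptions option shown here is described under Options below, and its use in this cell is an assumption):

```mathematica
ResourceFunction["KullbackLeiblerDivergence"][
 NormalDistribution[μ1, σ1], NormalDistribution[μ2, σ2],
 Assumptions -> σ1 > 0 && σ2 > 0]
```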
The distributions do not have to be from the same family:
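For example, the divergence from a uniform to an exponential distribution; analytically this is Integrate[Log[1/PDF[ExponentialDistribution[1], x]], {x, 0, 1}] = 1/2 (a sketch, assuming Function Repository access):

```mathematica
ResourceFunction["KullbackLeiblerDivergence"][
 UniformDistribution[{0, 1}], ExponentialDistribution[1]]
```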
It also works with discrete distributions:
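For two Poisson distributions the divergence has the closed form λ1 Log[λ1/λ2] + λ2 - λ1; a sketch of such an input:

```mathematica
ResourceFunction["KullbackLeiblerDivergence"][
 PoissonDistribution[2], PoissonDistribution[3]]
```

With these rates the closed form gives 1 - 2 Log[3/2].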
Use a custom defined ProbabilityDistribution:
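One way this example might look, using a custom triangular density on the unit interval; analytically the divergence to the uniform distribution is Integrate[2 x Log[2 x], {x, 0, 1}] = Log[2] - 1/2:

```mathematica
d = ProbabilityDistribution[2 x, {x, 0, 1}];
ResourceFunction["KullbackLeiblerDivergence"][d, UniformDistribution[{0, 1}]]
```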
Scope
KullbackLeiblerDivergence works for multivariate distributions:
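A sketch with two bivariate normals; by the multivariate-normal closed form (1/2)(Tr[Inverse[Σ2].Σ1] + (μ2 - μ1).Inverse[Σ2].(μ2 - μ1) - k + Log[Det[Σ2]/Det[Σ1]]) this example evaluates to Log[2]:

```mathematica
ResourceFunction["KullbackLeiblerDivergence"][
 MultinormalDistribution[{0, 0}, IdentityMatrix[2]],
 MultinormalDistribution[{1, 1}, {{2, 0}, {0, 2}}]]
```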
KullbackLeiblerDivergence also works with EmpiricalDistribution:
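A sketch comparing an empirical distribution of samples to the distribution they were drawn from; the value should be small and shrink as the sample size grows:

```mathematica
SeedRandom[1234];
data = RandomVariate[BinomialDistribution[10, 0.3], 500];
ResourceFunction["KullbackLeiblerDivergence"][
 EmpiricalDistribution[data], BinomialDistribution[10, 0.3]]
```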
Options
Method
Symbolic evaluation of the divergence is infeasible for some distributions:
Use NExpectation instead:
Supply extra options to NExpectation:
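One plausible shape for these two inputs; the exact Method syntax and how sub-options are forwarded to NExpectation are assumptions here, not confirmed behavior:

```mathematica
ResourceFunction["KullbackLeiblerDivergence"][
 WeibullDistribution[2, 1], LogNormalDistribution[0, 1],
 Method -> NExpectation]

(* passing a precision option through to NExpectation *)
ResourceFunction["KullbackLeiblerDivergence"][
 WeibullDistribution[2, 1], LogNormalDistribution[0, 1],
 Method -> NExpectation, AccuracyGoal -> 10]
```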
NExpectation is designed for distributions with numeric parameters:
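With a symbolic parameter, numerical quadrature cannot proceed; a sketch of an input that would be expected to generate NExpectation messages rather than a value:

```mathematica
(* μ is symbolic, so Method -> NExpectation has nothing numeric to integrate *)
ResourceFunction["KullbackLeiblerDivergence"][
 NormalDistribution[μ, 1], NormalDistribution[0, 1],
 Method -> NExpectation]
```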
Assumptions
Without Assumptions, a message is issued and the result carries unresolved conditions:
With assumptions specified, a result valid under those conditions is returned:
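A sketch of the assumed-parameters case; for two exponential distributions the divergence has the closed form Log[λ1/λ2] + λ2/λ1 - 1:

```mathematica
ResourceFunction["KullbackLeiblerDivergence"][
 ExponentialDistribution[λ1], ExponentialDistribution[λ2],
 Assumptions -> λ1 > 0 && λ2 > 0]
```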
Applications
If X and Y are two random variables with a joint distribution 𝒟, then the mutual information between them is defined as the Kullback–Leibler divergence from 𝒟 to the product distribution of the marginals.
As an example, calculate the mutual information of the components of a BinormalDistribution:
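The marginals of BinormalDistribution[ρ] are both standard normal, so the mutual information is the divergence from the joint distribution to their product; analytically it equals -Log[1 - ρ^2]/2. A sketch with ρ = 1/2, where this gives Log[4/3]/2:

```mathematica
ρ = 1/2;
ResourceFunction["KullbackLeiblerDivergence"][
 BinormalDistribution[ρ],
 ProductDistribution[NormalDistribution[0, 1], NormalDistribution[0, 1]]]
```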
The Kullback–Leibler divergence can be used to fit distributions to data, and the minimized divergence provides a measure of the quality of the fit, closely paralleling maximum likelihood estimation. First generate some samples from a discrete distribution:
Calculate the divergence from the EmpiricalDistribution to a symbolic target distribution:
Minimize with respect to μ:
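The three steps above might be sketched as follows, assuming the divergence from the empirical distribution to PoissonDistribution[μ] evaluates symbolically in μ; minimizing the divergence is equivalent to maximum likelihood here, so the minimizer should land near the sample mean:

```mathematica
SeedRandom[42];
data = RandomVariate[PoissonDistribution[5], 1000];
div = ResourceFunction["KullbackLeiblerDivergence"][
   EmpiricalDistribution[data], PoissonDistribution[μ],
   Assumptions -> μ > 0];
NMinimize[{div, μ > 0}, μ]
```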
Try a different distribution to compare:
The minimized divergence is larger, indicating this distribution is a worse approximation to the data:
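A sketch of the comparison fit, assuming the same sample; a geometric distribution matches Poisson-distributed data poorly, so the minimized divergence should exceed the Poisson fit's:

```mathematica
SeedRandom[42];
data = RandomVariate[PoissonDistribution[5], 1000];
div = ResourceFunction["KullbackLeiblerDivergence"][
   EmpiricalDistribution[data], GeometricDistribution[p],
   Assumptions -> 0 < p < 1];
NMinimize[{div, 0 < p < 1}, p]
```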
Note that the reverse divergences are Undefined: PoissonDistribution and GeometricDistribution have infinite support, which is wider than the finite support of the EmpiricalDistribution:
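A sketch of one such reverse divergence; the first distribution is supported on all nonnegative integers while the empirical distribution is supported only on the observed values:

```mathematica
data = RandomVariate[PoissonDistribution[5], 1000];
ResourceFunction["KullbackLeiblerDivergence"][
 PoissonDistribution[5], EmpiricalDistribution[data]]
```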
Also note that this does not work for continuous data because the support of EmpiricalDistribution is discrete:
Use a KernelMixtureDistribution instead for continuous data:
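One plausible form of this example; a Gaussian kernel mixture has continuous, unbounded support, so the divergence to a normal target is well defined (the use of Method -> NExpectation here, since the kernel-mixture expectation is unlikely to evaluate symbolically, is an assumption):

```mathematica
cdata = RandomVariate[NormalDistribution[0, 1], 500];
ResourceFunction["KullbackLeiblerDivergence"][
 KernelMixtureDistribution[cdata], NormalDistribution[0, 1],
 Method -> NExpectation]
```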
Properties and Relations
The divergence from a distribution to itself is zero:
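A sketch: the log density ratio vanishes identically, so the result is 0:

```mathematica
ResourceFunction["KullbackLeiblerDivergence"][
 NormalDistribution[2, 3], NormalDistribution[2, 3]]
```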
Possible Issues
The dimensions of the distributions have to match:
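A sketch of a mismatched input, pairing a bivariate distribution with a univariate one; the precise message and return value are not confirmed here, but a valid divergence cannot be produced:

```mathematica
ResourceFunction["KullbackLeiblerDivergence"][
 MultinormalDistribution[{0, 0}, IdentityMatrix[2]],
 NormalDistribution[0, 1]]
```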
Matrix distributions are currently not supported:
The divergence is Undefined if the first distribution has a wider support than the second:
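For example, a normal distribution is supported on all reals, which is wider than the unit interval, so this sketch should give Undefined:

```mathematica
ResourceFunction["KullbackLeiblerDivergence"][
 NormalDistribution[0, 1], UniformDistribution[{0, 1}]]
```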
KullbackLeiblerDivergence is undefined between discrete and continuous distributions:
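A sketch mixing a discrete and a continuous distribution; no density ratio exists between them, so the divergence should be Undefined even though the means and variances match:

```mathematica
ResourceFunction["KullbackLeiblerDivergence"][
 PoissonDistribution[2], NormalDistribution[2, Sqrt[2]]]
```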
For some symbolic distributions the expectation cannot be evaluated:
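Which distribution pairs resist symbolic evaluation depends on the Expectation machinery and can vary by version; one candidate sketch, where the result may come back as an unevaluated expectation (per the Options section, Method -> NExpectation is the workaround):

```mathematica
ResourceFunction["KullbackLeiblerDivergence"][
 StudentTDistribution[3], LogisticDistribution[0, 1]]
```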