Function Repository Resource:

LatinHypercubeSample

Sample a product distribution pseudorandomly in a way that guarantees good coverage of all marginals

Contributed by: Sjoerd Smit

ResourceFunction["LatinHypercubeSample"][dim,n]

generates n hypercube samples from ProductDistribution[{UniformDistribution[],dim}].

ResourceFunction["LatinHypercubeSample"][dist,n]

generates n samples from dist that cover the probability space of the distribution equally.

ResourceFunction["LatinHypercubeSample"][{dist₁,dist₂,…},n]

generates n hypercube samples from ProductDistribution[dist₁,dist₂,…].

ResourceFunction["LatinHypercubeSample"][spec, n, m]

repeats the sampling m times, generating n×m samples.

Details

Latin hypercube sampling (LHS) is a technique for sampling a multidimensional distribution that assumes the features to be sampled are independent of each other. For each feature, it divides the marginal probability space into n intervals of equal probability and samples each interval once. The sampled values are then combined at random into tuples. For example, for a 2D distribution, 5 samples could be picked according to the following grid, where each black square represents a sampling region. No two squares share a row or column, and every row and every column contains exactly one square:

[Figure: a 5×5 grid with five black squares, one in each row and one in each column.]
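The stratify-then-shuffle procedure described above can be sketched in a few lines. This is a minimal illustration in plain Python, not the resource function's actual implementation; the name `latin_hypercube_unit` and its `rng` parameter are hypothetical and chosen only for this example.

```python
import random


def latin_hypercube_unit(dim, n, rng=None):
    """Return n points in the unit hypercube, one per stratum along every axis."""
    rng = rng or random.Random()
    columns = []
    for _ in range(dim):
        # One uniform draw inside each of the n strata [k/n, (k+1)/n).
        column = [(k + rng.random()) / n for k in range(n)]
        # Shuffle so that strata are paired at random across coordinates.
        rng.shuffle(column)
        columns.append(column)
    # Transpose the per-coordinate columns into n tuples of length dim.
    return [tuple(col[i] for col in columns) for i in range(n)]


points = latin_hypercube_unit(dim=2, n=5, rng=random.Random(0))
# Grid cell (row, column) occupied by each point, as in the 5×5 picture.
cells = [tuple(int(x * 5) for x in p) for p in points]
print(cells)
```

Each axis index 0 through 4 appears exactly once among the printed cells, which is the "one square per row and per column" property.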

When sampling from distributions, each distribution should be a univariate distribution that can be sampled and for which the InverseCDF can be computed.
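The need for an inverse CDF suggests how non-uniform marginals can be handled: stratify in probability space, then map each value through the quantile function. The sketch below assumes that approach; it is not taken from the resource function's source, and the helpers `stratified_uniform`, `lhs_from_inverse_cdfs`, `normal_quantile`, and `exponential_quantile` are hypothetical names used only here.

```python
import math
import random
from statistics import NormalDist


def stratified_uniform(n, rng):
    """One uniform draw from each of n equal-probability strata, in random order."""
    u = [(k + rng.random()) / n for k in range(n)]
    rng.shuffle(u)
    return u


def lhs_from_inverse_cdfs(inverse_cdfs, n, rng=None):
    """Map stratified uniforms through one inverse CDF per coordinate."""
    rng = rng or random.Random()
    columns = [[q(u) for u in stratified_uniform(n, rng)] for q in inverse_cdfs]
    return list(zip(*columns))


def normal_quantile(u, eps=1e-12):
    # Clamp away from 0 and 1, where the normal quantile is infinite.
    return NormalDist().inv_cdf(min(max(u, eps), 1 - eps))


def exponential_quantile(u):
    # Inverse CDF of the rate-1 exponential distribution.
    return -math.log1p(-u)


samples = lhs_from_inverse_cdfs(
    [normal_quantile, exponential_quantile], n=10, rng=random.Random(0)
)
```

Applying each marginal's CDF to the resulting samples recovers values that fall in ten distinct deciles, which is the sense in which the samples are equally distributed in probability space.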

Latin hypercube sampling is a popular technique in Design of Experiments (DOE) because it explores the edges and corners of high-dimensional feature spaces more efficiently than completely random sampling (Monte Carlo).
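One simple way to see the coverage difference is to count how many of the n equal-width intervals along a single axis receive at least one sample. The sketch below checks only this marginal coverage, not the edges-and-corners behavior in high dimensions; `occupied_strata` is a hypothetical helper written for this illustration.

```python
import random


def occupied_strata(values, n):
    """Count how many of the n intervals [k/n, (k+1)/n) contain at least one value."""
    return len({min(int(v * n), n - 1) for v in values})


rng = random.Random(1)
n = 20

# Stratified draw: exactly one value per interval, by construction.
lhs = [(k + rng.random()) / n for k in range(n)]
rng.shuffle(lhs)

# Plain Monte Carlo draw: intervals can be hit several times or not at all.
mc = [rng.random() for _ in range(n)]

print(occupied_strata(lhs, n), "of", n, "intervals hit by the stratified draw")
print(occupied_strata(mc, n), "of", n, "intervals hit by the plain random draw")
```

The stratified draw always hits all n intervals. For independent uniform draws, the expected number of occupied intervals is n(1 − (1 − 1/n)^n), which is about 13 of 20 here.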

Examples

Basic Examples (2)

Generate 10 random numbers between 0 and 1 that are roughly equally spaced:

In[1]:=

Out[1]=

In[2]:=

Out[2]=

With 10 samples, each of the ten intervals [0, 0.1), [0.1, 0.2), …, [0.9, 1) receives exactly one sample, so every first decimal digit from 0 to 9 occurs exactly once:

In[3]:=

Out[3]=

Repeat the sampling 10 times to generate 100 samples:

In[4]:=

Out[4]=

Generate pairs of uniformly distributed numbers on the unit square:

In[5]:=

Out[5]=

Scope (2)

Generate 10 samples from NormalDistribution[] that are equally distributed in probability space:

In[6]:=

Out[6]=

Demonstrate the equal spacing:

In[7]:=

Out[7]=

Due to the equal spacing of the selection regions, samples tend to align more closely with the reference distribution in a ProbabilityPlot than completely random samples do:

In[8]:=