Function Repository Resource:

# MonteCarloSample

Sample from a probability density function using the Markov chain Monte Carlo (MCMC) method

Contributed by: Brian Lu
 ResourceFunction["MonteCarloSample"][f,p,n] generates n samples from probability density function f starting from point p. ResourceFunction["MonteCarloSample"][f,{p1,…,pm},n] generates samples starting from an ensemble consisting of m points. ResourceFunction["MonteCarloSample"][f,{dist,m},n] uses probability distribution dist as the prior and samples m points from dist as the initial points. ResourceFunction["MonteCarloSample"][f,p,{n,burn}] generates samples but discards the first burn samples. ResourceFunction["MonteCarloSample"][f,p,{n,burn,i}] generates samples and takes every ith sample.

## Details and Options

The number of arguments that the probability function f takes should match the dimension of the initial points.
The probability function f does not need to be normalized.
When the initial points are specified as {dist,m}, the domain of the prior distribution dist should have the same dimension as the number of arguments of f and should have a probability distribution function (PDF) in symbolic closed form.
The burn specifies the number of samples to be discarded at the beginning of the chain, where thermalization has not been reached, often referred to as "burn-in". The burn can be specified as Automatic, where the length of thermalization steps are automatically determined from the list of probabilities.
The i is a positive integer step size, in which samples are taken. The i can be specified as Automatic, where it is set to be the inverse of the overall acceptance rate with an upper limit of 10.
The following options are supported:
 Method Automatic the MCMC algorithm to use "LogProbability" False whether the probability function f is specified on a log scale "TargetAcceptanceRate" 0.25 the target rate at which proposed points are accepted for certain methods “ListableFunction" False whether the probability function f is listable Parallelization False whether parallelization is used in sampling
Possible settings for Method include:
 "RandomWalk" uses a fixed probability distribution for drawing proposals "AdaptiveMetropolis" use the method of Haario et al. "EMCEE" use the method of Foreman-Mackey et al. "DifferentialEvolution" use the method of ter Braak
For the method "RandomWalk", the following additional suboption is supported:
 "JumpDistribution" the symmetrical probability distribution for drawing proposals
 "InitialScaleMultiplier" 1 the initial scale of the step relative to the default one "RandomWalkScale" 0.1 the absolute scale of steps when random walk is adopted
For the method "EMCEE", the following additional suboptions are supported:
 "StretchScale" 2 the maximum scale of the stretch moves
For the method "DifferentialEvolution", the following additional suboptions are supported:
 "InitialScaleMultiplier" 2 the initial scale of the step relative to the default one "RandomnessScale" 0.001 the relative scale of randomness of the steps

## Examples

### Basic Examples (5)

Generate 10 samples from a Student t distribution with 3 degrees of freedom:

 In[1]:=
 Out[1]=

Generate 1000 samples, discard the first 200 samples, and take every fifth sample:

 In[2]:=
 Out[2]=
 In[3]:=
 Out[3]=

Do automatic burn-in removal and thinning instead:

 In[4]:=
 Out[4]=
 In[5]:=
 Out[5]=

Sample from a 2D normal distribution starting from an ensemble consisting of 10 random points:

 In[6]:=
 Out[6]=
 In[7]:=
 Out[7]=

Specify a prior in addition to the likelihood function:

 In[8]:=
 Out[8]=
 Out[9]=
 In[10]:=
 Out[10]=
 In[11]:=
 Out[11]=

### Options (8)

#### Method (3)

The random walk method takes jumps following a normal distribution by default. A custom distribution can also be specified:

 In[12]:=
 Out[12]=

The adaptive Metropolis method takes jumps following an adaptive multinormal distribution that is derived from existing samples:

 In[13]:=
 Out[13]=
 In[14]:=
 Out[14]=

The EMCEE and differential evolution methods take jumps being the displacement between two points multiplied by a scale factor. Increasing the number of initial points may improve the performance of these two methods:

 In[15]:=
 Out[15]=
 In[16]:=
 Out[16]=

#### LogProbability (2)

When the probability function evaluates to a tiny number, machine numbers may not be able to represent them accurately and the samples do not converge:

 In[17]:=
 Out[17]=
 In[18]:=
 Out[18]=

"LogProbability"True indicates that the probability function is on a log scale, and the samples converge to x≈50:

 In[19]:=
 Out[19]=
 In[20]:=
 Out[20]=

#### TargetAcceptanceRate (1)

Setting the target acceptance rate is only effective for adaptive Metropolis or differential evolution methods. A higher acceptance rate decreases the step size and vice versa:

 In[21]:=
 Out[21]=
 In[22]:=
 Out[22]=
 In[23]:=
 Out[23]=

#### ListableFunction (1)

If the probability function is listable, specifying "ListableFunction"True and employing a large number of initial points may significantly improve the performance:

 In[24]:=
 Out[24]=
 In[25]:=
 Out[25]=

#### Parallelization (1)

Enabling parallelization is only beneficial when the probability function can be run in parallel and each evaluation takes a long time:

 In[26]:=
 Out[27]=
 In[28]:=
 Out[28]=

### Applications (8)

#### Fit a distribution (4)

Get the posterior distribution of two parameters in a Student t distribution:

 In[29]:=
 Out[29]=

Choose a uniform prior over μ[-2,2], σ[0,3]:

 In[30]:=
 Out[31]=

Run MCMC:

 In[32]:=
 Out[32]=
 In[33]:=
 Out[33]=

The location of the peak in the posterior distribution matches the result given by EstimatedDistribution:

 In[34]:=
 Out[34]=

#### Fit a sinusoidal function (4)

Get the posterior distribution of amplitude, frequency, phase, and mean of a sinusoidal function:

 In[35]:=
 Out[37]=
 In[38]:=
 Out[38]=

The likelihood is derived based on the assumption that each point has a normally distributed error. Define the likelihood, and select the initial points for MCMC around the peak of the likelihood function:

 In[39]:=
 Out[40]=

Run MCMC:

 In[41]:=
 Out[41]=

The marginal distribution of all four parameters compared to their input values:

 In[42]:=
 Out[42]=

### Properties and Relations (1)

The normalization of the probability function does not affect the result:

 In[43]:=
 Out[44]=

### Possible Issues (2)

If all of the initial points are on an hyperplane, all samples will be on the same hyperplane when the EMCEE or differential evolution method is used:

 In[45]:=
 Out[45]=
 In[46]:=
 Out[46]=
 In[47]:=
 Out[47]=

Automatic burn-in removal is done by comparing the log-posteriors with the median log-posterior. In some cases, it may keep samples where thermalization has not been reached:

 In[48]:=
 Out[48]=
 In[49]:=
 Out[49]=
 In[50]:=
 Out[50]=

To mitigate this problem, place the initial points on a wider range:

 In[51]:=
 Out[51]=
 In[52]:=
 Out[52]=

Brian Lu

## Version History

• 1.0.0 – 05 July 2022