Wolfram Research

Function Repository Resource:

BayesianLinearRegression

Source Notebook

Perform Bayesian linear regression with conjugate priors

Contributed by: Sjoerd Smit

ResourceFunction["BayesianLinearRegression"][{{var11,var12,,y1},{var21,var22,,y2},},{f1,f2,},{var1,var2,}]

performs a Bayesian linear model fit on the data using basis functions fi with independent variables vari.

ResourceFunction["BayesianLinearRegression"][{{var11,var12,}y1,{var21,var22,}y2,},]

generates the same result.

ResourceFunction["BayesianLinearRegression"][{{var11,var12,},{var21,var22,},}{y1,y2,},]

generates the same result.

Details and Options

The yi in the input data can be either scalar or vector observations.
The number of vari have to match the dimensions of the inputs xi.
ResourceFunction["BayesianLinearRegression"] inherits the IncludeConstantBasis option from DesignMatrix. The default option value is "IncludeConstantBasis"True.
By default, a weakly informative prior is assumed that is still normalizable so that the log-evidence is well-defined. Different priors can be specified with the "PriorParameters" option in the same format as the output of ResourceFunction["BayesianLinearRegression"] (see options section below).
The output of ResourceFunction["BayesianLinearRegression"] is an association containing the following items:
“PriorParameters” Values of the hyperparameters corresponding to the prior distribution of the fit coefficients.
“PosteriorParameters” Values of the hyperparameters corresponding to the posterior distribution of the fit coefficients.
“LogEvidence” The natural log of the marginal likelihood , which is the Bayesian goodness-of-fit measure of the model.
“Prior” Prior distributions.
“Posterior” Posterior distributions.
“Basis The basis functions supplied by the user.
“IndependentVariables” The independent variables supplied by the user.
“Functions” Functions that yield distributions when applied arbitrary hyperparameters. Only included when the "IncludeFunctions" True option is set.
The keys "Prior", "Posterior" and "Functions" contain the following items:
“PredictiveDistribution” Predictive distribution of the data.
“UnderlyingValueDistribution” The distribution of of the fit line of the data.
“RegressionCoefficientDistribution” Marginal distribution of the fit coefficients.
“ErrorDistribution” Marginal distribution of the variance (for 1D regression) or covariance matrix.
The implementation of the formulas is based on the Wikipedia article on multivariate Bayesian linear regression (see link below). Univariate regression (i.e., when the yi are scalars or 1D vectors) is treated as a special case of multivariate regression using the lower-dimensional equivalents of the multivariate and matrix distributions.

Examples

Basic Examples

Generate a Bayesian linear regression on random data:

In[1]:=
ResourceFunction["BayesianLinearRegression"][
 RandomReal[1, {2, 2}], \[FormalX], \[FormalX]]
Out[1]=

Generate test data:

In[2]:=
data = RandomVariate[
   MultinormalDistribution[( {
      {1, 0.7},
      {0.7, 1}
     } )],
   20
   ];
ListPlot[data]
Out[3]=

Fit the data with a 1st order polynomial:

In[4]:=
(linearModel = ResourceFunction["BayesianLinearRegression"][
    data, \[FormalX], \[FormalX]]) // Keys
Out[4]=

Show the predictive distribution of the model and the distribution of the fit line (dashed):

In[5]:=
Show[
 Plot[Evaluate@
   InverseCDF[
    linearModel["Posterior", "PredictiveDistribution"], {0.95, 0.5, 0.05}], {\[FormalX], -3, 3}, Filling -> {1 -> {2}, 3 -> {2}}, PlotLegends -> Table[Quantity[i, "Percent"], {i, {95, 50, 5}}]],
 Plot[Evaluate@
   InverseCDF[
    linearModel["Posterior", "UnderlyingValueDistribution"], {0.95, 0.5, 0.05}], {\[FormalX], -3, 3}, Filling -> {1 -> {2}, 3 -> {2}}, PlotStyle -> Dashed],
 ListPlot[data, PlotStyle -> Black],
 PlotRange -> All
 ]
Out[5]=

Plot the joint distribution of the coefficients a and b in the regression equation y==ax+b:

In[6]:=
With[{
  coefficientDist = linearModel["Posterior", "RegressionCoefficientDistribution"]
  },
 ContourPlot[
  Evaluate[PDF[coefficientDist, {a, b}]],
  {a, -1, 1},
  {b, 0, 2},
  PlotRange -> {0, All}, PlotPoints -> 20, FrameLabel -> {"a", "b"}, ImageSize -> 400,
  PlotLabel -> "Posterior PDF of regression coefficients"
  ]
 ]
Out[6]=

Plot the pdf of the posterior variance of the residuals:

In[7]:=
With[{
  errDist = linearModel["Posterior", "ErrorDistribution"]
  },
 Quiet@Plot[
   Evaluate[PDF[errDist, e]],
   {e, 0, 1},
   PlotRange -> {0, All}, AxesLabel -> {"\!\(\*SuperscriptBox[\(\[Sigma]\), \(2\)]\)", "Density"}, ImageSize -> 400,
   Filling -> Axis,
   PlotLabel -> "Posterior PDF of residual variance"
   ]
 ]
Out[7]=

Fit the data with a polynomial of arbitrary degree and compare the prediction bands and log-evidence of fits up to degree 4:

In[8]:=
Table[
 Module[{
   model = ResourceFunction["BayesianLinearRegression"][
     Rule @@@ data, \[FormalX]^Range[0, degree], \[FormalX]],
   predictiveDist
   },
  predictiveDist = model["Posterior", "PredictiveDistribution"];
  Show[
   Plot[Evaluate@
     InverseCDF[predictiveDist, {0.95, 0.5, 0.05}], {\[FormalX], -3, 3}, Filling -> {1 -> {2}, 3 -> {2}}, PlotLegends -> Table[Quantity[i, "Percent"], {i, {95, 50, 5}}]],
   ListPlot[data, PlotStyle -> Black],
   PlotRange -> All,
   PlotLabel -> StringForm["Degree: `1`\nLog evidence: `2`", degree, model["LogEvidence"] ]
   ]
  ],
 {degree, 0, 4}
 ]
Out[8]=

Scope

BayesianLinearRegression can do multivariate regression. For example, generate some 3D data:

In[9]:=
data = RandomVariate[
   MultinormalDistribution[{0, 0, 2}, RandomVariate[
     InverseWishartMatrixDistribution[5, IdentityMatrix[3]]]],
   50
   ];
ListPointPlot3D[data]
Out[10]=

Do a regression where the x coordinate is treated as the independent variable and y and z are both dependent on x:

In[11]:=
multiModel = ResourceFunction["BayesianLinearRegression"][
   data[[All, 1]] -> data[[All, {2, 3}]], {1, \[FormalX]}, \[FormalX]];
multiModel["Posterior"]
Out[12]=

The predictive joint distribution for y and z is a MultivariateTDistribution:

In[13]:=
multiRegressionDist = multiModel["Posterior", "PredictiveDistribution"]
Out[13]=

Visualize the joint distribution as an animation:

In[14]:=
ListAnimate[
 Table[
  ContourPlot[
   Evaluate[PDF[multiRegressionDist, {y, z}]],
   {y, -3, 3},
   {z, -3, 3},
   PlotRange -> All,
   FrameLabel -> {"y", "z"},
   PlotLabel -> StringForm["x = `1`", \[FormalX]],
   PlotPoints -> 20
   ],
  {\[FormalX], -1.5, 1.5, 0.1}
  ],
 AnimationRunning -> False
 ]
Out[14]=

Options

IncludeConstantBasis

Just like LinearModelFit, BayesianLinearRegression constructs a model with bias term by default:

In[15]:=
data = Range[-2, 2] -> Range[-2, 2] + 1 + RandomVariate[NormalDistribution[], 5];
model = ResourceFunction["BayesianLinearRegression"][
   data, \[FormalX], \[FormalX]];
Show[
 Plot[
  Evaluate[
   InverseCDF[
    model["Posterior", "PredictiveDistribution"], {0.95, 0.5, 0.05}]],
  {\[FormalX], -3, 3},
  Filling -> {1 -> {2}, 3 -> {2}}
  ],
 ListPlot[Transpose[List @@ data]]
 ]
Out[16]=

Force the intercept to be zero:

In[17]:=
model2 = ResourceFunction["BayesianLinearRegression"][
   data, \[FormalX], \[FormalX], IncludeConstantBasis -> False];
Show[
 Plot[
  Evaluate[
   InverseCDF[
    model2["Posterior", "PredictiveDistribution"], {0.95, 0.5, 0.05}]],
  {\[FormalX], -3, 3},
  Filling -> {1 -> {2}, 3 -> {2}}
  ],
 ListPlot[Transpose[List @@ data]]
 ]
Out[18]=

“PriorParameters”

The prior can be specified in the same format as the parameter outputs of the Bayesian linear regression. This can be used to update a model with new observations. To illustrate this, generate some test data and divide the dataset into 2 parts:

In[19]:=
data = TakeDrop[ SortBy[First]@RandomVariate[MultinormalDistribution[{0, 1}, ( {
        {1, -1.3},
        {-1.3, 2}
       } )], 20], 10];
plot = ListPlot[data]
Out[20]=

Fit the first half of the data:

In[21]:=
model1 = ResourceFunction["BayesianLinearRegression"][
   Rule @@@ data[[1]], \[FormalX], \[FormalX]];
Show[
 Plot[Evaluate@InverseCDF[
    model1["Posterior", "PredictiveDistribution"],
    {0.95, 0.5, 0.05}
    ], {\[FormalX], -2, 2}, Filling -> {1 -> {2}, 3 -> {2}}],
 plot
 ]
Out[22]=

Use the posterior parameters of model1 as the prior to update the model with the second half of the dataset:

In[23]:=
model2 = ResourceFunction["BayesianLinearRegression"][
   Rule @@@ data[[2]], \[FormalX], \[FormalX], "PriorParameters" -> model1["PosteriorParameters"]];
Show[
 Plot[Evaluate@InverseCDF[
    model2["Posterior", "PredictiveDistribution"],
    {0.95, 0.5, 0.05}
    ], {\[FormalX], -2, 2}, Filling -> {1 -> {2}, 3 -> {2}}],
 plot
 ]
Out[24]=

The result is exactly the same as when you would fit all of the data in a single step:

In[25]:=
model3 = ResourceFunction["BayesianLinearRegression"][
   Rule @@@ (Join @@ data), \[FormalX], \[FormalX]];
Column[{
  model3["PosteriorParameters"],
  model2["PosteriorParameters"]
  }]
Out[26]=

“IncludeFunctions”

When "IncludeFunctions"True, the returned association contains the key "Functions". This key holds the functions that can be used to generate distributions from arbitrary values of hyperparameters:

In[27]:=
data = Range[-2, 2] -> Range[-2, 2] + RandomVariate[NormalDistribution[], 5];
model = ResourceFunction["BayesianLinearRegression"][
   data, \[FormalX], \[FormalX], "IncludeFunctions" -> True];
Keys@model["Functions"]
Out[28]=

The posterior distribution is obtained from the predictive distribution function:

In[29]:=
model["Functions", "PredictiveDistribution"]
Out[29]=

Apply it to the posterior parameters:

In[30]:=
model["Functions", "PredictiveDistribution"]@
  model["PosteriorParameters"] === model["Posterior", "PredictiveDistribution"]
Out[30]=

Applications

Here is a dataset where it is unclear if the fit should be first or second order:

In[31]:=
data = CompressedData["
1:eJxTTMoPSmViYGAQAWIQ/TL6jYAgy4/99sZiC1Rsv+3XnL116usNX/cHqFW4
uwkyH7i2uMCW6/p3e4WY/kNfNc7Y7+jILi1r/mCff1J5/er7T+2dA5d/6fV4
sH/Vl/b9O9sf7dcCq/u3/75eyCe/sywHgBygyE/7R1Ui69wffrIP1uqeubGL
4YDp4/okj4t/9xfrdt6WkfxoPzPsDyOP1wP7wnnnv3LH3dm/+7Qd78pHl/Zz
6D3a1JPxYv8TLuNZdt+e738ANofB4QfI2K837MGUxh37jDQQeGYvN6etwP3+
l/1x/wtWe+a92L9/sfCXgs8f90+c5maZmXx6f1/Mld2/8p/YP3z+9M8az1v2
ITvkWl8HXoCa+8je4Ed/tODVv/u3u0RN3m/5a3/b5H1nwne/38+b1rFml+Gr
/TD7GK6DAmb7fgBLvrFo
"];
plot = ListPlot[data]
Out[32]=

Fit the data with polynomials up to fourth degree, rank the log-evidences and inspect them:

In[33]:=
models = AssociationMap[
   ResourceFunction["BayesianLinearRegression"][
     Rule @@@ data, \[FormalX]^Range[0, #], \[FormalX]] &,
   Range[0, 4]
   ];
ReverseSort@models[[All, "LogEvidence"]]
Out[34]=

Calculate the weights for each fit:

In[35]:=
weights = Normalize[
   (* subtract the max to reduce rounding error *) Exp[models[[All, "LogEvidence"]] - Max[models[[All, "LogEvidence"]]]],
   Total
   ];
ReverseSort[weights]
Out[36]=

Because the weights for the first and second order fits are so close, the correct Bayesian treatment is to include both in the final regression. Define a mixture of posterior predictive distributions:

In[37]:=
mixDist = MixtureDistribution[
   Values[weights],
   Values@models[[All, "Posterior", "PredictiveDistribution"]]
   ];
Short /@ mixDist
Out[38]=

Compare the mixture fit to the individual polynomial regressions:

In[39]:=
TabView[
 KeyValueMap[
  #1 -> Show[
     Plot[
      Evaluate@InverseCDF[#2, {0.95, 0.5, 0.05}],
      {\[FormalX], -5, 5},
      Filling -> {1 -> {2}, 3 -> {2}}, PlotLegends -> Table[Quantity[i, "Percent"], {i, {95, 50, 5}}],
      PlotLabel -> Replace[
        Lookup[weights, #1, None],
        n_?NumericQ :> StringForm["Weight: `1`", n]
        ]
      ],
     plot
     ] &,
  Prepend[
   models[[All, "Posterior", "PredictiveDistribution"]],
   "Mixture" -> mixDist
   ]
  ]
 ]
Out[39]=

Neat Examples

Symbolically calculate the predictive posterior after a single observation for any prior:

In[40]:=
model = ResourceFunction[
   "BayesianLinearRegression"][{xobs -> yobs}, \[FormalX], \[FormalX],
    "PriorParameters" -> <|"B" -> {b1, b2}, "Lambda" -> {{l11^2, r l11 l22}, {r l11 l22, l22}}, "V" -> v, "Nu" -> nu|>];
dist[{xobs_, yobs_}, {b1_, b2_}, {l11_, l22_, r_}, v_, nu_] = Assuming[
  v > 0 && nu > 0 && {xobs, yobs, \[FormalX], b1, b2, l11, l22} \[Element] Reals && -1 < r < 1,
  FullSimplify[model["Posterior", "PredictiveDistribution"]]
  ]
Out[41]=

Visualize the prediction bands:

In[42]:=
Manipulate[
 Plot[
  Evaluate@Thread@Tooltip[
     InverseCDF[
      dist[pt, b, {Sqrt[l11], Sqrt[l22], r}, v, nu], {0.95, 0.5, 0.05}],
     Table[Quantity[i, "Percent"], {i, {95, 50, 5}}]
     ],
  {\[FormalX], -10, 10},
  PlotRange -> {-20, 20},
  ImageSize -> Large,
  Filling -> {1 -> {2}, 3 -> {2}}
  ],
 {{pt, {0, 0}}, {-10, -10}, {10, 10}, ControlType -> Locator},
 {{b, {0, 0}}, {-10, -10}, {10, 10}},
 {{l11, 0.01}, 0.0001, 0.1},
 {{l22, 0.01}, 0.0001, 0.1},
 {{r, 0}, -1, 1},
 {{v, 0.1}, 0.0001, 1},
 {{nu, 0.1}, 0.0001, 1},
 SaveDefinitions -> True
 ]
Out[42]=

Resource History

Source Metadata

See Also

License Information