Function Repository Resource:

BiPlot


Visualize the principal components of tabular data

Contributed by: Jon McLoone

ResourceFunction["BiPlot"][data]

generates a ListPlot of the first two principal components of data together with an arrow for each column to indicate the relationship with the principal components.

ResourceFunction["BiPlot"][data, {a,b}]

generates a ListPlot of principal components a and b of data, together with an arrow for each column to indicate its relationship with those two components.

ResourceFunction["BiPlot"][data, {a,b,c}]

generates a ListPointPlot3D of principal components a, b and c of data, together with a 3D arrow for each column to indicate its relationship with those three components.

Details and Options

Standard options for ListPlot and ListPointPlot3D are supported, in addition to the following:
"ColumnNames"      names of data columns, used to label arrows in the plot
"LabelPosition"    positioning of arrow labels as a proportion of arrow length
"ArrowStyle"       graphics primitives to be applied to the arrows in the plot
ResourceFunction["BiPlot"] uses PrincipalComponents internally to compute principal components.
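
As a rough illustration of what such a plot contains, the following sketch builds a biplot by hand. It is not the function's actual implementation: it assumes a covariance-based principal component analysis, computes the scores directly from the eigenvectors of the covariance matrix so that points and arrows share the same sign convention, and uses hypothetical names throughout (centered, loadings, scores, arrows, scale). The scores should agree with PrincipalComponents[data] up to the sign of each column.

```wolfram
(* Iris measurements as a 150 x 4 numeric matrix, as in the examples *)
data = N[First /@ ExampleData[{"MachineLearning", "FisherIris"}, "Data"]];

(* Center each column; the component directions are eigenvectors of the covariance matrix *)
centered = (# - Mean[data]) & /@ data;
loadings = Eigenvectors[Covariance[data]];   (* row k = unit direction of component k *)

(* Scores: every observation expressed in principal-component coordinates *)
scores = centered . Transpose[loadings];

(* One 2D arrow per original column: its weights on components 1 and 2 *)
arrows = Transpose[loadings[[{1, 2}]]];

(* Hypothetical display scaling so the short arrows are visible next to the points *)
scale = Max[Abs[scores[[All, {1, 2}]]]];

Show[
 ListPlot[scores[[All, {1, 2}]], AspectRatio -> Automatic],
 Graphics[{Red, Arrow[{{0, 0}, scale #}] & /@ arrows}],
 PlotRange -> All
]
```

Arrows that point in similar directions correspond to columns that vary together along the plotted components; the scaling factor only affects readability, not the directions.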

Examples

Basic Examples (1) 

In this data set, the arrows show that "Petal width" and "Petal length" are highly correlated, while "Sepal width" is largely independent of both:

In[1]:=
Short[data = First /@ ExampleData[{"MachineLearning", "FisherIris"}, "Data"]]
Out[1]=
In[2]:=
ResourceFunction["BiPlot"][data, "ColumnNames" -> {"Sepal length", "Sepal\nwidth", "Petal\nlength", "Petal\nwidth"}]
Out[2]=

Scope (3) 

By default the first and second principal components are used, but you can specify different choices:

In[3]:=
ResourceFunction["BiPlot"][data, {2, 3}]
Out[3]=
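
To judge which components are worth plotting, it can help to look at how much of the total variance each one carries. The sketch below is one way to do that with built-in functions; the names pcVariances and explained are illustrative only and are not part of BiPlot.

```wolfram
(* Same numeric matrix as in the examples above *)
data = N[First /@ ExampleData[{"MachineLearning", "FisherIris"}, "Data"]];

(* Column-wise variance of the principal component scores *)
pcVariances = Variance[PrincipalComponents[data]];

(* Fraction of the total variance carried by each component *)
explained = pcVariances/Total[pcVariances];

BarChart[explained,
 ChartLabels -> Range[Length[explained]],
 AxesLabel -> {"component", "share of variance"}]
```

Components with a very small share mostly show noise, so a plot of, say, components 2 and 3 is best read with these proportions in mind.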

If you specify three principal components, then the data and arrows are shown in three dimensions:

In[4]:=
ResourceFunction["BiPlot"][data, {1, 2, 3}, "ColumnNames" -> {"a", "b", "c", "d"}]
Out[4]=

If your data is a list of associations, or a Dataset built from a list of associations, then "ColumnNames" are chosen automatically and the arrows are labeled:

In[5]:=
data2 = Dataset[
  AssociationThread[{"Sepal length", "Sepal\nwidth", "Petal length", "Petal width"}, #] & /@ data]
Out[5]=
In[6]:=
ResourceFunction["BiPlot"][data2]
Out[6]=
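
The automatic labels presumably come from the association keys. As a sketch under that assumption, the same plot can be requested explicitly by separating the keys from the values yourself; rows, names and matrix are hypothetical helper names.

```wolfram
(* Rebuild the Dataset from the example so this block stands alone *)
data = First /@ ExampleData[{"MachineLearning", "FisherIris"}, "Data"];
data2 = Dataset[
   AssociationThread[{"Sepal length", "Sepal\nwidth", "Petal length", "Petal width"}, #] & /@ data];

rows = Normal[data2];        (* list of associations *)
names = Keys[First[rows]];   (* column names taken from the first row *)
matrix = Values /@ rows;     (* plain numeric matrix *)

ResourceFunction["BiPlot"][matrix, "ColumnNames" -> names]
```

This form is handy when you want to rename or shorten the labels before plotting.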

Options (3) 

"ColumnNames" can be used to place labels on the arrows:

In[7]:=
ResourceFunction["BiPlot"][data, "ColumnNames" -> {"Sepal length", "Sepal\nwidth", "Petal\nlength", "Petal\nwidth"}]
Out[7]=

"LabelPosition" can be used to position labels along or beyond the arrows:

In[8]:=
ResourceFunction["BiPlot"][data, "ColumnNames" -> {"Sepal length", "Sepal\nwidth", "Petal\nlength", "Petal\nwidth"}, "LabelPosition" -> 0.8]
Out[8]=
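
Because the position is a proportion of the arrow length, values below 1 place the label on the arrow and values above 1 place it beyond the tip. The following sketch illustrates that arithmetic for a single arrow, assuming the arrow starts at the origin; tip and labelPoint are hypothetical names, not part of BiPlot.

```wolfram
tip = {3., 4.};            (* hypothetical arrow tip; the arrow starts at the origin *)
labelPoint[p_] := p tip;   (* point at proportion p of the arrow length *)

Graphics[{
  Arrow[{{0, 0}, tip}],
  Text["p = 0.8", labelPoint[0.8]],   (* on the arrow, short of the tip *)
  Text["p = 1.1", labelPoint[1.1]]    (* just beyond the tip *)
  }, Frame -> True]
```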

"ArrowStyle" can be used to style the arrows:

In[9]:=
ResourceFunction["BiPlot"][data, "ColumnNames" -> {"Sepal length", "Sepal\nwidth", "Petal\nlength", "Petal\nwidth"}, BaseStyle -> "Text", "ArrowStyle" -> {GrayLevel[0.6], Arrowheads[0.02]}]
Out[9]=

Publisher

Jon McLoone

Requirements

Wolfram Language 11.3 (March 2018) or above

Version History

  • 2.0.0 – 15 June 2020
  • 1.0.0 – 22 March 2019
