Wolfram Language Paclet Repository
Community-contributed installable additions to the Wolfram Language
Creation of, and classification with, ensembles of classifiers
Contributed by: Anton Antonov
Functions for creating Machine Learning classifier ensembles and making classifications using averaged probabilities, votes, and thresholds.
            
To install this paclet in your Wolfram Language environment, evaluate this code:

    PacletInstall["AntonAntonov/ClassifierEnsembles"]

To load the code after installation, evaluate this code:

    Needs["AntonAntonov`ClassifierEnsembles`"]

Get training data:
| In[1]:= | ![]()  | 
Get testing data:
| In[2]:= | ![]()  | 
Summarize the training and testing data:
| In[3]:= | 
| Out[3]= | ![]()  | 
Create a classifier ensemble using Classify method names:
| In[4]:= | ![]()  | 
| Out[5]= | ![]()  | 
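Here is a minimal sketch of what an ensemble amounts to conceptually, written with built-in functions only. It is not the paclet's implementation: the three method names and the variables `methods`, `ensemble`, and `record` are illustrative assumptions.

```wolfram
(* Built-in Titanic data: a list of rules of the form features -> outcome *)
trainingData = ExampleData[{"MachineLearning", "Titanic"}, "TrainingData"];

(* Train one classifier per Classify method name and keep them in an Association *)
methods = {"LogisticRegression", "NearestNeighbors", "RandomForest"};
ensemble = AssociationMap[Classify[trainingData, Method -> #] &, methods];

(* Each member classifies the same record; these labels feed the voting step *)
record = trainingData[[1, 1]];
Map[#[record] &, ensemble]
```

The result is an Association from method name to predicted label, which is the raw material for vote counting.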
Classify a record with the ensemble:
| In[6]:= | 
| Out[6]= | 
Classify a record using classifier votes, with a threshold of 2 votes for the label "survived":
| In[7]:= | 
| Out[7]= | 
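Here is a sketch of the vote-threshold rule using built-in functions and made-up predictions. The helper `classifyByVotes` is hypothetical, and the fallback used when the threshold is not met (the most voted of the remaining labels) is an assumption; the paclet may behave differently.

```wolfram
(* Hypothetical labels predicted by five ensemble members for one record *)
predictions = {"survived", "died", "died", "survived", "died"};
votes = Counts[predictions];  (* <|"survived" -> 2, "died" -> 3|> *)

(* Return `label` if it has at least `threshold` votes;
   otherwise return the most voted of the remaining labels (assumed fallback) *)
classifyByVotes[v_Association, label_ -> threshold_] :=
  If[Lookup[v, label, 0] >= threshold,
    label,
    First[Keys[TakeLargest[KeyDrop[v, label], 1]]]];

classifyByVotes[votes, "survived" -> 2]  (* "survived" *)
classifyByVotes[votes, "survived" -> 3]  (* "died" *)
```

With a threshold of 2, "survived" is returned even though it is not the majority label, which is the point of thresholding a label of interest.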
Classify a record using the mean of the probabilities given by each classifier and a threshold for "survived":
| In[8]:= | 
| Out[8]= | 
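Here is a sketch of the mean-probability rule with made-up member probabilities. The helper `classifyByMeanProbability` is hypothetical, and the fallback to the most probable remaining label is again an assumption rather than the paclet's documented behavior.

```wolfram
(* Hypothetical class probabilities from three ensemble members for one record *)
probs = {
   <|"died" -> 0.4, "survived" -> 0.6|>,
   <|"died" -> 0.7, "survived" -> 0.3|>,
   <|"died" -> 0.1, "survived" -> 0.9|>};

(* Average the probabilities label by label: "died" is about 0.4, "survived" about 0.6 *)
meanProbs = Merge[probs, Mean];

(* Return `label` if its mean probability reaches `threshold`;
   otherwise return the most probable of the remaining labels (assumed fallback) *)
classifyByMeanProbability[p_Association, label_ -> threshold_] :=
  If[Lookup[p, label, 0] >= threshold,
    label,
    First[Keys[TakeLargest[KeyDrop[p, label], 1]]]];

classifyByMeanProbability[meanProbs, "survived" -> 0.5]  (* "survived" *)
classifyByMeanProbability[meanProbs, "survived" -> 0.7]  (* "died" *)
```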
Classify a list of records and return "survived" if it gets at least two votes:
| In[9]:= | 
| Out[9]= | 
Return "survived" if its average probability is at least 0.7:
| In[10]:= | 
| Out[10]= | 
Get the probabilities:
| In[11]:= | 
| Out[11]= | 
Get the votes:
| In[12]:= | 
| Out[12]= | 
Compute classifier measurements:
| In[13]:= | ![]()  | 
| Out[14]= | 
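Here is a sketch of how accuracy, precision, and recall follow from true and predicted labels, using built-in functions and made-up data. It treats "survived" as the positive class, which is an assumption, and it does not use the paclet's measurement function.

```wolfram
(* Hypothetical true labels and ensemble predictions for six records *)
actual    = {"survived", "survived", "died", "died", "survived", "died"};
predicted = {"survived", "died", "died", "survived", "survived", "survived"};
pairs = Transpose[{actual, predicted}];

tp = Count[pairs, {"survived", "survived"}];  (* true positives *)
fp = Count[pairs, {"died", "survived"}];      (* false positives *)
fn = Count[pairs, {"survived", "died"}];      (* false negatives *)

<|"Accuracy" -> Count[pairs, {x_, x_}]/Length[pairs],
  "Precision" -> tp/(tp + fp),
  "Recall" -> tp/(tp + fn)|>
(* <|"Accuracy" -> 1/2, "Precision" -> 1/2, "Recall" -> 2/3|> *)
```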
Compute Receiver Operating Characteristic (ROC) data over a range of thresholds for the label "survived" (using the paclet "ROCFunctions"):
| In[15]:= | ![]()  | 
| Out[8]= | ![]()  | 
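Here is a sketch of the threshold sweep behind a ROC curve, using built-in functions only. The probabilities, the true labels, and the helper `rocPoint` are illustrative assumptions; the "ROCFunctions" paclet provides its own functions for this.

```wolfram
(* Hypothetical mean ensemble probabilities of "survived" and the true labels *)
pSurvived = {0.9, 0.8, 0.35, 0.6, 0.2};
actual = {"survived", "survived", "survived", "died", "died"};

(* For one threshold: classify, then compute false and true positive rates *)
rocPoint[th_] := Module[{pred, pairs, tp, fp, fn, tn},
   pred = If[# >= th, "survived", "died"] & /@ pSurvived;
   pairs = Transpose[{actual, pred}];
   tp = Count[pairs, {"survived", "survived"}];
   fp = Count[pairs, {"died", "survived"}];
   fn = Count[pairs, {"survived", "died"}];
   tn = Count[pairs, {"died", "died"}];
   <|"Threshold" -> th, "FPR" -> fp/(fp + tn), "TPR" -> tp/(tp + fn)|>];

rocPoints = rocPoint /@ Range[0, 1, 1/4];

(* Plot TPR against FPR, ordered by FPR *)
ListLinePlot[Sort[{#["FPR"], #["TPR"]} & /@ rocPoints],
  AxesLabel -> {"FPR", "TPR"}, PlotRange -> {{0, 1}, {0, 1}}, Mesh -> All]
```

Lower thresholds label more records as "survived", raising both rates; higher thresholds do the opposite.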
EnsembleClassifier accepts the same data arguments as Classify. Here is a Dataset object for the built-in Titanic data:
| In[16]:= | 
Here we split that Dataset object into two (training and testing):
| In[17]:= | 
| Out[8]= | ![]()  | 
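Here is one way such a split could be done with built-in functions. The 75/25 ratio, the random seed, and the variable names are assumptions made for illustration.

```wolfram
(* Built-in Titanic data as a Dataset object *)
dsTitanic = ExampleData[{"Dataset", "Titanic"}];

(* Shuffle the row indices reproducibly and cut them at 75% *)
SeedRandom[1234];
indices = RandomSample[Range[Length[dsTitanic]]];
cut = Floor[0.75*Length[dsTitanic]];

dsTraining = dsTitanic[[Take[indices, cut]]];
dsTesting = dsTitanic[[Drop[indices, cut]]];

(* Row counts of the two parts *)
Length /@ {dsTraining, dsTesting}
```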
Here we make a classifier ensemble with the training dataset:
| In[18]:= | 
| Out[18]= | ![]()  | 
Here we classify the records of the testing dataset:
| In[19]:= | 
| Out[20]= | 
A confusion matrix can be computed and plotted using functions from the paclet "ROCFunctions". Here is an example:
| In[21]:= | 
| Out[21]= | ![]()  | 
Here is a flowchart that summarizes the classification process (made with Mermaid-JS):
| In[22]:= | ![]()  | 
| Out[22]= | ![]()  |