Function Repository Resource:

TrainTestSplit

Split data into training and testing sets

Contributed by: Michael Sollami

ResourceFunction["TrainTestSplit"][data]

splits data into a pair of shuffled training and testing sets.

Details and Options

ResourceFunction["TrainTestSplit"] accepts the following options:
"TrainingSetSize"    Scaled[0.8]    size of the training set
"TestSetSize"        Scaled[0.2]    size of the testing set
"Shuffle"            True           whether to shuffle the sets
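
The behavior these options describe can be approximated with built-in functions. The following is a minimal sketch, not the source of the repository function: the name splitSketch and its positional arguments are hypothetical, and rounding the scaled size to the nearest integer is an assumption.

```wolfram
(* Hypothetical helper for illustration only; not the actual TrainTestSplit source. *)
(* frac is the fraction of samples assigned to the training set. *)
splitSketch[data_List, frac_ : 0.8, shuffle_ : True] :=
 Module[{items, n},
  (* Optionally shuffle before splitting *)
  items = If[TrueQ[shuffle], RandomSample[data], data];
  (* Assumption: the training count is the rounded fraction of the data length *)
  n = Round[frac*Length[items]];
  (* TakeDrop returns {first n elements, remaining elements} *)
  TakeDrop[items, n]
  ]

splitSketch[Range[10], 0.8, False]
(* {{1, 2, 3, 4, 5, 6, 7, 8}, {9, 10}} *)
```

With shuffling enabled the same call returns a random permutation split at the same position, so the set sizes stay fixed while the contents vary between evaluations.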

Examples

Basic Examples

The default test size is 20%:

In[1]:=
ResourceFunction["TrainTestSplit"][# & /@ Range[10]]
Out[1]=
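
Because the result is a pair, it can be assigned directly to two symbols. The sketch below assumes the first element is the training set and the second the testing set, following the order given in the usage line; the symbol names data, train and test are illustrative.

```wolfram
(* Assumes the result has the form {trainingSet, testSet} *)
data = # -> EvenQ[#] & /@ Range[10];
{train, test} = ResourceFunction["TrainTestSplit"][data];

(* Inspect the size of each part *)
Length /@ {train, test}
```

With the default sizes and ten samples, the two lengths should come out as 8 and 2.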

Scope

Specify a non-default test set size as a scaled value:

In[2]:=
ResourceFunction["TrainTestSplit"][# -> EvenQ[#] & /@ Range[10], "TestSetSize" -> Scaled[0.5]]
Out[2]=

Specify a non-default test set size as an explicit number of samples:

In[3]:=
ResourceFunction["TrainTestSplit"][# -> EvenQ[#] & /@ Range[10], "TestSetSize" -> 3]
Out[3]=

Specify a non-default training set size (a real value is interpreted as a Scaled fraction):

In[4]:=
ResourceFunction["TrainTestSplit"][# -> EvenQ[#] & /@ Range[10], "TrainingSetSize" -> 0.9]
Out[4]=
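
The three kinds of size specification shown above (an integer count, a Scaled fraction, and a bare real number) can be reduced to a sample count along the following lines. This is a sketch under assumptions: toCount is a hypothetical helper, and rounding to the nearest integer may differ from what the repository function actually does.

```wolfram
(* Hypothetical helper: convert a size specification into a number of samples *)
toCount[n_Integer, len_Integer] := n                      (* explicit count, used as is *)
toCount[Scaled[s_], len_Integer] := Round[s*len]          (* fraction of the data length *)
toCount[r_Real, len_Integer] := toCount[Scaled[r], len]   (* bare real treated as Scaled *)

toCount[#, 10] & /@ {3, Scaled[0.5], 0.9}
(* {3, 5, 9} *)
```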

Options

By default the samples are shuffled; set "Shuffle" -> False to keep the original order:

In[5]:=
ResourceFunction["TrainTestSplit"][Range[10], "Shuffle" -> False]
Out[5]=

Possible Issues

The sizes must be sensible; for example, an infinite training set size is not meaningful:

In[6]:=
ResourceFunction["TrainTestSplit"][# -> EvenQ[#] & /@ Range[10], "TrainingSetSize" -> \[Infinity]]
Out[6]=

The option "TrainingSetSize" takes precedence over "TestSetSize" when both are given:

In[7]:=
ResourceFunction["TrainTestSplit"][Range[10], "TrainingSetSize" -> 10,
  "TestSetSize" -> 10, "Shuffle" -> False]
Out[7]=
