Function Repository Resource:

AggregateSmallest


Group small values in an association into a single category

Contributed by: Jon McLoone

ResourceFunction["AggregateSmallest"][assoc,x]

combines the values below x in the Association assoc into a single entry holding their sum.

ResourceFunction["AggregateSmallest"][assoc,Scaled[factor]]

combines the smallest values in the Association assoc into a single category whose sum is no larger than the proportion factor of the total.

Details and Options

The Association must have numeric values, such as those in the output of Counts.
The option "Key" sets the key used for the aggregated entry. The default is "Other".
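The behavior described above can be approximated with built-in functions alone. The following is a minimal sketch, not the repository implementation: aggregateBelow is a hypothetical helper, and treating values equal to the threshold as small (<= rather than <) is an assumption taken from the wording "1 or less" in the first basic example.

```wolfram
(* Hypothetical helper for illustration; not the source of AggregateSmallest. *)
(* Assumption: values at or below x count as "small". *)
Options[aggregateBelow] = {"Key" -> "Other"};
aggregateBelow[assoc_Association /; AllTrue[assoc, NumericQ],
  x_?NumericQ, OptionsPattern[]] :=
 Module[{small = Select[assoc, # <= x &]},
  If[Length[small] < 2,
   assoc, (* zero or one small value: nothing to merge *)
   Append[Select[assoc, # > x &], OptionValue["Key"] -> Total[small]]]]

(* Counts produces the kind of numeric-valued Association expected here *)
counts = Counts[{"a", "a", "a", "b", "b", "c", "d"}];

aggregateBelow[counts, 1]
(* the sketch returns <|"a" -> 3, "b" -> 2, "Other" -> 2|> *)

aggregateBelow[counts, 1, "Key" -> "Rare"]
(* the sketch returns <|"a" -> 3, "b" -> 2, "Rare" -> 2|> *)
```

The sketch leaves the input untouched when fewer than two values qualify, mirroring the documented case where a single matching value causes no change.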

Examples

Basic Examples (2) 

Group all keys with values of 1 or less into a new "Other" key:

In[1]:=
ResourceFunction["AggregateSmallest"][<|"a" -> 10, "b" -> 5, "c" -> 1, "d" -> 0.5|>, 1]
Out[1]=

Group small values into an "Other" category that contains up to 50% of the total:

In[2]:=
ResourceFunction["AggregateSmallest"][<|"a" -> 10, "b" -> 5, "c" -> 1, "d" -> 0.5|>, Scaled[0.5]]
Out[2]=
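One plausible reading of the Scaled form is: sort the values in ascending order, accumulate them, and merge the longest run of smallest values whose running total stays within the given fraction of the total. The sketch below implements that reading with built-in functions only. The name aggregateScaled is hypothetical, and the actual function may treat ties or boundary cases differently.

```wolfram
(* Hypothetical helper; an assumed reading of the Scaled[...] form, not the actual source. *)
aggregateScaled[assoc_Association, Scaled[f_?NumericQ], key_ : "Other"] :=
 Module[{sorted = Sort[assoc], limit = f*Total[assoc], n},
  (* number of smallest entries whose running total stays within the limit *)
  n = LengthWhile[Accumulate[Values[sorted]], # <= limit &];
  If[n < 2,
   assoc, (* zero or one qualifying value: nothing to merge *)
   Append[KeyDrop[assoc, Take[Keys[sorted], n]],
    key -> Total[Take[Values[sorted], n]]]]]

aggregateScaled[<|"a" -> 10, "b" -> 5, "c" -> 1, "d" -> 0.5|>, Scaled[0.5]]
(* running totals 0.5, 1.5, 6.5 stay within 0.5*16.5 = 8.25, *)
(* so the sketch merges "b", "c" and "d" into "Other" -> 6.5 *)
```

Under this reading the merged category can never exceed the requested proportion, which matches the phrase "no larger than a proportion factor of the total".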

Scope (2) 

AggregateSmallest is designed to simplify the result of Counts. For example, the infrequent categories in this PieChart are unreadable:

In[3]:=
data = ReverseSort[
   Counts[Select[
     Characters[ExampleData[{"Text", "AliceInWonderland"}]], StringMatchQ[#, LetterCharacter] &]]];
PieChart[data, ChartLabels -> Automatic, BaseStyle -> 14]
Out[3]=

Aggregating the smallest categories, up to 20% of the total, makes the pie chart easier to read:

In[4]:=
PieChart[ResourceFunction["AggregateSmallest"][data, Scaled[0.20]], ChartLabels -> Automatic, BaseStyle -> 14]
Out[4]=
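Because the function is described as summing values into one category, a quick sanity check is to compare totals before and after aggregation. The lines below assume data is still defined as in the earlier input. If aggregation only moves counts into the "Other" key, the comparison should give True.

```wolfram
(* Assumes data is defined as in the input above. *)
aggregated = ResourceFunction["AggregateSmallest"][data, Scaled[0.20]];

(* total count before and after aggregation *)
Total[aggregated] == Total[data]

(* number of pie wedges before and after aggregation *)
Length /@ {data, aggregated}
```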

If only one value meets the criterion, the association is returned unchanged:

In[5]:=
PieChart[ResourceFunction["AggregateSmallest"][<|"a" -> 0.2, "b" -> 10|>, Scaled[0.20]], ChartLabels -> Automatic, BaseStyle -> 14]
Out[5]=

Options (1) 

Key (1) 

Use a different name for the replacement key:

In[6]:=
data = ReverseSort[
   Counts[Select[
     Characters[ExampleData[{"Text", "AliceInWonderland"}]], StringMatchQ[#, LetterCharacter] &]]];
PieChart[ResourceFunction["AggregateSmallest"][data, Scaled[0.20], "Key" -> "Other\ncharacters"], ChartLabels -> Automatic, BaseStyle -> 14]
Out[6]=

Publisher

Jon McLoone

Version History

  • 1.0.1 – 01 September 2021
  • 1.0.0 – 15 October 2019
