Function Repository Resource:

AggregateSmallest

Group small values in an association into a single category

Contributed by: Jon McLoone

ResourceFunction["AggregateSmallest"][assoc,x]

sums values below x in Association assoc into a single number.

ResourceFunction["AggregateSmallest"][assoc,Scaled[factor]]

sums small values in Association assoc into a single category no larger than a proportion factor of the total.

ResourceFunction["AggregateSmallest"][{assoc1,assoc2,…},x]

aggregates each Association into a common categorization.

Details and Options

The Association must have numeric values, such as those produced by Counts.
The option "Key" controls the Key to be used for the aggregated data. The default is "Other".
The option Method controls how the multiple associations are combined to establish the common "Other" category. Typical values include Mean, Min, Max, First, and Last.
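
The threshold form can be approximated with built-in functions. The following is a minimal sketch, not the repository implementation: aggregateBelow is a hypothetical helper, and it assumes that values less than or equal to x count as small and that a single qualifying value is left unchanged.

```wolfram
(* Hypothetical helper that sketches the threshold form. *)
(* Assumption: values <= x are treated as "small". *)
aggregateBelow[assoc_Association, x_?NumericQ, key_ : "Other"] :=
 Module[{small = Select[assoc, # <= x &]},
  If[Length[small] < 2,
   (* zero or one qualifying entry: nothing to merge *)
   assoc,
   (* drop the small keys and append their total under the new key *)
   Append[KeyDrop[assoc, Keys[small]], key -> Total[small]]]]

aggregateBelow[<|"a" -> 10, "b" -> 5, "c" -> 1, "d" -> 0.5|>, 1]
(* this sketch returns <|"a" -> 10, "b" -> 5, "Other" -> 1.5|> *)
```

The position of the aggregated key in the result is a choice made by this sketch; the resource function may order its output differently.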

Examples

Basic Examples (3) 

Group all keys with values of 1 or less into a new "Other" key:

In[1]:=
ResourceFunction["AggregateSmallest"][
 <|"a" -> 10, "b" -> 5, "c" -> 1, "d" -> 0.5|>, 1]
Out[1]=

Group small values into an "Other" category that contains up to 50% of the total:

In[2]:=
ResourceFunction[
 "AggregateSmallest"][<|"a" -> 10, "b" -> 5, "c" -> 1, "d" -> 0.5|>, Scaled[0.5]]
Out[2]=
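
One plausible reading of the Scaled form is a pass over the values in ascending order that stops before the running total exceeds the given proportion. The sketch below follows that reading; aggregateScaled is a hypothetical helper that takes the proportion as a plain number, and the stopping rule is an assumption rather than the documented algorithm.

```wolfram
(* Hypothetical helper that sketches the proportional form. *)
aggregateScaled[assoc_Association, factor_?NumericQ, key_ : "Other"] :=
 Module[{sorted = Sort[assoc], limit = factor*Total[assoc], n},
  (* count the smallest entries whose running total stays within the limit *)
  n = LengthWhile[Accumulate[Values[sorted]], # <= limit &];
  If[n < 2,
   (* fewer than two entries qualify: leave the data unchanged *)
   assoc,
   Append[KeyDrop[assoc, Keys[sorted][[;; n]]],
    key -> Total[Values[sorted][[;; n]]]]]]

aggregateScaled[<|"a" -> 10, "b" -> 5, "c" -> 1, "d" -> 0.5|>, 0.5]
(* this sketch returns <|"a" -> 10, "Other" -> 6.5|> *)
```

Here the total is 16.5, so the limit is 8.25; the three smallest values sum to 6.5, and adding the fourth would exceed the limit.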

When multiple associations are provided a common categorization is established:

In[3]:=
ResourceFunction[
 "AggregateSmallest"][{<|"a" -> 10, "b" -> 5, "c" -> 1, "d" -> 0.5|>, <|"a" -> 1, "b" -> 1, "c" -> 1, "d" -> 1|>}, Scaled[0.5]]
Out[3]=

Scope (2) 

AggregateSmallest is designed to simplify the result of Counts. For example, this PieChart is unreadable for infrequent categories:

In[4]:=
data = ReverseSort[
   Counts[Select[Characters[ExampleData[{"Text", "AliceInWonderland"}]],
      StringMatchQ[#, LetterCharacter] &]]];
PieChart[data, ChartLabels -> Automatic, BaseStyle -> 14]
Out[4]=

Aggregating 20% of the data makes the pie chart simpler:

In[5]:=
PieChart[ResourceFunction["AggregateSmallest"][data, Scaled[0.20]], ChartLabels -> Automatic, BaseStyle -> 14]
Out[5]=

If only one value meets the criterion, then no change is made:

In[6]:=
PieChart[
 ResourceFunction["AggregateSmallest"][<|"a" -> 0.2, "b" -> 10|>, Scaled[0.20]], ChartLabels -> Automatic, BaseStyle -> 14]
Out[6]=

Options (4) 

Key (1) 

Use a different name for the replacement key:

In[7]:=
data = ReverseSort[
   Counts[Select[Characters[ExampleData[{"Text", "AliceInWonderland"}]],
      StringMatchQ[#, LetterCharacter] &]]];
PieChart[
 ResourceFunction["AggregateSmallest"][data, Scaled[0.20], "Key" -> "Other\ncharacters"], ChartLabels -> Automatic, BaseStyle -> 14]
Out[8]=

Method (3) 

When multiple associations are provided, the common categorization and order are established using Merge and a function provided by Method. Because the grouping is chosen from the merged values, an individual association may end up with more or less in its "Other" category than the target scale. By default the Mean value of each key is used:

In[9]:=
ResourceFunction[
 "AggregateSmallest"][{<|"a" -> 1, "b" -> 2, "c" -> 1, "d" -> 1|>, <|
   "a" -> 4, "b" -> 5, "c" -> 1, "d" -> 0.5|>}, Scaled[0.5]]
Out[9]=
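
The effect of a shared categorization can be illustrated with built-in functions alone. This is a sketch under assumptions, not the repository implementation: the keys to group are chosen from the Mean-merged values using an ascending running total, and the same keys are then summed in every association. The variable names are illustrative only.

```wolfram
assocs = {<|"a" -> 1, "b" -> 2, "c" -> 1, "d" -> 1|>,
   <|"a" -> 4, "b" -> 5, "c" -> 1, "d" -> 0.5|>};

(* one reference value per key: <|"a" -> 5/2, "b" -> 7/2, "c" -> 1, "d" -> 0.75|> *)
combined = Merge[assocs, Mean];

(* smallest reference values whose running total stays within 50% of the total *)
sorted = Sort[combined];
n = LengthWhile[Accumulate[Values[sorted]], # <= 0.5*Total[combined] &];
smallKeys = Keys[sorted][[;; n]];

(* apply the same grouping to every association *)
result = Map[
   Append[KeyDrop[#, smallKeys], "Other" -> Total[KeyTake[#, smallKeys]]] &,
   assocs]
(* this sketch returns
   {<|"a" -> 1, "b" -> 2, "Other" -> 2|>, <|"a" -> 4, "b" -> 5, "Other" -> 1.5|>} *)
```

In this sketch "Other" holds 40% of the first association but only about 14% of the second, which shows how individual shares can differ from the target.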

With Method -> First, the first available value for each key (i.e. the value from the first Association, unless the key is missing there) is used to determine the ordering:

In[10]:=
ResourceFunction[
 "AggregateSmallest"][{<|"a" -> 1, "b" -> 2, "c" -> 1, "d" -> 1|>, <|
   "a" -> 4, "b" -> 5, "c" -> 1, "d" -> 0.5|>}, Scaled[0.5], Method -> First]
Out[10]=
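
The phrase "first available value" matches how Merge applies a combining function to the list of values collected for each key. The short example below uses only the built-in Merge; the input associations are illustrative.

```wolfram
(* each key receives the list of its values, in the order the associations appear *)
Merge[{<|"a" -> 1, "b" -> 2|>, <|"a" -> 4, "b" -> 5|>}, First]
(* <|"a" -> 1, "b" -> 2|> *)

(* a key missing from the first association takes its value from a later one *)
Merge[{<|"a" -> 1|>, <|"a" -> 4, "e" -> 2|>}, First]
(* <|"a" -> 1, "e" -> 2|> *)

(* other combining functions give other reference values *)
Merge[{<|"a" -> 1, "b" -> 2|>, <|"a" -> 4, "b" -> 5|>}, Max]
(* <|"a" -> 4, "b" -> 5|> *)
```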

Method also accepts an integer and uses the nth Association to control the ordering:

In[11]:=
ResourceFunction[
 "AggregateSmallest"][{<|"a" -> 1, "b" -> 2, "c" -> 1, "d" -> 1|>, <|
   "a" -> 4, "b" -> 5, "c" -> 1, "d" -> 0.5|>}, Scaled[0.5], Method -> 2]
Out[11]=

Publisher

Jon McLoone

Version History

  • 1.1.0 – 07 July 2025
  • 1.0.1 – 01 September 2021
  • 1.0.0 – 15 October 2019
