Function Repository Resource:

PartitionWhile

Source Notebook

Partition a list by a condition on the sublists

Contributed by: Daniele Gregori

ResourceFunction["PartitionWhile"][list,crit]

partitions a list into sublists satisfying a condition.

Details and Options

PartitionWhile splits a collection into sublists comprised of consecutive elements which satisfy a given condition.
The function crit must take a list as input and return True or False.
ResourceFunction["PartitionWhile"][{i1,i2,},crit] splits the data after ik, if crit[{,ik}] is True but crit[{,ik,ik+1}] is False.
The first argument of PartitionWhile can be a List to be partitioned. The second argument is a function which determines the condition which the partitions must satisfy.
The resulting sublists are such that the function evaluates to True on each of them.
In detail, the algorithm implements a double loop for each element of the list and its successive elements. The first subpartition is built by starting to add the list elements after the function first evaluates to True until the function evaluates to False on some successive element. Then the second subpartition is searched in the same way, starting from this last element and the program continues until the end of the list.
No offset parameter is currently implemented and the partitions can be made only at the first Level of the list argument.
By comparison, the core function Partition splits a list into subparts of or UpTo a certain Length.
PartitionWhile accepts the following option:
"ShortestPartitions"Truechange the length of the partitions
Different partitions, always satisfying the given condition, can be determined through different values for this option:
Truesplit always at the Shortest partition
Falsesearch always the Longest partition
ksearch k-1 next elements after the shortest partition

Examples

Basic Examples (2) 

Partition a list of integers by imposing a condition on their Total:

In[1]:=
partitions = ResourceFunction["PartitionWhile"][Range[100], Total[#] <= 50 &]
Out[1]=
In[2]:=
Total /@ partitions
Out[2]=

Partition a list of random strings by a condition on their ByteCount:

In[3]:=
strings = Table[ResourceFunction["RandomString"][RandomInteger[30]], 20]
Out[3]=
In[4]:=
ResourceFunction["PartitionWhile"][strings, ByteCount[#] <= 250 &]
Out[4]=
In[5]:=
ByteCount /@ %
Out[5]=

Scope (4) 

Partition a list of numbers by any logical condition:

In[6]:=
partitions = ResourceFunction["PartitionWhile"][Range[100], Total[#] == 50 || Total[#] == 51 &]
Out[6]=

Verify the totals:

In[7]:=
Total /@ partitions
Out[7]=

Partition a list of natural numbers so each partition adds up to a prime number:

In[8]:=
primepartitions = ResourceFunction["PartitionWhile"][Range[40], PrimeQ[Total[#]] &]
Out[8]=

The totals are all prime:

In[9]:=
Total /@ primepartitions
Out[9]=

Create partitions containing two prime values each:

In[10]:=
ResourceFunction["PartitionWhile"][Range[30], Length@Select[#, PrimeQ] == 2 &]
Out[10]=

Partition a text by a condition on some Characters in it:

In[11]:=
text = ResourceFunction["LatinizedText"][50]
Out[11]=

Select substrings until there are three "a" characters in each:

In[12]:=
StringJoin /@ ResourceFunction["PartitionWhile"][Characters@text, Count[#, "a"] <= 3 &]
Out[12]=
In[13]:=
Count[Characters@#, "a"] & /@ %
Out[13]=

Options (6) 

ShortestPartitions (6) 

By default, the splitting of the original list happens as soon as the Shortest possible partition is found:

In[14]:=
integers = RandomInteger[{-5, 10}, 100]
Out[14]=
In[15]:=
partShortest = ResourceFunction["PartitionWhile"][integers, Total[#] <= 10 &]
Out[15]=

It is also possible to search, for each element, the Longest possible partition until the end of the list:

In[16]:=
partLongest = ResourceFunction["PartitionWhile"][integers, Total[#] <= 10 &, "ShortestPartitions" -> False]
Out[16]=

However, with this option the partition algorithm becomes essentially quadratic and inefficient on a large scale:

In[17]:=
timeShortest = RepeatedTiming[
   ResourceFunction["PartitionWhile"][integers, Total[#] <= 10 &]] // First
Out[17]=
In[18]:=
timeLongest = RepeatedTiming[
   ResourceFunction["PartitionWhile"][integers, Total[#] <= 10 &, "ShortestPartitions" -> False]] // First
Out[18]=
In[19]:=
timeLongest/timeShortest
Out[19]=

A possible middle ground can sometimes be found by setting the option "ShortestPartitions" to an integer k >= 2, to search a longer partition k-1 elements after the shortest is found:

In[20]:=
partNext = ResourceFunction["PartitionWhile"][integers, Total[#] <= 10 &, "ShortestPartitions" -> 12]
Out[20]=
In[21]:=
timeNext = RepeatedTiming[
   ResourceFunction["PartitionWhile"][integers, Total[#] <= 10 &, "ShortestPartitions" -> 12]] // First
Out[21]=

This effectively implements a (Monte Carlo) randomized algorithm, that is often correctly reproducing the longest partitions in an always efficient way:

In[22]:=
partNext == partLongest
Out[22]=
In[23]:=
timeNext/timeShortest
Out[23]=

In any case, the different k-next-to-shortest partitions can be of independent interest:

In[24]:=
Block[{max = 12, lenL, diff},
  lenL = Flatten[Table[
     Comap[{Map[Length, #] &}, ResourceFunction["PartitionWhile"][integers, Total[#] <= 10 &, "ShortestPartitions" -> n]], {n, 1, max}], 1];
  (*Print[Grid@DeleteDuplicates@lenL];*)
  diff = Table[If[
     lenL[[i]] != lenL[[i + 1]], {i + 1, Diff[lenL[[i]], lenL[[i + 1]], "Combined"]}, Nothing], {i, 1, max - 1}];
  Prepend[diff, Style[#, Bold] & /@ {"k-Shortest", "lengths Diff"}]] // Grid
Out[24]=

Applications (4) 

The following list of data (scraped from arXiv) has very irregular element sizes:

In[25]:=
(* Evaluate this cell to get the example input *) CloudGet["https://www.wolframcloud.com/obj/65790fb2-b4ca-4b84-bdc4-3093914cfc45"]
In[26]:=
Histogram@Map[ByteCount, arXivData]
Out[26]=

If you try to upload a too large Partition of it unto a Databin, you may exceed the size limits:

In[27]:=
bin = CreateDatabin[]
Out[27]=
In[28]:=
DatabinUpload[bin, Partition[arXivData, UpTo[5]]]
Out[28]=

An optimal partition can be found instead through PartitionWhile[list,ByteCount[#]<=n&]:

In[29]:=
MinMax[ByteCount /@ ResourceFunction["PartitionWhile"][arXivData, ByteCount[#] <= 24000 &]]
Out[29]=
In[30]:=
Length /@ ResourceFunction["PartitionWhile"][arXivData, ByteCount[#] <= 24000 &]
Out[30]=

The DatabinUpload is now successful:

In[31]:=
DatabinUpload[bin, ResourceFunction["PartitionWhile"][arXivData, ByteCount[#] <= 24000 &]]
Out[31]=

Properties and Relations (5) 

PartitionWhile[list,Length[#]<=n&] is equivalent Partition[list,UpTo[n]]:

In[32]:=
integers = Sort@RandomVariate[BenfordDistribution[100], 40]
Out[32]=
In[33]:=
ResourceFunction["PartitionWhile"][integers, Length[#] <= 10 &]
Out[33]=
In[34]:=
Partition[integers, UpTo[10]]
Out[34]=
In[35]:=
% == %%
Out[35]=

PartitionWhile[list,Apply[SameQ,#]&] is equivalent Split[list]:

In[36]:=
characters = {a, a, a, b, b, a, a, c};
In[37]:=
ResourceFunction["PartitionWhile"][characters, Apply[SameQ, #] &]
Out[37]=
In[38]:=
Split[characters]
Out[38]=

In some cases (but see the comments in Possible Issues), PartitionWhile[list,Apply[SameQ,f[#]]&] is equivalent to SplitBy[list,f]:

In[39]:=
rationals = Range[1, 5, 1/2]
Out[39]=
In[40]:=
ResourceFunction["PartitionWhile"][rationals, Apply[SameQ, IntegerPart[#]] &]
Out[40]=
In[41]:=
SplitBy[rationals, IntegerPart]
Out[41]=

In many cases, the output of PartitionWhile[list,fun] is equivalent to that of the built-in SequenceCases, but it implements a significantly faster algorithm:

In[42]:=
text = ResourceFunction["LatinizedText"][50]
Out[42]=
In[43]:=
StringJoin @@@ ResourceFunction["PartitionWhile"][Characters@text, Count[#, "a"] <= 3 &] // RepeatedTiming
Out[43]=
In[44]:=
StringJoin @@@ SequenceCases[
   Characters@text, _?(Count[#, "a"] <= 3 &)] // RepeatedTiming
Out[44]=

In some cases, the output of PartitionWhile[list,fun] is different from SequenceCases, because of a different criterion for sequence splitting:

In[45]:=
ResourceFunction["PartitionWhile"][{1, 2, 3, 4, 5, -20, 6, 12}, Total[#] <= 10 &]
Out[45]=
In[46]:=
SequenceCases[{1, 2, 3, 4, 5, -20, 6, 12}, _?(Total[#] <= 10 &)] /. {} -> Nothing
Out[46]=

The output of SequenceCases can be always recovered by setting the option "ShortestPartitions" to False:

In[47]:=
ResourceFunction["PartitionWhile"][{1, 2, 3, 4, 5, -20, 6, 12}, Total[#] <= 10 &, "ShortestPartitions" -> False]
Out[47]=

Possible Issues (2) 

Only partitions for conditions evaluating to True are returned:

In[48]:=
ResourceFunction["PartitionWhile"][Range[20], Max[#] >= 200 &]
Out[48]=

So if the condition is always evaluated to False, you can use its contrary:

In[49]:=
ResourceFunction["PartitionWhile"][Range[20], ! Max[#] >= 200 &]
Out[49]=

Compare with SplitBy:

In[50]:=
SplitBy[Range[20], Max[#] >= 200 &]
Out[50]=

Similarly, when the condition is evaluated sometimes to True and sometimes to False, only the former cases are returned:

In[51]:=
ResourceFunction["PartitionWhile"][{1, 3/2, 10, 12, 2, 5/2, 3, 11}, AllTrue[#, # >= 10 &] &]
Out[51]=

The different behaviour of SplitBy can then be reproduced as follows:

In[52]:=
ResourceFunction["PartitionWhile"][{1, 3/2, 10, 12, 2, 5/2, 3, 11}, Xor[AllTrue[#, # >= 10 &], AllTrue[#, ! (# >= 10) &]] &]
Out[52]=
In[53]:=
SplitBy[{1, 3/2, 10, 12, 2, 5/2, 3, 11}, # >= 10 &]
Out[53]=

Neat Examples (2) 

Take the low level definition code and generate a much faster compiled version:

In[54]:=
pwlTot[list_] := With[{fun = Total[#] <= 1000 &},
  Block[{len, i, j, first, last, elq, prtind}, len = Length[list]; prtind = {}; For[i = 1, i <= len, i++, first = i;
    				last = 0; For[j = i, j <= len, j++, elq = fun[list[[i ;; j]]];
     					If[elq,
      						last = j,
      						If[last != 0, Break[]]]]; If[last != 0,
     					i = last;
     					AppendTo[prtind, {first, last}]]]; Map[Take[list, #] &, prtind]]]
In[55]:=
pwlTotComp =
 FunctionCompile[Function[ Typed[arg, "NumericArray"::["MachineInteger", 1]], Typed[
     DownValuesFunction[pwlTot],
     {"NumericArray"::["MachineInteger", 1]} -> "NumericArray"::["MachineInteger", 2]]
    [arg]]]
Out[55]=

Compare the performance with the uncompiled version and the built-in SequenceCases:

In[56]:=
naturals = Table[RandomInteger[100], 1000];
In[57]:=
partBlt = SequenceCases[naturals, _?(Total[#] <= 1000 &)]; // RepeatedTiming
Out[57]=
In[58]:=
partDef = ResourceFunction["PartitionWhile"][naturals, Total[#] <= 1000 &]; // RepeatedTiming
Out[58]=
In[59]:=
partCmp = pwlTotComp[naturals]; // RepeatedTiming
Out[59]=
In[60]:=
partDef === partBlt === Normal@partCmp
Out[60]=

Publisher

Daniele Gregori

Version History

  • 1.0.0 – 05 December 2025

Related Resources

License Information