Details
ResourceFunction["BlockEntropy"] is similar to Entropy in that it also computes values from a sum of the form -∑ p_i Log[p_i].
The entropy measurement starts with grouping rows by Total.
Typically, p_i is a frequentist probability of obtaining the i-th distinct element by randomly sampling the input list.
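As an illustration of this frequentist sum, here is a minimal Python sketch (not the resource function's implementation). The name shannon_entropy is hypothetical, and the natural logarithm is used.

```python
from collections import Counter
from math import log

def shannon_entropy(samples):
    """Return -sum(p_i * log(p_i)), with p_i the relative frequency
    of the i-th distinct element of samples (natural logarithm)."""
    counts = Counter(samples)
    total = len(samples)
    return -sum((c / total) * log(c / total) for c in counts.values())

# Two equally frequent symbols: each p_i is 1/2.
print(shannon_entropy([0, 0, 1, 1]))
```

For two equally frequent symbols the sum reduces to Log[2].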
ResourceFunction["BlockEntropy"] expects a matrix structure, either of the form data={row_1,row_2,…}, or implicitly as Partition[list,blocksize]={row_1,row_2,…}.
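The implicit form can be pictured with a small Python sketch that mimics the default behavior of Partition, which drops an incomplete trailing block. The helper name partition is hypothetical, and whether BlockEntropy treats leftover elements the same way is an assumption here.

```python
def partition(flat, blocksize):
    """Split flat into consecutive rows of length blocksize,
    dropping any incomplete trailing block."""
    n_rows = len(flat) // blocksize
    return [list(flat[i * blocksize:(i + 1) * blocksize]) for i in range(n_rows)]

# Seven elements with blocksize 2: the last element is dropped.
rows = partition([0, 1, 1, 1, 0, 0, 1], 2)
print(rows)
```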
Additionally, BlockEntropy allows for coarse-graining of rows to macrostates using the function macrofun (default: Total).
Two rows row_j and row_k, with macrostates m_j=macrofun[row_j] and m_k=macrofun[row_k], are considered distinct if m_j≠m_k.
Likewise, two atomistic states d_{j,x} and d_{k,y} are considered distinct if d_{j,x}≠d_{k,y}.
Let 𝒟 be the set of unique atomistic states and ℳ the set of distinct values in the range of macrofun. The joint entropy is then calculated by a double sum, S=-∑_i ∑_j ℙ(d_j❘m_i) ℙ(m_i) Log[ℙ(d_j❘m_i) ℙ(m_i)], where indices i and j range over the elements of ℳ and 𝒟 respectively.
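The double sum can be sketched in Python as follows. This is an illustration under stated assumptions, not the resource function's code: the name joint_entropy and the dictionary layout are hypothetical, and terms with zero joint probability are assumed to contribute nothing (the usual 0 Log[0] = 0 convention).

```python
from math import log

def joint_entropy(p_macro, p_cond):
    """S = -sum_i sum_j P(d_j|m_i) P(m_i) log(P(d_j|m_i) P(m_i)).

    p_macro: dict mapping m_i -> P(m_i)
    p_cond:  dict mapping m_i -> {d_j: P(d_j|m_i)}
    """
    s = 0.0
    for m, pm in p_macro.items():
        for d, pdm in p_cond[m].items():
            joint = pdm * pm
            if joint > 0:  # assumed convention: zero-probability terms are skipped
                s -= joint * log(joint)
    return s

# Three macrostates over the atomistic states 0 and 1.
p_macro = {0: 0.25, 1: 0.5, 2: 0.25}
p_cond = {0: {0: 1.0, 1: 0.0}, 1: {0: 0.5, 1: 0.5}, 2: {0: 0.0, 1: 1.0}}
print(joint_entropy(p_macro, p_cond))
```

In this example the four nonzero joint probabilities all equal 1/4, so the sum reduces to Log[4].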
The frequentist probability ℙ(m_i), m_i∈ℳ, equals the count of rows satisfying m_i=macrofun[row_j], divided by the total number of rows.
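A minimal Python sketch of this count, assuming the default macrofun behaves like a row sum. The name macro_probabilities is hypothetical, and exact fractions are used to keep the arithmetic transparent.

```python
from collections import Counter
from fractions import Fraction

def macro_probabilities(rows, macrofun=sum):
    """Return {m_i: P(m_i)}: the number of rows whose macrostate is m_i,
    divided by the total number of rows."""
    counts = Counter(macrofun(row) for row in rows)
    n = len(rows)
    return {m: Fraction(c, n) for m, c in counts.items()}

rows = [[0, 1], [1, 1], [0, 0], [1, 0]]
probs = macro_probabilities(rows)
print(probs)
```

Here the rows {0,1} and {1,0} share the macrostate 1, so ℙ(1)=1/2, while ℙ(0)=ℙ(2)=1/4.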
The conditional probability ℙ(d_j❘m_i), m_i∈ℳ, d_j∈𝒟, is not necessarily frequentist, but is often assumed or constructed to be so.
The optional function probfun takes m_i∈ℳ as the first argument and blocksize as the second argument. It should return an Association or List of conditional probabilities ℙ(d_j❘m_i).
When probfun is not set, either "Micro" or "Macro" conditional probabilities can be specified by setting the "Statistics" option.
The default "Micro" statistics obtains 𝒟 by taking a Union over row elements. The conditional probabilities are then calculated as ℙ(d_j❘m_i)=∑_k ℙ(d_j❘row_k) ℙ(row_k)=∑_k ℙ(d_j❘row_k)/N, where the sum includes every possible row_k written over elements of 𝒟 and satisfying m_i=macrofun[row_k]. The factor 1/ℙ(row_k)=N equals the Count of such rows, all assumed equiprobable.
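The "Micro" construction can be sketched in Python as below. This is an illustration under stated assumptions rather than the resource function's code: the name micro_conditionals is hypothetical, the default macrofun is taken to be a row sum, and ℙ(d_j❘row_k) is assumed to be the relative frequency of d_j within row_k, which the text above does not state explicitly.

```python
from fractions import Fraction
from itertools import product

def micro_conditionals(rows, m, macrofun=sum):
    """Return {d_j: P(d_j|m)} by averaging over all equiprobable rows
    with macrostate m. m is expected to be the macrostate of some observed row."""
    blocksize = len(rows[0])
    # D: union over the elements of all observed rows.
    elements = sorted({x for row in rows for x in row})
    # Every possible row over D whose macrostate equals m.
    compatible = [r for r in product(elements, repeat=blocksize) if macrofun(r) == m]
    n = len(compatible)  # N: count of compatible rows
    # Assumption: P(d|row) is the relative frequency of d within the row.
    return {d: sum(Fraction(r.count(d), blocksize) for r in compatible) / n
            for d in elements}

rows = [[0, 0, 1], [1, 1, 0]]
print(micro_conditionals(rows, 1))
```

For blocksize 3 and macrostate 1, the compatible rows are the three arrangements of two 0s and one 1, so under the stated assumption ℙ(0❘1)=2/3 and ℙ(1❘1)=1/3.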
Traditional "Macro" statistics require that 𝒟 contains all possible rows of the correct length whose elements are drawn from the complete set of row elements using Tuples. The conditional probabilities are then ℙ(d_j❘m_i)=0 if m_i≠macrofun[d_j], and ℙ(d_j❘m_i)=1/N if m_i=macrofun[d_j], with N equal to the count of atomistic row states d_k satisfying m_i=macrofun[d_k].
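The "Macro" rule can be sketched in Python as below, with itertools.product standing in for Tuples. The name macro_conditionals is hypothetical, and the default macrofun is again assumed to be a row sum.

```python
from fractions import Fraction
from itertools import product

def macro_conditionals(rows, m, macrofun=sum):
    """Return {d_j: P(d_j|m)} where each d_j is a full row state:
    1/N if macrofun(d_j) == m, otherwise 0."""
    blocksize = len(rows[0])
    elements = sorted({x for row in rows for x in row})
    # D: every possible row of the correct length over the observed elements.
    states = list(product(elements, repeat=blocksize))
    # N: number of row states sharing the macrostate m.
    n = sum(1 for d in states if macrofun(d) == m)
    return {d: (Fraction(1, n) if macrofun(d) == m else Fraction(0))
            for d in states}

rows = [[0, 1], [1, 1]]
cond = macro_conditionals(rows, 1)
print(cond)
```

With blocksize 2 and macrostate 1, only the row states (0,1) and (1,0) are compatible, so each receives probability 1/2 and the remaining states receive 0.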