Function Repository Resource:

StringVenn

Return a Venn set for the letters in two strings

Contributed by: Ed Pegg Jr

ResourceFunction["StringVenn"][left, right]

returns {left complement, intersection, right complement} for strings left and right.

Details

Let LEFT = "mathematica" and RIGHT = "chambermaid". Letters are counted with multiplicity: for example, "a" occurs three times in LEFT and twice in RIGHT.
UnsortedMultisetComplement (left)     "att"         letters in LEFT but not in RIGHT
MultisetIntersection                  "aacehimm"    letters in both LEFT and RIGHT
UnsortedMultisetComplement (right)    "brd"         letters in RIGHT but not in LEFT
StringVenn on LEFT and RIGHT gives {"att","aacehimm","brd"}. The first two strings together are an anagram of "mathematica". The last two strings together are an anagram of "chambermaid".
MultisetIntersection gives a sorted list of the letters common to both strings. Each distinct letter appears as many times as the smaller of its counts in the two strings.
UnsortedMultisetComplement gives the letters in left that are in excess of those in right, maintaining the original order.
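
The behavior described above can be illustrated outside the Wolfram Language. The following Python sketch is not the repository implementation: the function names are hypothetical, and the rule for which repeated letters count as excess (the earliest occurrences are kept) is an assumption chosen to reproduce the "att" result shown above.

```python
from collections import Counter


def multiset_intersection(left, right):
    """Sorted letters common to both strings, each kept min(count) times."""
    common = Counter(left) & Counter(right)
    return "".join(sorted(common.elements()))


def unsorted_multiset_complement(left, right):
    """Letters of left in excess of those in right, in original order.

    Assumption: when a letter is only partly in excess, its earliest
    occurrences in left are the ones kept.
    """
    excess = Counter(left) - Counter(right)  # how many of each letter to keep
    kept = []
    for ch in left:
        if excess[ch] > 0:
            kept.append(ch)
            excess[ch] -= 1
    return "".join(kept)


def string_venn(left, right):
    """Return [left complement, intersection, right complement]."""
    return [
        unsorted_multiset_complement(left, right),
        multiset_intersection(left, right),
        unsorted_multiset_complement(right, left),
    ]
```

Under these assumptions, string_venn("mathematica", "chambermaid") gives ["att", "aacehimm", "brd"], matching the values in the table above.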

Examples

Basic Examples (2) 

Find the left complement, intersection and right complement of two strings:

In[1]:=
result = ResourceFunction["StringVenn"]["abcdef", "defghi"]
Out[1]=
{"abc", "def", "ghi"}

A string union can be found by applying StringJoin to the result:

In[2]:=
StringJoin@result
Out[2]=
"abcdefghi"
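
Joining the three parts gives a multiset union: each letter appears as many times as the larger of its two counts. The Python sketch below checks that identity with collections.Counter standing in for the letter multisets. It is an illustration with hypothetical function names, not part of StringVenn.

```python
from collections import Counter


def venn_counts(left, right):
    """Letter counts of the left complement, intersection and right complement."""
    l, r = Counter(left), Counter(right)
    return l - r, l & r, r - l


def joined_counts(left, right):
    """Letter counts of the three Venn parts joined together."""
    only_left, both, only_right = venn_counts(left, right)
    return only_left + both + only_right


# Counter's | operator keeps the maximum count per letter (multiset union).
union = Counter("abcdef") | Counter("defghi")
print(joined_counts("abcdef", "defghi") == union)
```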

Strings with repeated letters are treated as multisets:

In[3]:=
ResourceFunction["StringVenn"]["Sherlock Holmes", "The Roy Rogers Show"]
Out[3]=

Verify that "Sherlock Holmes" contains all the letters of "horseshoe", which is the case when the right complement is empty:

In[4]:=
ResourceFunction["StringVenn"][ToLowerCase@"Sherlock Holmes", "horseshoe"]
Out[4]=
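
An empty right complement means that every letter of the second string, counted with multiplicity, is available in the first. A Python sketch of the same containment test follows. The helper name is hypothetical, and the comparison in this sketch is case-sensitive, so the input is lowercased first.

```python
from collections import Counter


def contains_letters(source, word):
    """True if source has every letter of word, counted with multiplicity."""
    missing = Counter(word) - Counter(source)  # letters word needs but source lacks
    return not missing


print(contains_letters("Sherlock Holmes".lower(), "horseshoe"))
```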

Scope (3) 

Find words with letters similar to those in "Mathematica":

In[5]:=
result = Select[ToLowerCase /@ WordList[],
  8 < StringLength[#] < 15 &&
    StringLength[StringJoin@Drop[ResourceFunction["StringVenn"]["mathematica", #], {2}]] < 6 &]
Out[5]=
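
The filter above keeps a word when its two complements together hold fewer than 6 letters, so the combined complement length acts as a letter-count distance from "mathematica". A Python sketch of that distance, with a hypothetical function name and assuming the multiset semantics described in Details:

```python
from collections import Counter


def letter_distance(left, right):
    """Total number of letters in the left and right complements."""
    l, r = Counter(left), Counter(right)
    return sum((l - r).values()) + sum((r - l).values())


print(letter_distance("mathematica", "chambermaid"))
```

For "chambermaid" the complements are "att" and "brd", so the distance is 6 and the word just misses the < 6 cutoff.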

Look at the Venn sets:

In[6]:=
Column[{#, ResourceFunction["StringVenn"]["mathematica", #]} & /@ result]
Out[6]=

Longer word lists, such as ones that include phrases, can give other results:

In[7]:=
ResourceFunction["StringVenn"]["email attachment", "mathematica"]
Out[7]=

Neat Examples (3) 

Apply StringVenn to two of the longest words that don't repeat a sound character:

In[8]:=
ResourceFunction["StringVenn"]["creditworthiness", "biotechnologies"]
Out[8]=

Apply StringVenn to their sound characters:

In[9]:=
ResourceFunction["StringVenn"]["krɛdɪtwɝðinəs", "baɪoʊtɛknɑlədʒiz"]
Out[9]=

Verify two strings are perfect anagrams of each other:

In[10]:=
ResourceFunction["StringVenn"]["super nes classic edition", "presidential succession"]
Out[10]=
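
When only letters matter, two phrases are anagrams exactly when neither has letters in excess of the other. The Python sketch below performs that check. The function name is hypothetical, and ignoring spaces is an assumption of the sketch rather than something stated for StringVenn.

```python
from collections import Counter


def is_letter_anagram(a, b):
    """True if a and b use the same letters the same number of times.

    Assumption: spaces are ignored before counting.
    """
    def letters(s):
        return Counter(s.replace(" ", ""))

    return letters(a) == letters(b)


print(is_letter_anagram("super nes classic edition", "presidential succession"))
```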

Check the longest well-mixed non-deliberate anagram in Wikipedia:

In[11]:=
ResourceFunction["StringVenn"] @@ (ToLowerCase /@ {"Bering Sea Gold Under The Ice", "Turbocharged diesel engine"})
Out[11]=

Find 13-letter words with letters similar to those in "daguerreotype":

In[12]:=
Select[WordList[],
  StringLength[#] == 13 &&
    StringLength[StringJoin@Drop[ResourceFunction["StringVenn"]["daguerreotype", #], {2}]] < 7 &]
Out[12]=

Check the Venn sets for the other two words:

In[13]:=
ResourceFunction["StringVenn"]["daguerreotype", #] & /@ {"groundskeeper", "superordinate"}
Out[13]=

Requirements

Wolfram Language 14.0 (January 2024) or above

Version History

  • 1.0.0 – 13 November 2024
