Function Repository Resource:

WordCompounds

Separate parts of a compound word

Contributed by: Mark Greenberg

ResourceFunction["WordCompounds"][wd]

gives a list of ways wd can be split into two complete words.

ResourceFunction["WordCompounds"][wd,min]

gives a list of ways wd can be split into two complete words of at least min characters each. min should be a positive integer; the default is 3.
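
The splitting described above can be pictured as trying every cut position and keeping the cuts whose two halves are both words. The following is only a minimal sketch under that assumption, not the actual source of WordCompounds. The names splitCandidates, wordQ and toyWords are hypothetical, and the toy word list stands in for a real dictionary so that the result is reproducible:

```wolfram
(* Hypothetical sketch, not the WordCompounds source.
   wordQ is any predicate that decides whether a string counts as a word;
   DictionaryWordQ is assumed here only as a plausible default. *)
splitCandidates[wd_String, min_Integer?Positive : 3, wordQ_ : DictionaryWordQ] :=
 Select[
  (* every cut that leaves at least min characters on both sides *)
  Table[{StringTake[wd, k], StringDrop[wd, k]},
   {k, min, StringLength[wd] - min}],
  (* keep a cut only when both halves pass the word test *)
  AllTrue[#, wordQ] &
 ]

(* Toy word list, used instead of a dictionary lookup *)
toyWords = {"pin", "stripe", "pins", "tripe"};

splitCandidates["pinstripe", 3, MemberQ[toyWords, #] &]
(* {{"pin", "stripe"}, {"pins", "tripe"}} *)
```

When the word is shorter than twice min, the table of cuts is empty and the sketch returns an empty list.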

Details and Options

ResourceFunction["WordCompounds"][wd, Language -> lang] selects a language other than the default, English.

Examples

Basic Examples (1) 

Split a compound word into its two parts:

In[1]:=
ResourceFunction["WordCompounds"]["taillights"]
Out[1]=

Scope (4) 

If there are multiple ways to split wd into two valid words, then all are given:

In[2]:=
ResourceFunction["WordCompounds"]["pinstripe"]
Out[2]=

The input wd does not have to be a known word:

In[3]:=
ResourceFunction["WordCompounds"]["fastfooted"]
Out[3]=

Capitalization is preserved:

In[4]:=
ResourceFunction["WordCompounds"]["Goldwater"]
Out[4]=

In languages like German, where the second part of the compound may be capitalized differently from the dictionary entry, a match is still found and the results preserve the capitalization of wd:

In[5]:=
ResourceFunction["WordCompounds"]["Fingerspitzengefühl", Language -> "German"]
Out[5]=
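
One way such matching could work is to compare each part with the dictionary while ignoring case, and to return the substrings of wd unchanged. The sketch below illustrates that idea only; it is an assumption about the approach, not the function's actual implementation. The names caseInsensitiveWordQ, germanToy and wordQ are hypothetical, and the two-word list replaces a real German dictionary:

```wolfram
(* Hypothetical helper: accept a part if it matches an entry of the
   word list in any capitalization. *)
caseInsensitiveWordQ[words_List][part_String] :=
 MemberQ[ToLowerCase[words], ToLowerCase[part]]

(* Toy dictionary: German nouns are listed capitalized *)
germanToy = {"Finger", "Spitzengefühl"};
wordQ = caseInsensitiveWordQ[germanToy];

(* Try every cut with at least 3 characters per side; the substrings of wd
   are returned as they are, so the capitalization of wd is preserved. *)
With[{wd = "Fingerspitzengefühl"},
 Select[
  Table[{StringTake[wd, k], StringDrop[wd, k]},
   {k, 3, StringLength[wd] - 3}],
  AllTrue[#, wordQ] &]]
(* {{"Finger", "spitzengefühl"}} *)
```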

Options (2) 

Select a language other than English:

In[6]:=
ResourceFunction["WordCompounds"]["jazzzanger", Language -> "Dutch"]
Out[6]=

The minimum string length of the two parts of the compound is set to 3 by default, but you can change this with the optional second argument:

In[7]:=
Column[Table[
  ResourceFunction["WordCompounds"]["refuse", min], {min, 2, 4}]]
Out[7]=

Possible Issues (2) 

Sometimes words that are not compounds get divided:

In[8]:=
ResourceFunction["WordCompounds"]["anthers"]
Out[8]=

In some languages, the parts of a compound differ from the standalone words. In this example, the two parts are "Schwein" and "Hund", joined by a linking "e". WordCompounds does not find the match in such cases:

In[9]:=
ResourceFunction["WordCompounds"]["Schweinehund", Language -> "German"]
Out[9]=
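
A possible workaround, sketched here as an assumption rather than as a feature of WordCompounds, is to allow one short linking element between the two parts. The names splitWithLinker, toyGerman and toyWordQ are hypothetical, the list of linking elements is an illustrative guess, and the toy word list replaces a real German dictionary:

```wolfram
(* Hypothetical workaround, not part of WordCompounds: accept a split of the
   form first + link + second, where link is one of the given linking elements. *)
splitWithLinker[wd_String, wordQ_, linkers_List, min_Integer?Positive : 3] :=
 Flatten[
  Table[
   With[{first = StringTake[wd, k], rest = StringDrop[wd, k]},
    If[StringStartsQ[rest, link] &&
      StringLength[rest] - StringLength[link] >= min &&
      wordQ[first] && wordQ[StringDrop[rest, StringLength[link]]],
     (* wrap the hit so that misses can vanish when flattening *)
     {{first, link, StringDrop[rest, StringLength[link]]}},
     {}]],
   {k, min, StringLength[wd] - min}, {link, linkers}],
  2]

(* Toy dictionary and a case-insensitive word test *)
toyGerman = {"Schwein", "Hund"};
toyWordQ = MemberQ[ToLowerCase[toyGerman], ToLowerCase[#]] &;

splitWithLinker["Schweinehund", toyWordQ, {"e", "s", "n", "en", "es", "er"}]
(* {{"Schwein", "e", "hund"}} *)
```

Allowing linking elements also increases the chance of false splits, so a stricter word test or a shorter list of linkers may be preferable in practice.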

Publisher

Mark Greenberg

Version History

  • 1.0.0 – 31 July 2019
