Function Repository Resource:

WordAffixStructure

Separate the stem of an English word from its affixes

Contributed by: Mark Greenberg

ResourceFunction["WordAffixStructure"][wd]

gives <|m1 -> lbl1, m2 -> lbl2, …|> labeling each part mi of the English word wd as either "prefix", "stem", "suffix" or "inflection".

ResourceFunction["WordAffixStructure"][{wd1, wd2, …}]

gives a list of associations, one for each wdi.

Details and Options

The input wd must be a string.
The parsing rules are specific to English (American and British), so attempts to input other languages will not produce meaningful results.
ResourceFunction["WordAffixStructure"] attempts to separate the word stem from each of its affixes.
ResourceFunction["WordAffixStructure"] leaves compound word stems as one unit.
ResourceFunction["WordAffixStructure"] preserves capitalization of the initial letter.
ResourceFunction["WordAffixStructure"] takes the option OutputFormat:
OutputFormat      "LabeledParts"    determines which output format to use
Possible settings for OutputFormat include:
"LabeledParts"    association(s) in which the keys are parts of wd and the values are "prefix", "stem", "suffix" or "inflection"
"NestedLists"     {{pre1, pre2, …}, stem, {suf1, suf2, …}, {inf1, inf2, …}}
"MarkupString"    wd with "~" inserted after each prefix, "•" inserted before each morphological suffix and "*" inserted before each inflectional suffix
"NestedBoxes"     a visualization of the affix structure
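
The "MarkupString" description can be read as a mapping from labels to delimiters. As a minimal sketch, the hypothetical helper toMarkup below (not part of the resource function) rebuilds such a string from an association in the "LabeledParts" shape. The sample association is written by hand for illustration and is not verified output of WordAffixStructure.

```wolfram
(* Hypothetical helper, not part of WordAffixStructure:
   join labeled parts using the delimiters described for "MarkupString" *)
toMarkup[parts_Association] := StringJoin @ KeyValueMap[
    Function[{part, label},
      Switch[label,
        "prefix", part <> "~",        (* "~" goes after each prefix *)
        "suffix", "•" <> part,        (* "•" goes before each suffix *)
        "inflection", "*" <> part,    (* "*" goes before each inflection *)
        _, part]],                    (* the stem is left unmarked *)
    parts]

(* Hand-written sample in the "LabeledParts" shape (an assumption for illustration) *)
sample = <|"un" -> "prefix", "help" -> "stem", "ful" -> "suffix"|>;

toMarkup[sample]
(* "un~help•ful" *)
```

Because association keys are unique, this sketch assumes that no part string occurs twice within one word.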

Examples

Basic Examples (2) 

Reveal the affix structure of an English word:

In[1]:=
ResourceFunction["WordAffixStructure"]["incomprehensions"]
Out[1]=

Reveal the affix structure of a list of English words:

In[2]:=
ResourceFunction[
 "WordAffixStructure"][{"containment", "mandible", "redoubling"}]
Out[2]=

Scope (2) 

WordAffixStructure attempts to reveal the structure of coinages that are not in the dictionary:

In[3]:=
ResourceFunction["WordAffixStructure"]["Wolframization"]
Out[3]=

WordAffixStructure attempts to separate foreign inflectional endings in English words:

In[4]:=
ResourceFunction["WordAffixStructure"]["arthritides"]
Out[4]=

Options (1) 

The OutputFormat option changes the way the structure is returned:

In[5]:=
settings = {"LabeledParts", "NestedLists", "MarkupString", "NestedBoxes"};
Column[Table[
  ResourceFunction["WordAffixStructure"]["envisioning", OutputFormat -> of], {of, settings}]]
Out[6]=
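
The "NestedLists" setting is convenient for further processing, because the parts can be destructured by position. The following is a minimal sketch that assumes the four-slot shape {prefixes, stem, suffixes, inflections} described for "NestedLists". The value of nested is written by hand for illustration and is not verified output of the function.

```wolfram
(* Hand-written value in the assumed "NestedLists" shape *)
nested = {{"un"}, "help", {"ful"}, {}};

(* Destructure the four slots by position *)
{prefixes, stem, suffixes, inflections} = nested;

(* Count all affixes attached to the stem *)
affixCount = Length[prefixes] + Length[suffixes] + Length[inflections]
(* 2 *)

(* Concatenate the parts in order *)
rejoined = StringJoin[Flatten[nested]]
(* "unhelpful" *)
```

Whether the concatenated parts always reproduce the input word exactly is not stated in the documentation, so treat that as an assumption to check on real output.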

Applications (2) 

Analyze the affix proportions in a piece of text (here a quote by Eleanor Roosevelt):

In[7]:=
words = TextWords["The future belongs to those who believe in the beauty of their dreams."];
data = Association[ResourceFunction["WordAffixStructure"][words]];
PieChart[Counts[data], ChartLabels -> Automatic, ImageSize -> Small]
Out[9]=
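
One point to keep in mind when tallying labels: Association keeps only one entry per key, so merging per-word associations counts a part that occurs in several words (such as a shared ending) only once. The sketch below shows an alternative tally that collects the labels word by word before counting. The per-word associations are written by hand for illustration and are not verified output of WordAffixStructure.

```wolfram
(* Hand-written per-word results in the "LabeledParts" shape (illustrative only) *)
perWord = {
   <|"dream" -> "stem", "s" -> "inflection"|>,
   <|"belong" -> "stem", "s" -> "inflection"|>};

(* Merging collapses the repeated key "s" into a single entry *)
merged = Association[perWord];
Length[merged]
(* 3 *)

(* Collecting the labels of each word first keeps every occurrence *)
allLabels = Flatten[Values /@ perWord];
counts = Counts[allLabels]
(* <|"stem" -> 2, "inflection" -> 2|> *)

(* Convert the counts to proportions *)
proportions = N[counts/Total[counts]]
(* <|"stem" -> 0.5, "inflection" -> 0.5|> *)
```

Passing counts to PieChart in place of Counts[data] would then chart every occurrence rather than every distinct part.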

Parse words to learn about roots, prefixes and suffixes:

In[10]:=
words = {"deduction", "introduced", "ductile"};
Column[ResourceFunction["WordAffixStructure"][words, OutputFormat -> "NestedBoxes"]]
Out[11]=

Possible Issues (1) 

Word parts do not always divide cleanly into prefixes, stems and suffixes, and WordAffixStructure tries to make reasonable choices when the division is uncertain. In this example, "demo" could be a prefix or part of the stem, and "crat" could be a suffix or part of the stem. Because every word must have a stem, the function divides the word as shown here:

In[12]:=
ResourceFunction["WordAffixStructure"]["democrat"]
Out[12]=

Publisher

Mark Greenberg

Version History

  • 1.0.1 – 19 January 2022
  • 1.0.0 – 17 January 2020
