Function Repository Resource:

KeywordPlot

Plot the density of keywords in a piece of text

Contributed by: Vitaliy Kaurov and Michael Sollami

ResourceFunction["KeywordPlot"][text,keywords]

generates a plot of the density of each keyword within the given text.

ResourceFunction["KeywordPlot"][keywords]

represents an operator form of ResourceFunction["KeywordPlot"] that can be applied to any string.

Details and Options

The argument text must be a string.
The argument keywords may be a single string or list of strings.
ResourceFunction["KeywordPlot"] accepts the standard plotting options along with the following:
"PlotFunction"    Automatic    specify which plotting function to use; currently only SmoothHistogram is supported
"TopN"            All          specify to plot only the top‐n most frequent keywords

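As a rough illustration of what "density" means here, the positions of each keyword in the text can be collected and passed to SmoothHistogram. This is a minimal sketch, assuming the density is estimated from the start positions of the matches; it is not the function's actual implementation, and the names positions and found are illustrative only:

```wolfram
text = "aaaaaaabbbbbbbbbbbbbbccc";
keywords = {"a", "b", "c"};

(* start position of every match of each keyword (assumed approach) *)
positions = AssociationMap[StringPosition[text, #][[All, 1]] &, keywords];

(* keep only keywords that occur at least once *)
found = Select[positions, Length[#] > 0 &];

(* one smoothed density curve per remaining keyword *)
SmoothHistogram[Values[found], PlotLegends -> Keys[found]]
```
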
Examples

Basic Examples (3) 

Plot the density of characters in a string:

In[1]:=
ResourceFunction["KeywordPlot"][{"a", "b", "c"}, ImageSize -> 300]@"aaaaaaabbbbbbbbbbbbbbccc"
Out[1]=

Plot the density of certain keywords in "Alice in Wonderland":

In[2]:=
text = ExampleData[{"Text", "AliceInWonderland"}];
keywords = {"rabbit", "cat", "Queen"};
ResourceFunction["KeywordPlot"][text, keywords, PlotLegends -> Placed[keywords, {{.15, .8}}], PlotTheme -> "Business"]
Out[2]=

Plot character mentions through a novel:

In[3]:=
lesmis = Import["http://www.gutenberg.org/files/135/135-0.txt"];
characters = (Last@StringSplit@#["Label"]) & /@ WikidataData[
    ExternalIdentifier["WikidataID", "Q180736", <|"Label" -> "Les Misérables", "Description" -> "1862 Victor Hugo novel"|>], ExternalIdentifier["WikidataID", "P674", <|"Label" -> "characters", "Description" -> "characters which appear in this item (like plays, operas, operettas, books, comics, films, TV series, video games)"|>]];
In[4]:=
Manipulate[Quiet@ResourceFunction["KeywordPlot"][lesmis,
   characters,
   "TopN" -> topN, PlotLabel -> "Top " <> ToString@topN <> " Characters in Les Miserables",
   BaseStyle -> White, PlotTheme -> "Marketing", ImageSize -> 500, PlotRangePadding -> Scaled[.05]], {{topN, 3}, Range[1, 5], ControlType -> SetterBar}]
Out[4]=
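
How "TopN" ranks keywords is not spelled out above. Assuming it ranks by raw occurrence count, a similar selection can be sketched by hand with StringCount and TakeLargest (the variable names counts and top are illustrative):

```wolfram
text = "aaaaaaabbbbbbbbbbbbbbccc";
keywords = {"a", "b", "c"};

(* occurrences of each keyword in the text *)
counts = AssociationMap[StringCount[text, #] &, keywords];

(* the two most frequent keywords, most frequent first *)
top = Keys[TakeLargest[counts, 2]];

(* plot only those keywords *)
ResourceFunction["KeywordPlot"][text, top]
```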

Options (1) 

Specify the plotting function explicitly with the "PlotFunction" option:

In[5]:=
ResourceFunction["KeywordPlot"][{"a", "b", "c"}, ImageSize -> 300, "PlotFunction" -> SmoothHistogram]@"aaaaaaabbbbbbbbbbbbbbccc"
Out[5]=

Possible Issues (2) 

If a keyword is not found in the text, it is automatically dropped from the plot:

In[6]:=
ResourceFunction["KeywordPlot"][{"a", "b", "z"}]@"aaaaaaabbbbbbbbbbbbbbccc"
Out[6]=

If none of the keywords is found, the plot fails:

In[7]:=
ResourceFunction["KeywordPlot"]["e"]@"aaaaaaabbbbbbbbbbbbbbccc"
Out[7]=
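
One way to avoid this failure is to check for matches before plotting. A minimal sketch, assuming a case-sensitive substring test with StringContainsQ is an adequate check; the Missing tag is an arbitrary placeholder, not something the function itself returns:

```wolfram
text = "aaaaaaabbbbbbbbbbbbbbccc";
keywords = {"e", "z"};

(* keep only keywords that appear in the text *)
found = Select[keywords, StringContainsQ[text, #] &];

(* plot only when at least one keyword is present *)
If[found === {},
 Missing["NoKeywordsFound"],
 ResourceFunction["KeywordPlot"][text, found]]
```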

If keyword densities are too low, the plots may look strange and spiky, e.g. the relatively rare mentions of Greek gods in the Latin poem Aeneid:

In[8]:=
aeneid = ExampleData[{"Text", "AeneidEnglish"}];
gods = #["Name"] & /@ EntityList[EntityClass["Mythology", "Greek"]];
In[9]:=
Quiet@ResourceFunction["KeywordPlot"][aeneid, gods, PlotLabel -> "Gods in Aeneid"]
Out[9]=

Publisher

Michael Sollami

Version History

  • 1.0.0 – 03 April 2020
