Function Repository Resource:

ElevenLabsHighlightSpeak

Dynamically highlight content as it is spoken

Contributed by: Bob Sandheinrich

ResourceFunction["ElevenLabsHighlightSpeak"][text]

creates an interface highlighting words in text as they are spoken.

ResourceFunction["ElevenLabsHighlightSpeak"][list]

attempts to intelligently speak the elements of list while highlighting parts.

ResourceFunction["ElevenLabsHighlightSpeak"][expr,prop]

returns the specified property.

Details and Options

ElevenLabsHighlightSpeak requires an ElevenLabs service connection with a valid API key. Free trial keys are available by registering with ElevenLabs.
For non-string inputs, ResourceFunction["ElevenLabsHighlightSpeak"] requires access to an LLM. By default, this is enabled by LLM Kit.
Values supported for the property prop include:
"List"         list with an AudioStream and dynamically highlighted content
"Interface"    content along with a button for playing the speech
ResourceFunction["ElevenLabsHighlightSpeak"][expr] is equivalent to ResourceFunction["ElevenLabsHighlightSpeak"][expr,"Interface"].
ResourceFunction["ElevenLabsHighlightSpeak"] supports the following options:
"HighlightStyle"        Background -> Green    style specification for the highlighted content
"HighlightSize"         "Word"                 granularity to highlight within text
"SpokenStringMethod"    Automatic              how to convert expressions into strings
LLMEvaluator            $LLMEvaluator          service for generating spoken strings with "SpokenStringMethod" -> "LLM"
"CacheAudio"            True                   whether to cache audio generation
"HighlightSize" supports "Word" or "Character".
"SpokenStringMethod" accepts the following values:
SpokenString            local text creation with SpokenString
{"SplitList",f}         applies f only to non-string components
"LLM"                   uses LLMFunction to create spoken strings
f                       arbitrary function
The default "SpokenStringMethod" -> Automatic is equivalent to "SpokenStringMethod" -> {"SplitList","LLM"}.
In the "Interface" result, the play button becomes a pause button during playback.
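
The "SpokenStringMethod" values listed above can be compared side by side. The following is a minimal sketch, not taken from the original notebook: it assumes an ElevenLabs connection is already set up, that a custom function f receives the expression and returns a string, and the helper name toInputString is hypothetical.

```wolfram
(* Local conversion with the built-in SpokenString; no LLM is used *)
ResourceFunction["ElevenLabsHighlightSpeak"][
 a x^2/3,
 "SpokenStringMethod" -> SpokenString]

(* Explicit form of the default: an LLM converts only the non-string elements *)
ResourceFunction["ElevenLabsHighlightSpeak"][
 {"The quadratic term is ", a x^2/3},
 "SpokenStringMethod" -> {"SplitList", "LLM"}]

(* Hypothetical custom converter (an assumption for illustration):
   speak the InputForm text of each non-string element *)
toInputString = ToString[#, InputForm] &;
ResourceFunction["ElevenLabsHighlightSpeak"][
 {"The quadratic term is ", a x^2/3},
 "SpokenStringMethod" -> {"SplitList", toInputString}]
```

Choosing SpokenString or a custom function avoids the LLM requirement for non-string inputs, at the cost of less natural phrasing.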

Examples

Basic Examples (4) 

Create an interface for playing spoken text while it is highlighted:

In[1]:=
ResourceFunction[
 "ElevenLabsHighlightSpeak"]["Here is my first example"]
Out[1]=

Non-string expressions are highlighted as a whole:

In[2]:=
ResourceFunction[
 "ElevenLabsHighlightSpeak"][(-b \[PlusMinus] Sqrt[b^2 - 4 a c])/(2 a)
 ]
Out[2]=

Include a mix of text and mathematical expressions:

In[3]:=
ResourceFunction[
 "ElevenLabsHighlightSpeak"][{"this is a list with math ", a x^2/3, ". here is more text, and more math: ", HoldForm[1 + 2 + 3]}]
Out[3]=

Create an output containing text:

In[4]:=
ResourceFunction["RandomText"][5]
Out[4]=
In[5]:=
cell = PreviousCell[]
Out[5]=

Use the CellObject as an input:

In[6]:=
ResourceFunction["ElevenLabsHighlightSpeak"][cell]
Out[6]=

Use the Cell expression:

In[7]:=
cellexpr = NotebookRead@cell
Out[7]=
In[8]:=
ResourceFunction["ElevenLabsHighlightSpeak"][cellexpr]
Out[8]=

Scope (2) 

Get the audio stream and highlight content without a pre-built interface:

In[9]:=
{stream, content} = ResourceFunction["ElevenLabsHighlightSpeak"][
  "Here is the stream and content to work with directly", "List"]
Out[9]=

Play the stream to see the highlighting:

In[10]:=
AudioPlay[stream]
Out[10]=
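
The stream and content returned by "List" can also be placed in a custom layout. This is a minimal sketch, assuming that content displays directly inside Column and that AudioPlay, AudioPause and AudioStop act on the returned AudioStream as they do on any other stream. The button arrangement is illustrative only and is not the function's built-in interface.

```wolfram
(* Request the raw parts instead of the pre-built interface *)
{stream, content} = ResourceFunction["ElevenLabsHighlightSpeak"][
   "Here is the stream and content to work with directly", "List"];

(* Show the highlighted content above a row of playback controls *)
Column[{
  content,
  Row[{
    Button["Play", AudioPlay[stream]],
    Button["Pause", AudioPause[stream]],
    Button["Stop", AudioStop[stream]]
    }]
  }]
```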

Speak a combination of text and code:

In[11]:=
ResourceFunction[
 "ElevenLabsHighlightSpeak"][{"This works: ", HoldForm[Table[Nest[f, x, i], {i, 0, 4}]], " But, this is better: ",
   HoldForm[NestList[f, x, 4]]}]
Out[11]=

Options (7) 

ElevenLabsParameters (2) 

Choose a voice from the ElevenLabs service connection:

In[12]:=
voice = RandomChoice[ServiceExecute["ElevenLabs", "ListVoices"]]
Out[12]=

Speak a combination of text and code using the selected voice:

In[13]:=
ResourceFunction[
 "ElevenLabsHighlightSpeak"][{"I wrote this code: ", HoldForm[NestList[f, x, 4]]}, "ElevenLabsParameters" -> {"Voice" -> voice}]
Out[13]=

HighlightStyle (1) 

Control the highlight styling:

In[14]:=
ResourceFunction[
 "ElevenLabsHighlightSpeak"]["Make these words red as you read them", "HighlightStyle" -> Red]
Out[14]=

HighlightSize (2) 

Highlight each character instead of each word in the text:

In[15]:=
ResourceFunction[
 "ElevenLabsHighlightSpeak"]["Show each character as it is read", "HighlightSize" -> "Character"]
Out[15]=

Also set the styling to something weird:

In[16]:=
ResourceFunction[
 "ElevenLabsHighlightSpeak"]["Show each character as it is read", "HighlightStyle" -> {Large, Italic, Underlined, Purple}, "HighlightSize" -> "Character"]
Out[16]=

CacheAudio (2) 

By default, audio responses are cached in memory, so repeated requests return quickly:

In[17]:=
AbsoluteTiming@
 ResourceFunction["ElevenLabsHighlightSpeak"][
  "See how long it takes to run this", "CacheAudio" -> True]
Out[17]=
In[18]:=
AbsoluteTiming@
 ResourceFunction["ElevenLabsHighlightSpeak"][
  "See how long it takes to run this", "CacheAudio" -> True]
Out[18]=

Turn off the caching:

In[19]:=
AbsoluteTiming@
 ResourceFunction["ElevenLabsHighlightSpeak"][
  "See how long it takes to run this", "CacheAudio" -> False]
Out[19]=

Possible Issues (2) 

Cell content that is hard to read as natural language gives strange results:

In[20]:=
RandomImage[]
Out[20]=
In[21]:=
ResourceFunction["ElevenLabsHighlightSpeak"][PreviousCell[]]
Out[21]=

Large content is summarized automatically to save time and cost:

In[22]:=
AbsoluteTiming[{stream, display} = ResourceFunction["ElevenLabsHighlightSpeak"][RandomImage[], "List"];]
Out[22]=
In[23]:=
display
Out[23]=

The spoken text is usually not helpful:

In[24]:=
AudioPlay[stream]
Out[24]=

Version History

  • 1.0.0 – 10 January 2025

Author Notes

Many improvements are possible, including chunking of inputs and prettier styling of highlighted text. I hope to improve this in the future and possibly generalize it to other text-to-speech services. Currently, ElevenLabs gives the best timestamp information through the service connection.
