Function Repository Resource:

SpeechRecognizeWithTimestamps

Create a timestamped transcript of speech from audio

Contributed by: Jon McLoone

ResourceFunction["SpeechRecognizeWithTimestamps"][audio]

recognizes speech in audio and returns it as a list of {time, string} pairs.

Details and Options

The options Method and TargetDevice from SpeechRecognize are also supported.
Method              Automatic    method for SpeechRecognize to use
TargetDevice        "CPU"        the device on which to perform recognition
"TimeResolution"    5            size of the time increments, in seconds

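As a rough sketch of how the options inherited from SpeechRecognize might be passed, assuming the usual rule syntax for options. The file name speech.wav is a hypothetical placeholder, and the "GPU" setting assumes that a supported GPU is available:

```wolfram
(* "speech.wav" is a hypothetical placeholder for any recording that contains speech *)
audio = Import["speech.wav"];

(* TargetDevice is passed through to SpeechRecognize;
   "GPU" is an assumption about the local hardware, the default is "CPU" *)
ResourceFunction["SpeechRecognizeWithTimestamps"][audio, TargetDevice -> "GPU"]
```
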
Examples

Basic Examples (2) 

Recognize the words in audio together with their start times:

In[1]:=
(* Evaluate this cell to get the example input *) CloudGet["https://www.wolframcloud.com/obj/e46b981f-4a06-4f9c-8dd1-e480a9e12e04"]
In[2]:=
TimelinePlot[Labeled @@@ %]
Out[2]=

Each time is the start time of the corresponding piece of transcript. Time periods that do not contain speech are omitted from the output:

In[3]:=
(* Evaluate this cell to get the example input *) CloudGet["https://www.wolframcloud.com/obj/e56dce17-ae0f-4437-8bcc-dad2138044ec"]
Out[3]=
In[4]:=
TimelinePlot[Labeled @@@ %]
Out[4]=
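Because the result is a list of {time, string} pairs, it can be taken apart with ordinary list operations. The following is a minimal sketch, not taken from the source notebook; the file name speech.wav and the variable names audio and result are hypothetical:

```wolfram
(* "speech.wav" is a hypothetical placeholder for any recording that contains speech *)
audio = Import["speech.wav"];
result = ResourceFunction["SpeechRecognizeWithTimestamps"][audio];

(* Second element of each pair: the transcript pieces, joined with spaces *)
StringRiffle[result[[All, 2]], " "]

(* First element of each pair: the start times *)
result[[All, 1]]
```
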

Options (3) 

By default, times are given in 5-second increments:

In[5]:=
(* Evaluate this cell to get the example input *) CloudGet["https://www.wolframcloud.com/obj/b19884a7-0ac9-425e-b8f2-90c67e1dbda5"]
Out[5]=

Different increment sizes can be specified with the option "TimeResolution":

In[6]:=
(* Evaluate this cell to get the example input *) CloudGet["https://www.wolframcloud.com/obj/faea2a92-0bc9-4fc3-b87d-acba278a2d5a"]
Out[6]=

Quantities can be used to specify the resolution:

In[7]:=
(* Evaluate this cell to get the example input *) CloudGet["https://www.wolframcloud.com/obj/33f9af72-e6bb-4dde-ae48-5b59badf9436"]
Out[7]=
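The option settings described in this section might look like the following sketch. The file name speech.wav is a hypothetical placeholder, the values are illustrative only, and the use of seconds as the Quantity unit is an assumption rather than something taken from the example cells:

```wolfram
(* "speech.wav" is a hypothetical placeholder for any recording that contains speech *)
audio = Import["speech.wav"];

(* Numeric setting: 10-second increments instead of the default 5 *)
ResourceFunction["SpeechRecognizeWithTimestamps"][audio, "TimeResolution" -> 10]

(* Quantity setting; the choice of unit here is an assumption *)
ResourceFunction["SpeechRecognizeWithTimestamps"][audio,
 "TimeResolution" -> Quantity[10, "Seconds"]]
```
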

Publisher

Jon McLoone

Version History

  • 1.0.0 – 22 September 2023
