Function Repository Resource:

SpeechRecognizeWithTimestamps

Create a timestamped transcript of speech from audio

Contributed by: Jon McLoone

ResourceFunction["SpeechRecognizeWithTimestamps"][audio]

recognizes speech in audio and returns it as a list of {time, string} pairs.

Details and Options

In addition to "TimeResolution", the options Method and TargetDevice from SpeechRecognize are supported. The available options are:

Method (default: Automatic): method for SpeechRecognize to use

TargetDevice (default: "CPU"): the device on which to perform recognition

"TimeResolution" (default: 5 seconds): size of time increments used

Examples

Basic Examples (2)

Recognize the words and the start times of speech in audio:

In[1]:=

In[2]:=

Out[2]=

Each time is the start time of the corresponding piece of transcript. Time periods that do not contain speech are not included in the output:

In[3]:=

Out[3]=

In[4]:=

Out[4]=
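
The basic usage described above can be sketched as follows. This is a minimal illustration, not a reproduction of the original example cells. It assumes that the built-in example recording ExampleData[{"Audio", "MaleVoice"}] is available; any Audio object containing speech could be substituted. The variable names audio and transcript are chosen here for illustration only.

```wolfram
(* Assumption: a built-in example recording; replace with any Audio object containing speech *)
audio = ExampleData[{"Audio", "MaleVoice"}];

(* Returns a list of {time, string} pairs, one per time increment that contains speech *)
transcript = ResourceFunction["SpeechRecognizeWithTimestamps"][audio]

(* Lay the pairs out as a two-column table: start time, then recognized text *)
Grid[transcript, Alignment -> Left]
```

Because the result is an ordinary list of pairs, it can be passed directly to standard list functions such as Select or Grid.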

Options (3)

By default, times are given in 5-second increments:

In[5]:=

Out[5]=

Different increment sizes can be specified with the option "TimeResolution":

In[6]:=

Out[6]=

The increment size can also be given as a Quantity:

In[7]:=

Out[7]=
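
The option settings discussed in this section might look like the sketch below. It is written under stated assumptions rather than copied from the original cells. The example recording is the built-in ExampleData[{"Audio", "MaleVoice"}], and a plain number given for "TimeResolution" is assumed to be interpreted as seconds. The names audio and recognize are illustrative.

```wolfram
(* Assumption: a built-in example recording; any Audio object containing speech will do *)
audio = ExampleData[{"Audio", "MaleVoice"}];

recognize = ResourceFunction["SpeechRecognizeWithTimestamps"];

(* Default behavior: times reported in 5-second increments *)
recognize[audio]

(* Coarser increments; a bare number is assumed here to mean seconds *)
recognize[audio, "TimeResolution" -> 10]

(* The same resolution given explicitly as a Quantity *)
recognize[audio, "TimeResolution" -> Quantity[10, "Seconds"]]

(* Options inherited from SpeechRecognize can be combined with "TimeResolution" *)
recognize[audio, "TimeResolution" -> Quantity[2, "Seconds"], TargetDevice -> "CPU"]
```

A smaller increment gives more precise start times but splits the transcript into more, shorter pieces.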

Publisher

Jon McLoone

Version History

1.0.0 – 22 September 2023

License Information

This work is licensed under a Creative Commons Attribution 4.0 International License