Wolfram Function Repository
Instant-use add-on functions for the Wolfram Language
Function Repository Resource:
Create a timestamped transcript of speech from audio
ResourceFunction["SpeechRecognizeWithTimestamps"][audio] recognizes speech in audio and returns it as a list of {time, string} pairs.
Method | Automatic | method for SpeechRecognize to use |
TargetDevice | "CPU" | the device on which to perform recognition |
"TimeResolution" | 5 | size of time increments used |
Recognize the words and the start times of speech in audio:
In[1]:= |
In[2]:= |
Out[2]= |
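As a minimal sketch of this basic call, the following assumes a synthesized test clip. The spoken sentence and the variable names are illustrative only and are not the inputs used in the original example cells; SpeechSynthesize also needs a speech voice to be available on the system.

```wolfram
(* Illustrative input: synthesize a short clip so that no external file is needed *)
audio = SpeechSynthesize["The quick brown fox jumps over the lazy dog."];

(* Returns a list of {time, string} pairs, one per time increment that contains speech *)
transcript = ResourceFunction["SpeechRecognizeWithTimestamps"][audio]
```

The exact strings returned depend on the recognition method in use, so no output is shown here.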
Times represent the start time of each piece of the transcript. Time periods that do not contain speech are omitted from the output:
In[3]:= |
Out[3]= |
In[4]:= |
Out[4]= |
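One way to see the omission of silent periods is to build a clip with a deliberate gap. This is a sketch under assumptions: the phrases, the 10-second gap, and the variable names are hypothetical, and the clip is synthesized rather than taken from the original example.

```wolfram
(* Illustrative input: two spoken phrases separated by 10 seconds of silence *)
first = SpeechSynthesize["This is the first sentence."];
second = SpeechSynthesize["This is the second sentence."];
gap = AudioGenerator["Silence", 10];
audio = AudioJoin[{first, gap, second}];

(* Increments that fall entirely inside the silent gap should not appear in the result *)
ResourceFunction["SpeechRecognizeWithTimestamps"][audio]
```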
By default, times are given in 5-second increments:
In[5]:= |
Out[5]= |
Different increment sizes can be specified with the option "TimeResolution":
In[6]:= |
Out[6]= |
Time quantities can also be used to specify the increment:
In[7]:= |
Out[7]= |
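The two ways of setting the increment can be sketched side by side. The input clip and the chosen values are assumptions for illustration; the bare number is taken to be in seconds, consistent with the default of 5 described above.

```wolfram
(* Illustrative input: a synthesized clip; any Audio object containing speech would do *)
audio = SpeechSynthesize[
   "This is a longer passage of speech, used only to show how the time increments change."];

(* Increment given as a plain number, assumed to be in seconds *)
ResourceFunction["SpeechRecognizeWithTimestamps"][audio, "TimeResolution" -> 10]

(* Increment given as a Quantity with explicit time units *)
ResourceFunction["SpeechRecognizeWithTimestamps"][audio, 
 "TimeResolution" -> Quantity[0.5, "Minutes"]]
```

Larger increments give fewer, longer transcript pieces; smaller increments give finer timestamps at the cost of splitting the text into more fragments.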
This work is licensed under a Creative Commons Attribution 4.0 International License