Speech to Text API
Header
Your API key
form-data body
Request Body
- files File required
Raw audio files, sent as multipart form data under the key name files.
Optional parameters, sent as multipart form data
Default value: 1
Number of speakers for diarization, from 1 to 4.
Default value: false
Return the speaking rate of each transcription, for speech fluency analysis.
Default value: false
Return the time offsets of each word recognized in the audio.
Default value: []
List of terminology ['word_1', 'word_2', ...]
Default value: LMBeamSearch
Decoding method: Greedy, BeamSearch, or LMBeamSearch
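The optional parameters above can be validated on the client before the request is sent. The sketch below is a minimal illustration, not the service's official client. The form field names (`num_speakers`, `speaking_rate`, `word_timestamps`, `terms`, `decoder`) are placeholders, because the real field names are not given in this reference, and the string encoding of booleans and lists is also an assumption.

```python
import json
from typing import Dict, List, Optional

# Decoding methods listed in this reference.
ALLOWED_DECODERS = {"Greedy", "BeamSearch", "LMBeamSearch"}


def build_form_fields(
    num_speakers: int = 1,
    speaking_rate: bool = False,
    word_timestamps: bool = False,
    terms: Optional[List[str]] = None,
    decoder: str = "LMBeamSearch",
) -> Dict[str, str]:
    """Validate the optional parameters and return them as form fields.

    All keys in the returned dict are HYPOTHETICAL names; replace them
    with the field names the service actually expects.
    """
    if not 1 <= num_speakers <= 4:
        raise ValueError("num_speakers must be between 1 and 4")
    if decoder not in ALLOWED_DECODERS:
        raise ValueError(f"decoder must be one of {sorted(ALLOWED_DECODERS)}")

    return {
        "num_speakers": str(num_speakers),
        # Assumption: booleans are sent as lowercase strings.
        "speaking_rate": "true" if speaking_rate else "false",
        "word_timestamps": "true" if word_timestamps else "false",
        # Assumption: the terminology list is sent as a JSON array string.
        "terms": json.dumps(terms or [], ensure_ascii=False),
        "decoder": decoder,
    }
```

The returned dictionary would be sent as ordinary form fields in the same multipart request that carries the audio under the key name files, with the API key in the request header.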
- 200
object
success | failed
Status of the request
data object
results Array [
object
File name
predictions Array [
object
Start time in HH:MM:SS.sss format
End time in HH:MM:SS.sss format
Speaker in SPEAKER_{number} format
The transcribed text
Speaking rate of each transcription
word_timestamps (optional) Array [
object
Start time in HH:MM:SS.sss format
End time in HH:MM:SS.sss format
Word in audio file
{
  "status": "success",
  "data": {
    "results": [
      {
        "filename": "record.wav",
        "predictions": [
          {
            "speaker": "SPEAKER_00",
            "transcript": "วิสัย",
            "start_time": "00:00:01.322",
            "end_time": "00:00:02.158"
          },
          {
            "speaker": "SPEAKER_00",
            "transcript": "บริษัทผู้พัฒนาแพลตฟอร์ม",
            "start_time": "00:00:02.721",
            "end_time": "00:00:05.145"
          },
          {
            "speaker": "SPEAKER_00",
            "transcript": "มีเป้าหมายหลักในการเป็นศูนย์กลางการให้บริการ",
            "start_time": "00:00:05.930",
            "end_time": "00:00:09.274"
          },
          {
            "speaker": "SPEAKER_00",
            "transcript": "การทดลองวิจัย",
            "start_time": "00:00:09.803",
            "end_time": "00:00:11.203"
          },
          {
            "speaker": "SPEAKER_00",
            "transcript": "และแลกเปลี่ยนความรู้ด้านปัญญาประดิษฐ์",
            "start_time": "00:00:12.056",
            "end_time": "00:00:14.360"
          }
        ]
      }
    ]
  }
}
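A response of this shape can be flattened into one row per transcribed segment. The sketch below is a minimal illustration that uses only the fields visible in the example response (`status`, `data.results`, `filename`, `predictions`, `speaker`, `transcript`, `start_time`, `end_time`). The helper names `to_seconds` and `flatten_segments` are made up for this example, and the sample is shortened to the first prediction.

```python
from typing import Dict, List


def to_seconds(timestamp: str) -> float:
    """Convert an HH:MM:SS.sss timestamp into seconds."""
    hours, minutes, seconds = timestamp.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def flatten_segments(response: Dict) -> List[Dict]:
    """Return one dict per prediction, with its duration in seconds."""
    if response.get("status") != "success":
        raise ValueError(f"request failed: status={response.get('status')!r}")

    rows = []
    for result in response["data"]["results"]:
        for prediction in result["predictions"]:
            start = to_seconds(prediction["start_time"])
            end = to_seconds(prediction["end_time"])
            rows.append({
                "filename": result["filename"],
                "speaker": prediction["speaker"],
                "transcript": prediction["transcript"],
                "start": start,
                "duration": round(end - start, 3),
            })
    return rows


# Shortened copy of the example response above.
SAMPLE = {
    "status": "success",
    "data": {"results": [{
        "filename": "record.wav",
        "predictions": [{
            "speaker": "SPEAKER_00",
            "transcript": "วิสัย",
            "start_time": "00:00:01.322",
            "end_time": "00:00:02.158",
        }],
    }]},
}

if __name__ == "__main__":
    for row in flatten_segments(SAMPLE):
        print(row["filename"], row["speaker"], row["duration"], row["transcript"])
```

Converting the timestamps to seconds up front keeps later steps, such as sorting segments or summing speaking time per speaker, to plain arithmetic.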