
Synchronous Speech to Text API

POST https://stt.infer.visai.ai/predict
Header
    X-API-Key string required

    Your API key

form-data body

Request Body
  • files File required

    Raw audio files, sent as multipart form data under the key name files.

    The maximum size for each file is 50 MB.

    Each file should be less than 1 minute long.

Optional

Sent as multipart form data.

    files_speakers file

    Speaker files. A maximum of 5 files can be provided, and each file must not exceed 20 MB.

    boosting_words string

    Enhances recognition accuracy for specific words. A maximum of 10 words can be provided, e.g., สวัสดี.
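
The request parameters above can be put together as in the following minimal sketch. It is not an official client. It assumes the third-party requests library, treats 1 MB as 1024 * 1024 bytes, and sends each boosting word as a repeated boosting_words form field; the documentation does not say how multiple words are encoded, so that part is a guess. The names check_request and transcribe are hypothetical helpers. The 1-minute duration limit is not checked, since that would require decoding the audio.

```python
import os
from contextlib import ExitStack

API_URL = "https://stt.infer.visai.ai/predict"
MAX_AUDIO_BYTES = 50 * 1024 * 1024    # 50 MB per audio file (1 MB = 1024^2 is an assumption)
MAX_SPEAKER_FILES = 5
MAX_SPEAKER_BYTES = 20 * 1024 * 1024  # 20 MB per speaker file
MAX_BOOSTING_WORDS = 10


def check_request(audio_sizes, speaker_sizes=(), boosting_words=()):
    """Return a list of documented-limit violations (empty if none)."""
    problems = []
    if not audio_sizes:
        problems.append("at least one audio file is required")
    for i, size in enumerate(audio_sizes):
        if size > MAX_AUDIO_BYTES:
            problems.append(f"audio file {i} exceeds 50 MB")
    if len(speaker_sizes) > MAX_SPEAKER_FILES:
        problems.append("more than 5 speaker files")
    for i, size in enumerate(speaker_sizes):
        if size > MAX_SPEAKER_BYTES:
            problems.append(f"speaker file {i} exceeds 20 MB")
    if len(boosting_words) > MAX_BOOSTING_WORDS:
        problems.append("more than 10 boosting words")
    return problems


def transcribe(api_key, audio_paths, speaker_paths=(), boosting_words=()):
    """POST the files as multipart form data and return the parsed JSON."""
    import requests  # third-party: pip install requests

    problems = check_request(
        [os.path.getsize(p) for p in audio_paths],
        [os.path.getsize(p) for p in speaker_paths],
        boosting_words,
    )
    if problems:
        raise ValueError("; ".join(problems))

    with ExitStack() as stack:
        # Repeating a key name sends several files under the same field.
        files = [
            ("files", (os.path.basename(p), stack.enter_context(open(p, "rb"))))
            for p in audio_paths
        ]
        files += [
            ("files_speakers", (os.path.basename(p), stack.enter_context(open(p, "rb"))))
            for p in speaker_paths
        ]
        # Assumption: one form field per boosting word.
        data = [("boosting_words", w) for w in boosting_words]
        resp = requests.post(
            API_URL,
            headers={"X-API-Key": api_key},
            files=files,
            data=data,
            timeout=120,
        )
    resp.raise_for_status()
    return resp.json()
```

A call might look like transcribe("YOUR_API_KEY", ["call.wav"], boosting_words=["สวัสดี"]), where the key and file name are placeholders. Checking the limits locally before uploading avoids a round trip for requests that would likely be rejected.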

Responses


Array [
An array of transcription process results.
object
status string

success | failed

The status of the transcription process.

duration string

The total duration of the audio file in seconds (e.g., 20.856).

filename string

File name

result Array [
An array containing the individual transcript segments. Each object in this array represents a segment of the transcription.
object
start_time string

The start time of the segment in the audio file, in seconds (e.g., 1.6382252559726962).

end_time string

The end time of the segment in the audio file, in seconds (e.g., 1.6382252559726962).

speaker string

The identifier of the speaker, in SPEAKER_{number} format.

transcript string

The transcribed text.

]
]
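
As an illustration of the response shape described above, the sketch below turns a parsed response into readable, speaker-labelled lines. The helper name format_transcripts and the sample payload are made up for this example and are built only from the field descriptions; they are not real API output. Numeric fields are documented as strings, so they are converted with float(). Which fields a failed entry carries is not documented, so the code reads them defensively.

```python
def format_transcripts(response):
    """Map each filename to a list of '[start-end] speaker: text' lines.

    Entries whose status is not 'success' map to None.
    """
    formatted = {}
    for item in response:
        name = item.get("filename")
        if item.get("status") != "success":
            formatted[name] = None
            continue
        lines = []
        for seg in item.get("result", []):
            start = float(seg["start_time"])  # seconds, delivered as a string
            end = float(seg["end_time"])
            lines.append(
                f"[{start:.2f}-{end:.2f}] {seg['speaker']}: {seg['transcript']}"
            )
        formatted[name] = lines
    return formatted


# Illustrative payload only; values are invented to match the documented fields.
SAMPLE_RESPONSE = [
    {
        "status": "success",
        "duration": "20.856",
        "filename": "call.wav",
        "result": [
            {
                "start_time": "1.6382252559726962",
                "end_time": "3.5",
                "speaker": "SPEAKER_1",
                "transcript": "สวัสดี",
            }
        ],
    },
    {"status": "failed", "filename": "broken.wav"},
]

for filename, lines in format_transcripts(SAMPLE_RESPONSE).items():
    print(filename, lines if lines is not None else "transcription failed")
```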