Audio Transcription

GenAIScript supports transcription and translation through OpenAI-like APIs.

const { text } = await transcribe("video.mp4")

The transcription API automatically uses ffmpeg to convert videos to audio files (Opus codec in an Ogg container).

You need to install ffmpeg on your system. If the FFMPEG_PATH environment variable is set, GenAIScript will use it as the full path to the ffmpeg executable. Otherwise, it will attempt to call ffmpeg directly (so it should be in your PATH).
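
The lookup described above can be sketched roughly as follows. This is an illustration only, not GenAIScript's actual implementation: `resolveFfmpegPath` and `buildConvertArgs` are hypothetical helper names, and the ffmpeg flags shown are just one plausible way to produce Opus audio in an Ogg container.

```javascript
// Hypothetical helper: pick the ffmpeg executable the way the text describes.
// FFMPEG_PATH wins when set; otherwise rely on "ffmpeg" being found in PATH.
function resolveFfmpegPath(env = process.env) {
    const custom = (env.FFMPEG_PATH || "").trim()
    return custom || "ffmpeg"
}

// Hypothetical helper: arguments for a video-to-audio conversion.
// -vn drops the video stream, -c:a libopus encodes the audio with Opus,
// and the .ogg extension selects the Ogg container.
// These flags are an assumption about a reasonable command line.
function buildConvertArgs(input, output = "audio.ogg") {
    return ["-i", input, "-vn", "-c:a", "libopus", output]
}

console.log(resolveFfmpegPath(), buildConvertArgs("video.mp4").join(" "))
```

Running the printed command by hand is a quick way to check that ffmpeg is installed and reachable before calling transcribe.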

By default, the API uses the transcription model alias to transcribe the audio. You can also specify a different model alias using the model option.

const { text } = await transcribe("...", { model: "openai:whisper-1" })

For models that support it, you can retrieve the individual segments.

const { segments } = await transcribe("...")
for (const segment of segments) {
    const { start, text } = segment
    console.log(`[${start}] ${text}`)
}
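
Because only some models return segments, a script may want to fall back to the full text. The sketch below assumes that `segments` is undefined or empty when the model does not support them; that is an assumption, not documented behavior. `toLines` is a hypothetical helper, and the objects passed to it are mock results standing in for the value returned by transcribe.

```javascript
// Hypothetical helper: one printable line per segment, or the whole
// transcript as a single line when no segments are available.
// Assumes `segments` is undefined or empty for models without segment support.
function toLines(result) {
    const { text = "", segments } = result
    if (!segments || segments.length === 0) return text ? [text] : []
    return segments.map((segment) => `[${segment.start}] ${segment.text}`)
}

// Mock results (not real transcription output)
const withSegments = {
    text: "hello world",
    segments: [
        { start: 0, text: "hello" },
        { start: 1.5, text: "world" },
    ],
}
console.log(toLines(withSegments))
console.log(toLines({ text: "hello world" }))
```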

GenAIScript renders the segments to SRT and WebVTT formats as well.

const { srt, vtt } = await transcribe("...")
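
To make the SRT output concrete, here is a rough sketch of how segments could be rendered by hand. It assumes each segment carries `start` and `end` in seconds; both the `end` field and the unit are assumptions. `toSrtTime` and `toSrt` are hypothetical names and are not part of GenAIScript, which already does this rendering for you.

```javascript
// Hypothetical helper: seconds -> "HH:MM:SS,mmm", the timestamp form used by SRT.
function toSrtTime(seconds) {
    const ms = Math.round(seconds * 1000)
    const pad = (n, width = 2) => String(n).padStart(width, "0")
    const h = Math.floor(ms / 3600000)
    const m = Math.floor((ms % 3600000) / 60000)
    const s = Math.floor((ms % 60000) / 1000)
    return `${pad(h)}:${pad(m)}:${pad(s)},${pad(ms % 1000, 3)}`
}

// Hypothetical helper: numbered cues separated by blank lines.
function toSrt(segments) {
    const cues = segments.map(
        (seg, i) =>
            `${i + 1}\n${toSrtTime(seg.start)} --> ${toSrtTime(seg.end)}\n${seg.text.trim()}`
    )
    return cues.join("\n\n") + "\n"
}

// Mock segments (not real transcription output)
console.log(
    toSrt([
        { start: 0, end: 1.5, text: "hello" },
        { start: 61.25, end: 63, text: "world" },
    ])
)
```

WebVTT differs mainly in its `WEBVTT` header line and in using a period instead of a comma before the milliseconds.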

Some models also support transcribing and translating to English in one pass. To enable this, set the translate option to true.

const { srt } = await transcribe("...", { translate: true })

You can cache the transcription results by setting the cache option to true (or a custom name).

const { srt } = await transcribe("...", { cache: true })

You can also pass a string, which is used as a custom cache salt.

const { srt } = await transcribe("...", { cache: "whisper" })

You can parse VTT and SRT files using the parsers.transcription function.

const segments = parsers.transcription("WEBVTT...")
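
For intuition about what such a parser does, here is a deliberately simplified cue reader. It is not the parsers.transcription implementation: `parseCues` and `toSeconds` are hypothetical names, the returned shape (`start` and `end` in seconds, plus `text`) is an assumption, and features such as NOTE blocks and styling are ignored.

```javascript
// Hypothetical helper: "HH:MM:SS.mmm" or "MM:SS.mmm" -> seconds.
// SRT uses a comma before the milliseconds, so it is normalized to a period.
function toSeconds(stamp) {
    return stamp
        .split(":")
        .reduce((total, part) => total * 60 + Number(part.replace(",", ".")), 0)
}

// Hypothetical, simplified parser: cues are blocks separated by blank lines,
// and a block counts as a cue only if one of its lines contains "-->".
function parseCues(source) {
    const segments = []
    for (const block of source.replace(/\r\n?/g, "\n").split(/\n{2,}/)) {
        const lines = block.split("\n")
        const timing = lines.findIndex((line) => line.includes("-->"))
        if (timing < 0) continue // header, NOTE, or other non-cue block
        const [start, rest] = lines[timing].split("-->")
        const end = rest.trim().split(/\s+/)[0] // drop cue settings, if any
        const text = lines.slice(timing + 1).join("\n").trim()
        segments.push({ start: toSeconds(start.trim()), end: toSeconds(end), text })
    }
    return segments
}

// Made-up sample input for illustration
const sample =
    "WEBVTT\n\n00:00:00.000 --> 00:00:01.500\nhello\n\n00:01:01.250 --> 00:01:03.000\nworld\n"
console.log(parseCues(sample))
```

In a real script, prefer parsers.transcription, which handles the formats properly.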