
Audio Transcription
GenAIScript supports transcription and translation with OpenAI-compatible APIs.
const { text } = await transcribe("video.mp4")
Configuration
The transcription API automatically uses ffmpeg to convert videos to audio files (Opus codec in an Ogg container).
You need to install ffmpeg on your system. If the FFMPEG_PATH
environment variable is set,
GenAIScript will use it as the full path to the ffmpeg executable.
Otherwise, it will attempt to call ffmpeg directly
(so it should be in your PATH).
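The resolution order described above can be sketched as follows (an illustrative approximation, not GenAIScript's actual implementation; `resolveFfmpeg` is a hypothetical helper):

```javascript
// Sketch of how the ffmpeg executable is located:
// FFMPEG_PATH takes precedence; otherwise fall back to the system PATH.
function resolveFfmpeg(env) {
  // When set, FFMPEG_PATH is treated as the full path to the executable.
  if (env.FFMPEG_PATH) return env.FFMPEG_PATH
  // Otherwise call "ffmpeg" directly and rely on PATH lookup.
  return "ffmpeg"
}

console.log(resolveFfmpeg({ FFMPEG_PATH: "/opt/ffmpeg/bin/ffmpeg" })) // → /opt/ffmpeg/bin/ffmpeg
console.log(resolveFfmpeg({})) // → ffmpeg
```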
By default, the API uses the transcription
model alias to transcribe the audio.
You can also specify a different model alias using the model
option.
const { text } = await transcribe("...", { model: "openai:whisper-1" })
Segments
For models that support it, you can retrieve the individual segments.
const { segments } = await transcribe("...")
for (const segment of segments) {
  const { start, text } = segment
  console.log(`[${start}] ${text}`)
}
SRT and VTT
GenAIScript renders the segments to SRT and WebVTT formats as well.
const { srt, vtt } = await transcribe("...")
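To illustrate what the SRT rendering produces, here is a self-contained sketch that formats segments into SRT cues (`toSrtTime` and `segmentsToSrt` are illustrative helpers, not GenAIScript's internal renderer; segments are assumed to carry `start`/`end` in seconds):

```javascript
// Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm
function toSrtTime(seconds) {
  const ms = Math.round(seconds * 1000)
  const h = String(Math.floor(ms / 3600000)).padStart(2, "0")
  const m = String(Math.floor((ms % 3600000) / 60000)).padStart(2, "0")
  const s = String(Math.floor((ms % 60000) / 1000)).padStart(2, "0")
  const frac = String(ms % 1000).padStart(3, "0")
  return `${h}:${m}:${s},${frac}`
}

// Render segments as numbered SRT cues separated by blank lines.
function segmentsToSrt(segments) {
  return segments
    .map((seg, i) => `${i + 1}\n${toSrtTime(seg.start)} --> ${toSrtTime(seg.end)}\n${seg.text}`)
    .join("\n\n")
}

console.log(segmentsToSrt([{ start: 0, end: 2.5, text: "Hello" }]))
// prints:
// 1
// 00:00:00,000 --> 00:00:02,500
// Hello
```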
Translation
Some models also support transcribing and translating to English in a single pass. In this case, set the translate: true flag.
const { srt } = await transcribe("...", { translate: true })
Cache
You can cache the transcription results by setting the cache option to true (or a custom name).
const { srt } = await transcribe("...", { cache: true })
or a custom salt
const { srt } = await transcribe("...", { cache: "whisper" })
VTT, SRT parsers
You can parse VTT and SRT files using the parsers.transcription function.
const segments = parsers.transcription("WEBVTT...")
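To show the kind of structure such a parser recovers, here is a minimal, self-contained WebVTT cue parser (a simplified sketch, not GenAIScript's parsers.transcription; `parseVtt` is a hypothetical helper and many WebVTT features are ignored):

```javascript
// Parse WebVTT text into { start, end, text } cue objects.
// Blocks are separated by blank lines; a cue block contains a "-->" timing line.
function parseVtt(vtt) {
  const segments = []
  for (const block of vtt.split(/\n\n+/)) {
    const lines = block.trim().split("\n")
    const timing = lines.findIndex((l) => l.includes("-->"))
    if (timing < 0) continue // skip the WEBVTT header and note blocks
    const [start, end] = lines[timing].split("-->").map((s) => s.trim())
    segments.push({ start, end, text: lines.slice(timing + 1).join("\n") })
  }
  return segments
}

const segs = parseVtt("WEBVTT\n\n00:00.000 --> 00:02.500\nHello world")
console.log(segs[0].text) // → Hello world
```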