autogen_ext.agents.video_surfer.tools

extract_audio(video_path: str, audio_output_path: str) → str

Extracts audio from a video file and saves it as an MP3 file.

Parameters:
  • video_path – Path to the video file.

  • audio_output_path – Path to save the extracted audio file.

Returns:

Confirmation message with the path to the saved audio file.
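
Since the audio is written as MP3, it can be convenient to derive `audio_output_path` from the video path. The following is a minimal sketch; `default_audio_path` is a hypothetical helper for illustration and is not part of this module.

```python
from pathlib import Path


def default_audio_path(video_path: str) -> str:
    """Return the video path with its extension replaced by .mp3 (hypothetical helper)."""
    return str(Path(video_path).with_suffix(".mp3"))


# "talk.mp4" is an example file name, not a file shipped with the library.
audio_path = default_audio_path("talk.mp4")
```

The resulting string could then be passed as the `audio_output_path` argument of `extract_audio`.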

get_screenshot_at(video_path: str, timestamps: List[float]) → List[Tuple[float, ndarray[Any, Any]]]

Captures screenshots at the specified timestamps and returns them as Python objects.

Parameters:
  • video_path – Path to the video file.

  • timestamps – List of timestamps in seconds.

Returns:

List of tuples containing timestamp and the corresponding frame (image). Each frame is a NumPy array (height x width x channels).
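
The text above does not say how a timestamp is resolved to a frame, or what happens when a timestamp lies past the end of the video. The sketch below shows one common convention (multiply by the frame rate, truncate, and clamp to the valid range). It is an assumption for illustration, not necessarily what the library does; `timestamps_to_frame_indices`, `fps`, and `frame_count` are hypothetical names.

```python
from typing import List


def timestamps_to_frame_indices(
    timestamps: List[float], fps: float, frame_count: int
) -> List[int]:
    """Map timestamps in seconds to frame indices, clamped to [0, frame_count - 1]."""
    if fps <= 0 or frame_count <= 0:
        raise ValueError("fps and frame_count must be positive")
    indices = []
    for t in timestamps:
        index = int(t * fps)  # truncate toward zero
        indices.append(min(max(index, 0), frame_count - 1))
    return indices


# A 10-second clip at 30 frames per second has 300 frames (indices 0..299).
frame_indices = timestamps_to_frame_indices([0.0, 1.5, 100.0], fps=30.0, frame_count=300)
```

Checking timestamps against the video length before requesting screenshots avoids relying on any particular out-of-range behavior.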

get_video_length(video_path: str) → str

Returns the length of the video in seconds.

Parameters:

video_path – Path to the video file.

Returns:

Duration of the video in seconds, returned as a string.
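
Because the return type is `str` rather than a number, calling code that needs a numeric duration has to parse it. The sketch below extracts the first decimal number from a string. The exact wording of the returned string is not specified here, so the example message and the `parse_seconds` helper are assumptions for illustration only.

```python
import re
from typing import Optional


def parse_seconds(message: str) -> Optional[float]:
    """Return the first decimal number found in message, or None if there is none."""
    match = re.search(r"\d+(?:\.\d+)?", message)
    return float(match.group()) if match else None


# Example string only; the real message format is an assumption.
duration = parse_seconds("The video is 12.50 seconds long.")
```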

save_screenshot(video_path: str, timestamp: float, output_path: str) → None

Captures a screenshot at the specified timestamp and saves it to the output path.

Parameters:
  • video_path – Path to the video file.

  • timestamp – Timestamp in seconds.

  • output_path – Path to save the screenshot. The file format is determined by the extension in the path.
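
Since the extension of `output_path` selects the file format, one way to save several screenshots is to generate a path per timestamp with an explicit extension. This is a minimal sketch; `screenshot_path` and its naming scheme are hypothetical and not part of this module.

```python
from pathlib import Path


def screenshot_path(output_dir: str, timestamp: float, extension: str = ".png") -> str:
    """Build a path such as <output_dir>/frame_00012500ms.png (hypothetical naming scheme)."""
    milliseconds = round(timestamp * 1000)
    return str(Path(output_dir) / f"frame_{milliseconds:08d}ms{extension}")


# "shots" is an example directory name.
png_path = screenshot_path("shots", 12.5)
jpg_path = screenshot_path("shots", 12.5, extension=".jpg")
```

Each generated string could then be passed as the `output_path` argument of `save_screenshot`.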

transcribe_audio_with_timestamps(audio_path: str) → str

Transcribes the audio file with timestamps using the Whisper model.

Parameters:

audio_path – Path to the audio file.

Returns:

Transcription with timestamps.
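
The layout of the returned string is not described above. For orientation, Whisper's transcription result contains a list of segments, each with `start`, `end`, and `text` fields. The sketch below shows one plausible way to render such segments as timestamped lines. The `format_segments` helper, the line format, and the sample data are illustrative assumptions, not the library's actual output.

```python
from typing import Dict, List, Union

Segment = Dict[str, Union[float, str]]


def format_segments(segments: List[Segment]) -> str:
    """Render segments as 'start - end: text' lines, one segment per line."""
    lines = []
    for seg in segments:
        start = float(seg["start"])
        end = float(seg["end"])
        text = str(seg["text"]).strip()
        lines.append(f"{start:.2f} - {end:.2f}: {text}")
    return "\n".join(lines)


# Hand-written sample segments, not real model output.
sample = [
    {"start": 0.0, "end": 2.4, "text": " Hello and welcome."},
    {"start": 2.4, "end": 5.0, "text": " Let's get started."},
]
transcript = format_segments(sample)
```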

async transcribe_video_screenshot(video_path: str, timestamp: float, model_client: ChatCompletionClient) → str

Transcribes the content of a video screenshot captured at the specified timestamp using the OpenAI API.

Parameters:
  • video_path – Path to the video file.

  • timestamp – Timestamp in seconds.

  • model_client – ChatCompletionClient instance.

Returns:

Description of the screenshot content.
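
Because this function is a coroutine, it has to be awaited, and several timestamps can be described concurrently with `asyncio.gather`. The sketch below takes the describing coroutine as a parameter so that it runs without a model client. `describe_frames` and `fake_describe` are hypothetical names, and whether the real function tolerates concurrent calls is an assumption.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List

# Any coroutine function taking (video_path, timestamp, model_client) and returning str.
Describe = Callable[[str, float, Any], Awaitable[str]]


async def describe_frames(
    describe: Describe, video_path: str, timestamps: List[float], model_client: Any
) -> Dict[float, str]:
    """Await describe() for every timestamp concurrently and key results by timestamp."""
    results = await asyncio.gather(
        *(describe(video_path, t, model_client) for t in timestamps)
    )
    return dict(zip(timestamps, results))


async def fake_describe(video_path: str, timestamp: float, model_client: Any) -> str:
    """Stand-in used only for this demonstration; it makes no model calls."""
    return f"{video_path} at {timestamp:.1f}s"


descriptions = asyncio.run(
    describe_frames(fake_describe, "talk.mp4", [1.0, 2.5], model_client=None)
)
```

In real use, `transcribe_video_screenshot` and a `ChatCompletionClient` instance would take the place of `fake_describe` and `None`.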