Images

Images can be added to the prompt for models that support this feature (like gpt-4o). Use the defImages function to declare the images. Supported image formats vary by model but typically include PNG, JPEG, WEBP, and GIF. Both local files and URLs are supported.

defImages(env.files)

URLs

Public URLs (those that do not require authentication) are passed directly to OpenAI.

defImages(
    "https://github.com/microsoft/genaiscript/blob/main/docs/public/images/logo.png?raw=true"
)

Local files are loaded and encoded as a data URI.
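To make the data URI encoding concrete, here is a minimal sketch of what such an encoding step could look like in Node.js. The helper names (mimeFromFilename, toDataUri) and the extension table are hypothetical and used only for illustration; they are not part of the GenAIScript API, and the actual implementation may differ.

```javascript
// Hypothetical helpers for illustration; not part of the GenAIScript API.
const MIME_BY_EXTENSION = new Map([
    ["png", "image/png"],
    ["jpg", "image/jpeg"],
    ["jpeg", "image/jpeg"],
    ["webp", "image/webp"],
    ["gif", "image/gif"],
])

// Guess the MIME type from the file extension (case-insensitive).
function mimeFromFilename(filename) {
    const ext = filename.split(".").pop().toLowerCase()
    const mime = MIME_BY_EXTENSION.get(ext)
    if (!mime) throw new Error(`unsupported image type: ${filename}`)
    return mime
}

// Encode raw image bytes (Buffer or Uint8Array) as a base64 data URI.
function toDataUri(bytes, mime) {
    return `data:${mime};base64,${Buffer.from(bytes).toString("base64")}`
}
```

The resulting string has the form data:image/png;base64,... and carries the whole image inline, which is why no public URL is needed for local files.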

Buffer, Blob, ReadableStream

The defImages function also accepts Buffer, Blob, and ReadableStream values.

This example takes a screenshot of bing.com and adds it to the images.

const page = await host.browse("https://bing.com")
const screenshot = await page.screenshot() // returns a Node.js Buffer
defImages(screenshot)

Detail

OpenAI supports a detail field that can be set to “low” or “high”. An image in “low” detail will be downsampled to 512x512 pixels.

defImages(img, { detail: "low" })

Cropping

You can crop a region of interest from the image.

defImages(img, { crop: { x: 0, y: 0, w: 512, h: 512 } })

Auto crop

You can also automatically remove uniform color on the edges of the image.

defImages(img, { autoCrop: true })

Greyscale

You can convert the image to greyscale.

defImages(img, { greyscale: true })

Rotate

You can rotate the image.

defImages(img, { rotate: 90 })

Scale

You can scale the image.

defImages(img, { scale: 0.5 })

Flip

You can flip the image.

defImages(img, { flip: { horizontal: true, vertical: true } })

Max width, max height

You can specify a maximum width, a maximum height, or both. GenAIScript will resize the image to fit within the constraints.

defImages(img, { maxWidth: 800 })
// and / or
defImages(img, { maxHeight: 800 })
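
As an illustration of what fitting into the constraints can mean, the sketch below computes the output size for a given image. It assumes that the aspect ratio is preserved and that images smaller than the limits are left unchanged; neither assumption is confirmed here, and fitWithin is a hypothetical helper, not a GenAIScript function.

```javascript
// Hypothetical helper: compute the size of an image after fitting it within
// optional maxWidth / maxHeight limits.
// Assumptions: the aspect ratio is kept and images are never enlarged.
function fitWithin(width, height, { maxWidth, maxHeight } = {}) {
    // Scale factor required by each constraint (1 when the limit is absent).
    const scaleX = maxWidth ? maxWidth / width : 1
    const scaleY = maxHeight ? maxHeight / height : 1
    // Use the most restrictive factor, capped at 1 so nothing is upscaled.
    const scale = Math.min(1, scaleX, scaleY)
    return {
        width: Math.round(width * scale),
        height: Math.round(height * scale),
    }
}

// A 1600x1200 image limited to maxWidth 800 becomes 800x600.
const resized = fitWithin(1600, 1200, { maxWidth: 800 })
```

When both limits are given, the smaller of the two scale factors wins, so the result satisfies both constraints at once.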