
Image Generation

GenAIScript supports LLM providers with OpenAI-compatible image generation APIs.

You will need to configure an LLM provider that supports image generation.

The top-level script (main) cannot be configured to generate an image at the moment; image generation has to be done through a function call to generateImage.

generateImage takes a prompt and returns an image file and an optional revised prompt.

const { image, revisedPrompt } = await generateImage(
    `a cute cat. only one. photographic, high details. 4k resolution.`
)

The image object is an image file that can be passed around for further processing.

env.output.image(image.filename)
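
Because the revised prompt is optional, code that logs or reuses it needs a fallback. The following is a minimal sketch, not part of GenAIScript: describeGeneration is a hypothetical helper, and the result object it receives is assumed to have the same shape as the value destructured from generateImage above.

```javascript
// Hypothetical helper (not a GenAIScript API): summarizes a generateImage result.
// `result` is assumed to look like { image: { filename }, revisedPrompt? }.
function describeGeneration(originalPrompt, result) {
    const { image, revisedPrompt } = result
    // Fall back to the original prompt when no revised prompt was returned.
    const prompt = revisedPrompt ?? originalPrompt
    return {
        filename: image.filename,
        prompt,
        wasRevised: prompt !== originalPrompt,
    }
}
```

In a script, the object returned by generateImage could be passed as the second argument, and the resulting summary written to the output next to the image.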

The generateImage function supports transformation options directly in the options parameter. You can apply transformations like resizing, cropping, rotating, and more during image generation.

const { image } = await generateImage(
    `a landscape photo of mountains`,
    {
        maxWidth: 800,
        maxHeight: 600,
        quality: "high",
        size: "landscape"
    }
)

The same transformation options available for defImages are supported:

  • maxWidth, maxHeight: Resize image to fit within dimensions
  • crop: Crop to specific region { x: 0, y: 0, w: 512, h: 512 }
  • autoCrop: Remove uniform color edges automatically
  • scale: Apply scaling factor (e.g., 0.5 for half size)
  • rotate: Rotate by degrees (e.g., 90)
  • flip: Flip horizontally/vertically { horizontal: true, vertical: true }
  • greyscale: Convert to greyscale
  • mime: Output format ("image/jpeg" or "image/png")
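
To make the arithmetic behind two of these options concrete, here is a small sketch in plain JavaScript. Both functions are hypothetical helpers and not GenAIScript APIs. fitWithin assumes that "fit within dimensions" preserves the aspect ratio and never enlarges the image, which this page does not confirm. centerSquareCrop only builds a region in the { x, y, w, h } shape that the crop option is documented to accept.

```javascript
// Hypothetical helper: dimensions after fitting inside maxWidth x maxHeight.
// Assumption: aspect ratio is preserved and the image is never upscaled.
function fitWithin(width, height, maxWidth, maxHeight) {
    // Largest scale factor that satisfies both limits, capped at 1.
    const scale = Math.min(1, maxWidth / width, maxHeight / height)
    return {
        width: Math.round(width * scale),
        height: Math.round(height * scale),
    }
}

// Hypothetical helper: a centered square region in the { x, y, w, h } crop shape.
function centerSquareCrop(width, height) {
    const side = Math.min(width, height)
    return {
        x: Math.floor((width - side) / 2),
        y: Math.floor((height - side) / 2),
        w: side,
        h: side,
    }
}

console.log(fitWithin(1536, 1024, 800, 600)) // { width: 800, height: 533 }
console.log(centerSquareCrop(1536, 1024)) // { x: 256, y: 0, w: 1024, h: 1024 }
```

The object returned by centerSquareCrop could be passed as the crop option, provided the source dimensions are known in advance.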

The generateImage function supports an “edit” mode that uses OpenAI’s image editing capabilities to modify existing images using text prompts. This mode maps directly to OpenAI’s image edit API.

// First, you need an existing image to edit
const existingImage = env.files.find(f => f.filename.includes("robot.png"))
// Edit the image using AI
const { image } = await generateImage(
    `Add sunglasses to the robot`,
    {
        mode: "edit",
        image: existingImage, // Required for edit mode
        size: "1024x1024"
    }
)

You can optionally provide a mask to specify which parts of the image should be edited:

// maskImage is assumed to be an image file loaded beforehand, like existingImage
const { image } = await generateImage(
    `Make the background a sunset scene`,
    {
        mode: "edit",
        image: existingImage,
        mask: maskImage, // Optional: specifies areas to edit
        quality: "high"
    }
)

Requirements for edit mode:

  • mode: "edit" must be specified
  • image parameter is required (the image to edit)
  • mask parameter is optional (specifies which areas to modify)
  • The edit prompt describes the desired changes
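
The requirements above can be checked before calling generateImage. The following is a minimal sketch of such a pre-flight check; checkEditOptions is a hypothetical helper that GenAIScript does not provide, and the error messages are illustrative only.

```javascript
// Hypothetical pre-flight check (not a GenAIScript API).
// Returns a list of problems; an empty list means the edit-mode
// requirements listed above are met.
function checkEditOptions(prompt, options = {}) {
    const errors = []
    if (options.mode !== "edit") errors.push('mode must be "edit"')
    if (!options.image) errors.push("image is required in edit mode")
    if (typeof prompt !== "string" || prompt.trim() === "")
        errors.push("prompt must describe the desired changes")
    // mask is optional, so its absence is not reported.
    return errors
}
```

A script might call this helper with the same prompt and options it is about to pass to generateImage, and stop early if the returned list is not empty. This also catches the case where the lookup for the source image in env.files found nothing and image is undefined.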