Structured Outputs

GenAIScript supports the generation of structured outputs with automatic data repairs. It can leverage the built-in schema validation of LLM providers or execute its own validation as needed.

Structured outputs are configured through two options: responseType, which controls the data format, and responseSchema, which controls the data structure.

Response Type

The response type is controlled by the responseType optional argument and has the following options:

  • json: tell the LLM to produce valid JSON output.
  • yaml: tell the LLM to produce valid YAML output.
  • json_object: use the built-in OpenAI JSON output.
  • json_schema: use the built-in OpenAI JSON output with JSON schema validation.

The text and markdown values are also supported for configuring the LLM output.

json

In this mode, GenAIScript prompts the LLM to produce valid JSON output. It also validates the output and attempts to repair it if it is not valid. This mode is implemented by GenAIScript and does not rely on support from the LLM provider.

script({
    responseType: "json",
})

The schema validation is applied if the responseSchema is provided.
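
To give a feel for what validating and repairing JSON output can involve, here is a minimal TypeScript sketch. It is an illustration only and not GenAIScript's actual repair logic: the parseLenientJSON helper and its two repair steps (unwrapping a markdown code fence and dropping trailing commas) are assumptions made for this example.

```typescript
// Hypothetical helper, NOT GenAIScript's actual repair implementation.
// The fence marker is built at runtime so it can be matched in a pattern.
const FENCE = "`".repeat(3)
const fencePattern = new RegExp(
    "^" + FENCE + "(?:json)?\\s*\\n([\\s\\S]*?)\\n" + FENCE + "$"
)

// Parses an LLM reply that is expected to contain JSON, tolerating
// a surrounding markdown code fence and trailing commas.
function parseLenientJSON(reply: string): unknown {
    let text = reply.trim()
    // Step 1: unwrap a markdown code fence if the model added one.
    const inner = fencePattern.exec(text)?.[1]
    if (inner !== undefined) text = inner
    try {
        return JSON.parse(text)
    } catch {
        // Step 2 (naive repair): drop trailing commas before } or ].
        // Assumption: no string value contains a comma followed by } or ].
        const repaired = text.replace(/,\s*([}\]])/g, "$1")
        // Throws a SyntaxError if the text still is not valid JSON.
        return JSON.parse(repaired)
    }
}
```

A real repair pass has to handle many more cases (truncated output, unquoted keys, comments). The sketch only shows the general shape: try a strict parse first, then apply targeted fixes and parse again.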

yaml

In this mode, GenAIScript prompts the LLM to produce valid YAML output. It also validates the output and attempts to repair it if it is not valid. This mode is implemented by GenAIScript and does not rely on support from the LLM provider.

script({
    responseType: "yaml",
})

The schema validation is applied if the responseSchema is provided.

json_object

In this mode, GenAIScript prompts the LLM to produce valid JSON output. It also validates the output and attempts to repair it if it is not valid. This mode relies on built-in support from the LLM provider, such as OpenAI.

script({
    responseType: "json_object",
})

json_schema

Structured output is a feature that lets you generate data that conforms to a JSON schema. This mode is stricter than json_object.

To enable this mode, set responseType to json_schema and provide a responseSchema object.

script({
    responseType: "json_schema",
    responseSchema: {
        type: "object",
        properties: {
            name: { type: "string" },
            age: { type: "number" },
        },
        required: ["name", "age"],
    },
})

Note that this mode restricts which schema features are supported:

  • additionalProperties: true is not supported.
  • all optional fields (i.e. those not listed in required) are still returned and might be null.
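
One way to make a schema reflect these two restrictions explicitly is to rewrite it before use. The TypeScript sketch below is an illustration under assumptions, not part of GenAIScript: toStrictSchema, withNull, and the reduced JSONSchema type are hypothetical names introduced for this example. It disallows additional properties and turns each optional field into a required field that also accepts null.

```typescript
// Minimal subset of JSON schema used by this sketch (assumption).
type JSONSchema = {
    type?: string | string[]
    description?: string
    properties?: Record<string, JSONSchema>
    required?: string[]
    items?: JSONSchema
    additionalProperties?: boolean
}

// Adds "null" to a type (or list of types) unless it is already there.
function withNull(type: string | string[]): string[] {
    const types = Array.isArray(type) ? type : [type]
    return types.includes("null") ? types : [...types, "null"]
}

// Hypothetical helper: returns a copy of the schema where every object
// forbids additional properties and optional fields are required-but-nullable.
function toStrictSchema(schema: JSONSchema): JSONSchema {
    if (schema.type === "array" && schema.items)
        return { ...schema, items: toStrictSchema(schema.items) }
    if (schema.type !== "object" || !schema.properties) return schema

    const required = new Set(schema.required ?? [])
    const properties: Record<string, JSONSchema> = {}
    for (const [key, value] of Object.entries(schema.properties)) {
        const strict = toStrictSchema(value)
        // Fields that were optional keep their type but also accept null.
        properties[key] =
            required.has(key) || strict.type === undefined
                ? strict
                : { ...strict, type: withNull(strict.type) }
    }
    return {
        ...schema,
        properties,
        required: Object.keys(properties),
        additionalProperties: false,
    }
}
```

With this shape, code that consumes the output should treat a null value the same way it would treat a missing optional field.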

Response Schema

You can specify a schema through responseSchema which will automatically turn on the structured output mode. The output will be validated against the schema, and GenAIScript will attempt to repair the output if it is not valid. The script will fail if the output does not match the schema.

script({
    responseType: "json",
    responseSchema: {
        type: "object",
        properties: {
            name: { type: "string" },
            age: { type: "number" },
        },
        required: ["name", "age"],
    },
})
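
To illustrate what matching the schema means for the example above, here is a small self-contained TypeScript sketch. It is not GenAIScript's validator: validateFlat, FlatObjectSchema, and personSchema are hypothetical names, and the check only covers flat objects with primitive property types.

```typescript
type PrimitiveType = "string" | "number" | "boolean"

// Simplified schema shape for this sketch: a flat object of primitives.
interface FlatObjectSchema {
    type: "object"
    properties: Record<string, { type: PrimitiveType }>
    required?: string[]
}

// Same structure as the responseSchema shown above.
const personSchema: FlatObjectSchema = {
    type: "object",
    properties: {
        name: { type: "string" },
        age: { type: "number" },
    },
    required: ["name", "age"],
}

// Returns a list of human-readable errors; an empty list means valid.
function validateFlat(data: unknown, schema: FlatObjectSchema): string[] {
    if (typeof data !== "object" || data === null || Array.isArray(data))
        return ["expected an object"]
    const record = data as Record<string, unknown>
    const errors: string[] = []
    // Every required property must be present.
    for (const key of schema.required ?? [])
        if (!(key in record)) errors.push(`missing required property "${key}"`)
    // Every present property must have the declared primitive type.
    for (const [key, prop] of Object.entries(schema.properties))
        if (key in record && typeof record[key] !== prop.type)
            errors.push(`property "${key}" should be a ${prop.type}`)
    return errors
}
```

For instance, validateFlat({ name: "Ada", age: "36" }, personSchema) reports that age should be a number, which is the kind of mismatch that triggers a repair attempt and, if the repair does not succeed, a script failure.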

Inlined schemas

Note that the options described above apply to the entire output of a chat. You can also use inlined schemas to produce mixed markdown and data output, which GenAIScript will parse.

Choices

If you are building an LLM-as-a-Judge and only need outputs drawn from a fixed set of words, consider using choices to increase the probability that the model generates the specified words.

cast

The cast function is a runtime helper to convert unstructured text/images into structured data.

import { cast } from "genaiscript/runtime"

const { data } = await cast((_) => _.defImages(images), {
    type: "object",
    properties: {
        keywords: {
            type: "array",
            items: {
                type: "string",
                description: "Keywords describing the objects on the image",
            },
        },
    },
    required: ["keywords"],
})