Prompt Caching

Prompt caching is a feature that can reduce processing time and cost for prompts with repeated content. It is supported by various LLM providers, but the implementation varies from provider to provider.

You can mark a def section or a $ function with cacheControl set to "ephemeral" to enable prompt caching optimization. This tells the LLM provider that it is acceptable to cache the prompt prefix for a short amount of time.

def("FILE", env.files, { cacheControl: "ephemeral" })
$`Some very cool prompt`.cacheControl("ephemeral")
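
Put together, a minimal script might look like the sketch below (the model identifier is illustrative; use any model supported in your setup). Since prefix-based caching reuses the longest stable prompt prefix, it helps to place the content that does not change between runs first:

script({ model: "openai:gpt-4o" })

// stable context first: the file contents form a reusable prompt prefix
def("FILE", env.files, { cacheControl: "ephemeral" })

// instructions marked as cacheable as well
$`Analyze FILE and summarize the key findings.`.cacheControl("ephemeral")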

In most cases, the ephemeral hint is ignored by LLM providers. However, the following providers support it:

- OpenAI, Azure OpenAI: prompt caching of the prompt prefix is enabled automatically, and all ephemeral annotations are removed from the request.

- Anthropic: the ephemeral annotation is converted into a cache_control: { ... } field in the message object, as sketched below.
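
For Anthropic, the converted message carries the marker on a content block. A rough sketch of what the resulting message object looks like (the text value here is placeholder content):

{
  role: "user",
  content: [
    {
      type: "text",
      text: "FILE: ...",
      // added by the ephemeral annotation
      cache_control: { type: "ephemeral" },
    },
  ],
}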

Note that prompt caching is still marked as beta by some providers and is not supported by all models (especially older ones).