Token Control

Controlling Characters and Tokens

Warning

This feature is experimental and may change in future releases. Use with caution.

POML controls content length through character limits, token limits, and priority-based truncation. These features are particularly useful when working with AI models that have input constraints or when you need to ensure content fits within specific bounds.

Note

Token control is only supported on components rendered with syntax="text" or syntax="markdown".

Character and Token Limits

You can set soft limits on content using charLimit and tokenLimit attributes. When content exceeds these limits, it will be automatically truncated with a marker.

<poml>
  <!-- Limit content to 100 characters -->
  <p charLimit="100">This is a very long paragraph that will be truncated if it exceeds the character limit. The truncation will add a marker to indicate that content was cut off.</p>

  <!-- Limit content to 10 tokens -->
  <p tokenLimit="10">This paragraph will be truncated based on token count rather than character count, which is more accurate for AI model processing.</p>
</poml>

Renders to:

This is a very long paragraph that will be truncated if it exceeds the character limit. The truncati (...truncated)

This paragraph will be truncated based on token count rather (...truncated)

Use writerOptions to customize how content is shortened when it exceeds a limit:

  • truncateMarker: The string to append when content is truncated (default: (...truncated))
  • truncateDirection: Where to truncate the content:
      • "end" (default): Keep the beginning, truncate the end
      • "start": Keep the end, truncate the beginning
      • "middle": Keep both beginning and end, truncate the middle
  • tokenEncodingModel: The model to use for token counting (default: "gpt-4o", which uses o200k_base encoding)

<p charLimit="20" writerOptions='{ "truncateMarker": " [...] ", "truncateDirection": "middle"}'>This is a very long paragraph that will be truncated if it exceeds the character limit. The truncation will add a marker to indicate that content was cut off.</p>

Renders to:

This is a  [...] s cut off.
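
The character-based behavior shown in the two rendered examples can be approximated in a few lines of Python. This is only an illustrative sketch, not POML's implementation: the function name truncate_chars is hypothetical, the default marker is assumed to carry a leading space (as the rendered output suggests), and the handling of "start" (marker placed before the kept text) and of odd limits in "middle" are assumptions that the examples above do not confirm.

```python
def truncate_chars(text: str, limit: int,
                   marker: str = " (...truncated)",
                   direction: str = "end") -> str:
    """Illustrative character truncation (hypothetical helper, not POML's API).

    The marker is added on top of the kept characters; it is not counted
    against the limit, matching the rendered examples in this section.
    """
    if len(text) <= limit:
        return text  # within the limit: nothing to do
    if direction == "end":
        # Keep the beginning, cut the end.
        return text[:limit] + marker
    if direction == "start":
        # Keep the end, cut the beginning (marker position is an assumption).
        return marker + text[len(text) - limit:]
    if direction == "middle":
        # Split the budget between head and tail; giving the extra
        # character to the head for odd limits is an assumption.
        head = (limit + 1) // 2
        tail = limit - head
        return text[:head] + marker + text[len(text) - tail:]
    raise ValueError(f"unknown direction: {direction!r}")


sample = ("This is a very long paragraph that will be truncated if it exceeds "
          "the character limit. The truncation will add a marker to indicate "
          "that content was cut off.")

print(truncate_chars(sample, 100))
print(truncate_chars(sample, 20, marker=" [...] ", direction="middle"))
```

With the sample paragraph, the two calls reproduce the "end" and "middle" outputs shown above. Token-based limits follow the same pattern, except that length is measured in tokens from the configured encoding rather than in characters.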

Note

The default tokenizer for counting tokens is based on js-tiktoken with o200k_base encoding (used in gpt-4o through o3 models). You can customize it by specifying the model name in tokenEncodingModel within writerOptions.

Priority-Based Truncation

The priority attribute allows you to control which content is preserved when space is limited. Lower priority content (lower numbers) will be truncated first.

<poml tokenLimit="40">
  <p priority="1">This content has low priority and may be removed first to save space.</p>

  <p priority="3">This content has high priority and will be preserved longer.</p>

  <p priority="2">This content has medium priority.</p>

  <!-- Content without priority defaults to priority 0 (lowest) -->
  <p>This content will be truncated first since it has no explicit priority.</p>
</poml>

Renders to:

This content has low priority and may be removed first to save space.

This content has high priority and will be preserved longer.

This content has medium priority.

If the token limit is reduced further to 8, only the highest-priority content is kept, and it is itself truncated with a marker:

This content has high priority and will be (...truncated)
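
As a rough model of the behavior above, the following Python sketch drops the lowest-priority paragraphs until the content fits, then truncates whatever is left. It is not POML's implementation: the names count_tokens, truncate_tokens, and fit_by_priority are hypothetical, and the regex tokenizer (words and punctuation marks as separate tokens) is a stand-in for the o200k_base encoding, so token counts will differ from real ones on other inputs.

```python
import re

# Stand-in tokenizer (assumption): each word and each punctuation mark is
# one token. POML itself counts tokens with js-tiktoken, not with this regex.
TOKEN = re.compile(r"\w+|[^\w\s]")
MARKER = " (...truncated)"


def count_tokens(text: str) -> int:
    return len(TOKEN.findall(text))


def truncate_tokens(text: str, limit: int) -> str:
    """Keep the first `limit` tokens and append the marker if anything was cut."""
    matches = list(TOKEN.finditer(text))
    if len(matches) <= limit:
        return text
    cut = matches[limit - 1].end() if limit > 0 else 0
    return text[:cut] + MARKER


def fit_by_priority(children, limit: int) -> str:
    """children: list of (priority, text) pairs in document order."""
    kept = list(children)
    while sum(count_tokens(t) for _, t in kept) > limit:
        levels = {p for p, _ in kept}
        if len(levels) <= 1:
            break  # only one priority level left: fall through to truncation
        lowest = min(levels)
        # Remove every child at the lowest remaining priority level.
        kept = [(p, t) for p, t in kept if p != lowest]
    joined = "\n\n".join(t for _, t in kept)
    return truncate_tokens(joined, limit)


children = [
    (1, "This content has low priority and may be removed first to save space."),
    (3, "This content has high priority and will be preserved longer."),
    (2, "This content has medium priority."),
    (0, "This content will be truncated first since it has no explicit priority."),
]

print(fit_by_priority(children, 40))
print(fit_by_priority(children, 8))
```

With this stand-in tokenizer, a limit of 40 removes only the paragraph without an explicit priority, and a limit of 8 leaves just the priority-3 paragraph, cut after its eighth token. Both results agree with the rendered outputs above, although that agreement depends on the sample text and should not be expected in general.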

Combining Limits and Priority

You can combine charLimit, tokenLimit, and priority in the same document, including on nested components.

Token Calculation Order

Token limits are applied hierarchically from parent to child components. When a parent component has a token limit:

  1. Children are processed first, with their own limits and priorities taken into account.
  2. Children are sorted by priority, and low-priority children are removed entirely if the content exceeds the limit.
  3. The remaining content (including children of equal priority) is truncated if it is still over the limit.
  4. charLimit and tokenLimit are applied after priority-based removal.

This means that, in the example below, the entire list component might be removed if higher-priority content consumes the available tokens.

<poml tokenLimit="40">
  <h priority="5">Critical Section Header</h>

  <p priority="4" charLimit="10">
    Important introduction that should be preserved but can be shortened individually.
  </p>

  <list priority="2">
    <item priority="3">High priority item</item>
    <item priority="1">Lower priority item</item>
    <item>Lowest priority item (no explicit priority)</item>
  </list>

  <p priority="3" tokenLimit="5">Optional additional context that can be truncated aggressively.</p>
</poml>

Renders to:

# Critical Section Header

Important  (...truncated)

Optional additional context that can (...truncated)
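
The order described under Token Calculation Order can be sketched as a small recursive renderer in Python. Everything here is hypothetical and for illustration only: the Node class, the render function, and the regex tokenizer are not part of POML, and Markdown prefixes ("#", "-") are written directly into the text. Because the stand-in tokenizer's counts differ from o200k_base, the root limit is set to 30 rather than 40 so that the same removal is triggered.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

TOKEN = re.compile(r"\w+|[^\w\s]")  # stand-in tokenizer, NOT o200k_base
MARKER = " (...truncated)"


def count_tokens(text: str) -> int:
    return len(TOKEN.findall(text))


def clip_chars(text: str, limit: int) -> str:
    return text if len(text) <= limit else text[:limit] + MARKER


def clip_tokens(text: str, limit: int) -> str:
    matches = list(TOKEN.finditer(text))
    if len(matches) <= limit:
        return text
    cut = matches[limit - 1].end() if limit > 0 else 0
    return text[:cut] + MARKER


@dataclass
class Node:  # hypothetical stand-in for a POML component
    text: str = ""
    priority: int = 0
    char_limit: Optional[int] = None
    token_limit: Optional[int] = None
    sep: str = "\n\n"
    children: List["Node"] = field(default_factory=list)


def render(node: Node) -> str:
    if node.children:
        # Step 1: children are rendered first, with their own limits applied.
        parts = [(c.priority, render(c)) for c in node.children]
        # Step 2: drop the lowest-priority children while over the limit.
        if node.token_limit is not None:
            while sum(count_tokens(t) for _, t in parts) > node.token_limit:
                levels = {p for p, _ in parts}
                if len(levels) <= 1:
                    break
                lowest = min(levels)
                parts = [(p, t) for p, t in parts if p != lowest]
        out = node.sep.join(t for _, t in parts)
    else:
        out = node.text
    # Steps 3 and 4: this node's own limits truncate whatever remains.
    if node.char_limit is not None:
        out = clip_chars(out, node.char_limit)
    if node.token_limit is not None:
        out = clip_tokens(out, node.token_limit)
    return out


doc = Node(token_limit=30, children=[
    Node("# Critical Section Header", priority=5),
    Node("Important introduction that should be preserved but can be "
         "shortened individually.", priority=4, char_limit=10),
    Node(priority=2, sep="\n", children=[
        Node("- High priority item", priority=3),
        Node("- Lower priority item", priority=1),
        Node("- Lowest priority item (no explicit priority)"),
    ]),
    Node("Optional additional context that can be truncated aggressively.",
         priority=3, token_limit=5),
])

print(render(doc))
```

In this sketch the two paragraphs are shortened by their own limits first, the list is then the lowest-priority child of the root and is removed as a whole, and the remaining three blocks fit within the root limit without further truncation.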