Extended POML File Format Design Specification¶
Status: Under implementation
Overview¶
This document describes the design for an extended POML file format that supports mixed content files - files that can contain both pure text (e.g., Markdown) and POML markup elements seamlessly integrated together.
Current Limitations¶
The current POML implementation treats every file as if it were fully enclosed within <poml>...</poml> tags. Although the outermost <poml>...</poml> pair is optional, the file is always parsed in a single pass by the XML parser. This creates friction when users want to:
- Write primarily text-based documents (like Markdown or Jinja) with occasional POML components
- Avoid escaping characters like < and > in text content
- Gradually migrate existing text files to use POML features
Design Goals¶
- Backward Compatibility: Most existing POML files should continue to work without changes
- Flexibility: Support pure text files with embedded POML elements
- Seamless Integration: Allow switching between text and POML modes within a single file
File Format Specification¶
Extended POML Files¶
Extended POML files can contain:
- Pure Text Content: Regular text content (Markdown, plain text, etc.)
- POML Element Pairs: Any element pair defined in componentDocs.json (e.g., <poml>...</poml>, <p>...</p>, <task>...</task>)
- Mixed Content: Combination of pure text and POML elements
Element Detection¶
The system assumes the whole file is pure text and detects certain parts as POML elements as follows:
- Load the component definitions from componentDocs.json and extract the valid POML component names and their aliases.
- Scan for opening tags that match these components, then scan forward until the corresponding closing tag is found.
- If a special <text>...</text> tag is found within a POML segment, treat its content as pure text and process it with the same rules (the first two steps above).
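The first two detection steps can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: the `ComponentDoc` shape (a `name` plus optional `aliases`) is an assumption about componentDocs.json, and `collectTagNames` and `findOpeningTags` are hypothetical helper names.

```typescript
// Assumed shape of one entry in componentDocs.json; the real schema may differ.
interface ComponentDoc {
  name: string;
  aliases?: string[];
}

// Collect every component name and alias into one lookup set.
function collectTagNames(docs: ComponentDoc[]): Set<string> {
  const tags = new Set<string>();
  for (const doc of docs) {
    tags.add(doc.name);
    for (const alias of doc.aliases ?? []) {
      tags.add(alias);
    }
  }
  return tags;
}

// Report the name and offset of each opening tag that names a known component.
function findOpeningTags(source: string, tags: Set<string>): { name: string; index: number }[] {
  const found: { name: string; index: number }[] = [];
  // "<" followed by a tag name; closing tags ("</...") never match.
  const openTag = /<([A-Za-z][\w.-]*)(?=[\s/>])/g;
  let match: RegExpExecArray | null;
  while ((match = openTag.exec(source)) !== null) {
    const name = match[1] ?? '';
    if (tags.has(name)) {
      found.push({ name, index: match.index });
    }
  }
  return found;
}
```

Tags that are not in the set, such as an HTML `<b>` inside Markdown, are simply skipped and stay part of the surrounding text.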
An example is shown below:
Example 1¶
# My Analysis Document
This is a regular markdown document that explains the task.
<task>
  Analyze the following data and provide insights.
</task>
Here are some key points to consider:
- Data quality
- Statistical significance  
- Business impact
<examples>
  <example>
    <input>Sample data point 1</input>
    <output>Analysis result 1</output>
  </example>
</examples>
## Conclusion
The analysis shows...
Example 2¶
<poml>
  <task>Process the following data</task>
  <text>
    This is **markdown** content that will be processed as pure text.
    - Item 1
    - Item 2
    {{ VARIABLES_WILL_ALSO_SHOWN_AS_IS }}
    <cp caption="Nested POML">This is a nested POML component that will be processed as POML.</cp>
    No POML processing happens here.
  </text>
  <hint>Remember to check the format</hint>
</poml>
There can be some intervening text here as well.
<poml>
  <p>You can add another POML segment here: {{variable_will_be_substituted}}</p>
</poml>
<p>POML elements do not necessarily reside in a <text><poml> (the <poml> here is processed as is.)</text> element.</p>
Escaping Note: To directly show a POML tag in the text, users can use a <text> tag to wrap the content, as shown in the example above. If they want to escape a pair such as <poml>...</poml>, they can escape the opening tag and closing tag respectively, such as <text><poml></text>...<text></poml></text>.
File-level Metadata¶
Metadata is information that is useful when parsing and rendering the file, such as context variables, stylesheets, version information, and file paths.
File-level metadata can be included anywhere in the file in a special <meta> tag. This metadata is processed before any content is parsed.
Architecture Design¶
High-level Processing Pipeline¶
The core of the new architecture is a three-pass process: Segmentation, Metadata Extraction, and Recursive Rendering.
I. Segmentation Pass¶
This initial pass is a crucial preprocessing step that scans the raw file content and partitions it into a hierarchical tree of segments. It does not parse the full XML structure of POML blocks; it only identifies their boundaries.
- Objective: To classify every part of the file as META, POML, or TEXT and build a nested structure.
- Algorithm:
  - Load all valid POML component tag names (including aliases) from componentDocs.json. This set of tags will be used for detection.
  - Initialize the root of the segment tree as a single, top-level TEXT segment spanning the entire file, unless the file consists of a single <poml>...</poml> block spanning the whole file (in which case the root is treated as a POML segment).
  - Use a stack-based algorithm to scan the text.
    - When an opening tag (e.g., <task>) that matches a known POML component is found, push its name and start position onto the stack. This marks the beginning of a potential POML segment.
    - When a closing tag (e.g., </task>) is found that matches the tag at the top of the stack, pop the stack. This marks a complete POML segment. This new segment is added as a child to the current parent segment in the tree.
    - The special <text> tag is handled recursively. If a <text> tag is found inside a POML segment, the scanner will treat its content as a nested TEXT segment. This TEXT segment can, in turn, contain more POML children.
    - Any content not enclosed within identified POML tags remains part of its parent TEXT segment.
  - <meta> tags are treated specially. They are identified and parsed into META segments at any level but are logically hoisted and processed first. They should not have children.
- Output: A Segment tree. For backward compatibility, if the root segment is a single <poml>...</poml> block spanning the whole file, the system can revert to the original, simpler parsing model.
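The segmentation pass could look roughly like the sketch below. It is a simplified illustration rather than the planned implementation: it uses recursion and per-tag depth counting in place of an explicit stack, omits `<meta>` handling and the `id` field, and assumes tags contain no `<` or `>` inside attribute values. The helper names (`makeSegment`, `findClose`, `segmentText`, `segmentPoml`, `segment`) are hypothetical.

```typescript
type Kind = 'META' | 'TEXT' | 'POML';

interface Segment {
  kind: Kind;
  start: number;
  end: number;
  content: string;
  parent?: Segment;
  children: Segment[];
  tagName?: string;
}

function makeSegment(kind: Kind, src: string, start: number, end: number,
                     parent?: Segment, tagName?: string): Segment {
  const seg: Segment = { kind, start, end, content: src.slice(start, end), children: [] };
  if (tagName !== undefined) seg.tagName = tagName;
  if (parent !== undefined) {
    seg.parent = parent;
    parent.children.push(seg);
  }
  return seg;
}

// Offset just past the closing tag that balances the `name` tag opening at `from`,
// or -1 if no balanced closing tag is found before `limit`.
function findClose(src: string, name: string, from: number, limit: number): number {
  const escaped = name.replace(/[.*+?^${}()|[\]\\-]/g, '\\$&');
  const tag = new RegExp(`<(/?)${escaped}(?=[\\s/>])[^<>]*?(/?)>`, 'g');
  tag.lastIndex = from;
  let depth = 0;
  let m: RegExpExecArray | null;
  while ((m = tag.exec(src)) !== null && m.index < limit) {
    if (m[1] === '/') {
      depth -= 1;
      if (depth <= 0) return tag.lastIndex <= limit ? tag.lastIndex : -1;
    } else if (m[2] !== '/') {
      depth += 1; // self-closing tags do not change the depth
    }
  }
  return -1;
}

// Build a TEXT segment over [start, end) and attach each complete POML element in it.
function segmentText(src: string, tags: Set<string>, start: number, end: number,
                     parent?: Segment): Segment {
  const text = makeSegment('TEXT', src, start, end, parent);
  const open = /<([A-Za-z][\w.-]*)(?=[\s/>])[^<>]*?(\/?)>/g;
  open.lastIndex = start;
  let m: RegExpExecArray | null;
  while ((m = open.exec(src)) !== null && m.index < end) {
    const name = m[1] ?? '';
    if (!tags.has(name)) continue;
    const close = m[2] === '/' ? open.lastIndex : findClose(src, name, m.index, end);
    if (close < 0 || close > end) continue; // unmatched tag: leave it as plain text
    segmentPoml(src, tags, makeSegment('POML', src, m.index, close, text, name));
    open.lastIndex = close; // resume scanning after the element
  }
  return text;
}

// Inside a POML segment, turn each <text>...</text> region into a nested TEXT segment.
function segmentPoml(src: string, tags: Set<string>, poml: Segment): void {
  const open = /<text(?=[\s/>])[^<>]*?(\/?)>/g;
  open.lastIndex = poml.start;
  let m: RegExpExecArray | null;
  while ((m = open.exec(src)) !== null && m.index < poml.end) {
    if (m[1] === '/') continue; // a self-closing <text /> has no content
    const close = findClose(src, 'text', m.index, poml.end);
    if (close < 0) continue;
    const innerEnd = src.lastIndexOf('</', close - 1); // start of the closing </text>
    segmentText(src, tags, open.lastIndex, innerEnd, poml);
    open.lastIndex = close;
  }
}

function segment(src: string, tags: Set<string>): Segment {
  const root = segmentText(src, tags, 0, src.length);
  const only = root.children[0];
  // Backward compatibility: a file that is a single <poml> block gets a POML root.
  if (root.children.length === 1 && only !== undefined && only.tagName === 'poml'
      && only.content.length === src.trim().length) {
    delete only.parent;
    return only;
  }
  return root;
}
```

An opening tag without a matching closing tag inside the current region is left as plain text, which is what lets a lone `<poml>` inside a `<text>` block be shown as is.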
Segment Interface: The children property is key to representing the nested structure of mixed-content files.
interface Segment {
  id: string;                      // Unique ID for caching and React keys
  kind: 'META' | 'TEXT' | 'POML';
  start: number;
  end: number;
  content: string;                 // The raw string content of the segment
  parent?: Segment;                 // Reference to the parent segment
  children: Segment[];             // Nested segments (e.g., a POML block within text)
  tagName?: string;                 // For POML segments, the name of the root tag (e.g., 'task')
}
II. Metadata Processing¶
Once the segment tree is built, all META segments are processed.
- Extraction: Traverse the tree to find all META segments.
- Population: Parse the content of each <meta> tag and populate the global PomlContext object.
- Removal: After processing, META segments are removed from the tree to prevent them from being rendered.
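The three steps above could be sketched as follows. This is a hedged illustration: the interfaces are trimmed to the fields the sketch needs, `extractMeta` and `processMetadata` are hypothetical names, and the actual parsing of a `<meta>` tag is left to a caller-supplied `applyMeta` callback because this document does not define the tag's attributes.

```typescript
interface Segment {
  kind: 'META' | 'TEXT' | 'POML';
  content: string;
  children: Segment[];
}

interface PomlContext {
  variables: { [key: string]: unknown };
  stylesheet: { [key: string]: string };
  minimalPomlVersion?: string;
  sourcePath: string;
}

// Extraction and removal: collect META segments depth-first, in document order,
// and detach them from their parents so they are never rendered.
function extractMeta(node: Segment, found: Segment[] = []): Segment[] {
  const kept: Segment[] = [];
  for (const child of node.children) {
    if (child.kind === 'META') {
      found.push(child);
    } else {
      extractMeta(child, found);
      kept.push(child);
    }
  }
  node.children = kept;
  return found;
}

// Population: hand each <meta> tag to the caller-supplied parser, which mutates the context.
function processMetadata(
  root: Segment,
  context: PomlContext,
  applyMeta: (metaContent: string, context: PomlContext) => void,
): void {
  for (const meta of extractMeta(root)) {
    applyMeta(meta.content, context);
  }
}
```

Because the whole tree is traversed before any rendering starts, a `<meta>` tag near the end of the file still affects content that appears before it.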
PomlContext Interface: This context object is the single source of truth for the entire file, passed through all readers. It's mutable, allowing stateful operations like <let> to have a file-wide effect.
interface PomlContext {
  variables: { [key: string]: any }; // For {{ substitutions }} and <let> (Read/Write)
  texts: { [key: string]: React.ReactElement }; // Maps TEXT_ID to content for <text> replacement (Read/Write)
  stylesheet: { [key: string]: string }; // Merged styles from all <meta> tags (Read-Only during render)
  minimalPomlVersion?: string;      // From <meta> (Read-Only)
  sourcePath: string;                // File path for resolving includes (Read-Only)
}
III. Text/POML Dispatching (Recursive Rendering)¶
Rendering starts at the root of the segment tree and proceeds recursively. A controller dispatches segments to the appropriate reader.
- PureTextReader: Handles TEXT segments.
  - Currently, the pure-text content is rendered directly as a single React element. In the future, the reader could:
    - Render the text content, potentially using a Markdown processor.
    - Perform variable substitutions ({{...}}) using the variables from PomlContext. The logic from handleText in the original PomlFile should be extracted into a shared utility for this.
  - Iterates through its children segments. For each child POML segment, it calls the PomlReader.
- PomlReader: Handles POML segments.
  - Pre-processing: Before parsing, it replaces any direct child <text> regions with a self-closing placeholder tag containing a unique ID: <text ref="TEXT_ID_123" />. The original content of the <text> segment is stored in context.texts. This ensures the XML parser inside PomlFile doesn't fail on non-XML content (like Markdown).
  - Delegation: Instantiates a modified PomlFile class with the processed segment content and the shared PomlContext.
  - Rendering: Calls the pomlFile.react(context) method to render the segment.
- IntelliSense Layer: The segment tree makes it easy to provide context-aware IntelliSense. By checking the kind of the segment at the cursor's offset, the request can be routed to the correct provider: either the PomlReader's XML-aware completion logic or a simpler text/variable completion provider for TEXT segments.
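For the IntelliSense routing, the lookup of the segment under the cursor might be sketched like this. The `segmentAt` helper is hypothetical, and the interface is reduced to the fields the lookup needs; offsets are assumed to be absolute and half-open (`start` inclusive, `end` exclusive).

```typescript
interface Segment {
  kind: 'META' | 'TEXT' | 'POML';
  start: number;
  end: number;
  children: Segment[];
}

// Return the deepest segment whose [start, end) range contains the offset.
function segmentAt(root: Segment, offset: number): Segment | undefined {
  if (offset < root.start || offset >= root.end) {
    return undefined;
  }
  for (const child of root.children) {
    const hit = segmentAt(child, offset);
    if (hit !== undefined) {
      return hit;
    }
  }
  return root;
}
```

A completion request would then check `segmentAt(root, cursorOffset)?.kind` and forward to the matching provider.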
Reader Interface: This interface defines the contract for both PureTextReader and PomlReader.
interface Reader {
  read(segment: Segment, context?: PomlContext): React.ReactElement;
  getHoverToken(segment: Segment, offset: number): PomlToken | undefined;
  getCompletions(offset: number): PomlToken[];
}
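A rough sketch of the recursive dispatch is shown below. To stay self-contained it makes two simplifying assumptions: readers return plain strings instead of `React.ReactElement`, and only `read` is modeled. `StringReader`, `makeTextReader`, and `render` are hypothetical names.

```typescript
interface Segment {
  kind: 'META' | 'TEXT' | 'POML';
  start: number;
  end: number;
  content: string;
  children: Segment[];
}

interface PomlContext {
  variables: { [key: string]: unknown };
}

// Simplified reader contract: string output stands in for React.ReactElement.
interface StringReader {
  read(segment: Segment, context: PomlContext): string;
}

// A text reader that emits raw text and delegates each POML child to the POML reader.
function makeTextReader(pomlReader: StringReader): StringReader {
  return {
    read(segment: Segment, context: PomlContext): string {
      let out = '';
      let cursor = segment.start;
      for (const child of segment.children) {
        // Raw text between the previous child and this one.
        out += segment.content.slice(cursor - segment.start, child.start - segment.start);
        if (child.kind === 'POML') {
          out += pomlReader.read(child, context);
        }
        cursor = child.end;
      }
      return out + segment.content.slice(cursor - segment.start);
    },
  };
}

// Controller: pick the reader from the kind of the root segment.
function render(root: Segment, context: PomlContext, pomlReader: StringReader): string {
  if (root.kind === 'META') {
    throw new Error('META segments should be removed before rendering');
  }
  return root.kind === 'POML'
    ? pomlReader.read(root, context)
    : makeTextReader(pomlReader).read(root, context);
}
```

The same context object is passed to every reader call, which is what allows state written while rendering one POML block to be visible to later blocks.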
Implementation & PomlFile Refactoring¶
To achieve this design, the existing PomlFile class needs significant refactoring. Its role changes from a file-level controller to a specialized parser for POML segments.
Key Modifications to PomlFile¶
- Constructor (new PomlFile):
  - Remove Auto-Wrapping: The autoAddPoml logic must be removed. The PomlReader will only pass it well-formed XML content corresponding to a single POML segment. The constructor will now assume the input text is a valid XML string.
  - Receive Context: The constructor should accept the PomlContext object to access shared state.
- State Management (handleLet):
  - The <let> tag's implementation must be modified to read from and write to the shared PomlContext.variables object, not a local context. This ensures that a variable defined in one POML block is available to subsequent POML blocks in the same file.
- Handling <include>:
  - The handleInclude method should be removed from PomlFile. Inclusion is now handled at a higher level by the main processing pipeline. When the PomlReader encounters an <include> tag, it will invoke the entire pipeline (Segmentation, Metadata, Rendering) on the included file and insert the resulting React elements.
- Parsing TEXT Placeholders:
  - The core parseXmlElement method needs a new branch to handle the <text ref="..." /> placeholder.
  - When it encounters this element:
    - It extracts the ref attribute (e.g., "TEXT_ID_123").
    - It looks up the corresponding content in the context.texts map.
    - It returns a React element containing the pure text content.
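The placeholder round trip, from the PomlReader's pre-processing to the lookup in parseXmlElement, might be sketched as follows. This is an illustration under stated assumptions: the stored values are plain strings rather than React elements, the `TextRegion` type and both function names are hypothetical, and the `TEXT_ID_<n>` numbering is only one possible ID scheme.

```typescript
// A <text>...</text> element inside a POML segment. `start` and `end` are offsets
// into the segment content covering the whole element; `content` is the raw text inside.
interface TextRegion {
  start: number;
  end: number;
  content: string;
}

// Replace each region with a self-closing placeholder and remember its raw content.
// Regions must be sorted by `start` and must not overlap.
function replaceTextRegions(
  xml: string,
  regions: TextRegion[],
  texts: { [key: string]: string },
): string {
  let out = '';
  let cursor = 0;
  let next = Object.keys(texts).length;
  for (const region of regions) {
    const id = `TEXT_ID_${next}`;
    next += 1;
    texts[id] = region.content;
    out += xml.slice(cursor, region.start) + `<text ref="${id}" />`;
    cursor = region.end;
  }
  return out + xml.slice(cursor);
}

// The new parser branch: resolve a placeholder's ref back to the stored text.
function resolveTextRef(ref: string, texts: { [key: string]: string }): string {
  const content = texts[ref];
  if (content === undefined) {
    throw new Error(`Unknown text placeholder: ${ref}`);
  }
  return content;
}
```

After the replacement, the segment content contains no raw Markdown or stray angle brackets, so it can be handed to the XML parser unchanged.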