1. Symbolic Prompt Programs
At the heart of SAMMO are symbolic prompt programs. If you are familiar with JavaScript, you can think of these as a DOM tree representation of prompts. If you aren’t, no problem – we’ll start from the basics.
# %load -r 3:25 _init.py
import pathlib
import sammo
from sammo.runners import OpenAIChat
from sammo.base import Template, EvaluationScore
from sammo.extractors import ExtractRegex
from sammo.data import DataTable
import json
import requests
import os
if 'OPENAI_API_KEY' not in os.environ:
    raise ValueError("Please set the environment variable 'OPENAI_API_KEY'.")
_ = sammo.setup_logger("WARNING")  # we're only interested in warnings for now
runner = OpenAIChat(
    model_id="gpt-3.5-turbo",
    api_config={"api_key": os.environ['OPENAI_API_KEY']},
    cache=os.getenv("CACHE_FILE", "cache.tsv"),
    timeout=30,
)
1.1. Classification: A simple example
Let’s say we want to improve a prompt for labeling speaker responses. We’ll eventually run it with many different inputs, but plug in a concrete one for now:
Instructions: Does Speaker 2's answer mean yes or no?
Output labels: no, yes
Input: Speaker 1: "You do this often?" Speaker 2: "It's my first time."
Output:
We have roughly four parts here: the instructions, the set of output labels, the input itself and a prefix. Let’s convert this into a symbolic prompt program.
from sammo.components import Output, GenerateText, ForEach, Union, JoinStrings
from sammo.base import Template, VerbatimText
parts = list()
parts.append(VerbatimText(content="Instructions: Does Speaker 2's answer mean yes or no?", reference_id="instructions", reference_classes=["preamble"]))
parts.append(VerbatimText("Output labels: no, yes", reference_id="label_space", reference_classes=["preamble"]))
parts.append(Template("{{input}}", reference_id="input"))
parts.append(VerbatimText("Output: ", reference_id="prefix"))
spp = Output(GenerateText(JoinStrings(*parts, separator="\n")))
spp.plot_program()
plot_program() plots the structure of our SPP. An SPP is a graph in which each node is an operator, called a prompt component in SAMMO. You can click on each node to see what symbolic properties it has (and we’ll edit those in a bit).
A component receives as input the values of its children as well as any values that its parents passed through. Let’s see this in action.
spp_result = spp.run(runner, ["Speaker 1: \"You do this often?\" Speaker 2: \"It's my first time.\""])
Okay, nice. Let’s look at the trace.
spp_result.outputs[0].plot_call_trace()
In contrast to the program graph, each node is now filled with the concrete values of the computation. By clicking on a node, we can see what output it produced and passed up to its parent.
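To make the bottom-up flow concrete, here is a minimal plain-Python sketch of a tree whose nodes compute a value from their children's values and from the input passed down by the parent. The classes `Node`, `Text`, `Input` and `Join` are hypothetical stand-ins invented for illustration; they are not SAMMO classes, and the sketch skips the LLM call that GenerateText would make.

```python
class Node:
    """Hypothetical stand-in for a prompt component (not a SAMMO class)."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)

    def compute(self, child_values, passed_input):
        raise NotImplementedError

    def run(self, passed_input, trace):
        # Children are evaluated first; their values feed into this node.
        child_values = [child.run(passed_input, trace) for child in self.children]
        value = self.compute(child_values, passed_input)
        trace.append((self.name, value))  # record the concrete value per node
        return value


class Text(Node):
    """Leaf that always returns a fixed string."""

    def __init__(self, name, text):
        super().__init__(name)
        self.text = text

    def compute(self, child_values, passed_input):
        return self.text


class Input(Node):
    """Leaf that returns whatever the parent passed down."""

    def compute(self, child_values, passed_input):
        return passed_input


class Join(Node):
    """Inner node that concatenates the values of its children."""

    def __init__(self, name, children, separator="\n"):
        super().__init__(name, children)
        self.separator = separator

    def compute(self, child_values, passed_input):
        return self.separator.join(child_values)


program = Join("join", [
    Text("instructions", "Instructions: Does Speaker 2's answer mean yes or no?"),
    Text("label_space", "Output labels: no, yes"),
    Input("input"),
    Text("prefix", "Output: "),
])

trace = []
prompt = program.run(
    'Speaker 1: "You do this often?" Speaker 2: "It\'s my first time."', trace
)
for name, value in trace:
    print(f"{name}: {value!r}")
```

The `trace` list plays the role of the call trace: one entry per node, in the order the nodes finished, with the joined prompt last.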
1.2. The power of symbolic prompt programs
What have we gained from this? A lot of flexibility to explore and optimize our prompt! Under the hood, SAMMO uses pyGlove to symbolize each class so that we can make arbitrary changes to a program after it has been built, which goes beyond what static DSPy programs allow. pyGlove turns Python classes into manipulable, symbolic objects whose properties remain fully editable after instantiation.
We can now query and modify prompt programs via a whole host of specifiers, similar to working with a DOM tree. Let’s say we’d like to delete the instructions. To do this, we first find the node using .find_first() and then use the .rebind() function that pyGlove provides. You’ll see how the node disappears.
import pyglove as pg
target_node = spp.find_first("#instructions")
spp.clone().rebind({target_node.path: pg.MISSING_VALUE}).plot_program()
With this, you could automate a lot of the (semi-)manual tinkering you have to do during prompt prototyping. Making small edits such as paraphrasing would be just the start. Want to try out Chain-of-Thought reasoning? Add a paragraph that says “Let’s think step-by-step.” You can also explore:
Going from single examples to batch annotation
Changing your retriever and ranking function in a RAG scenario
Re-ordering some of the paragraphs
Compressing certain parts of the instructions
Etc.
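As a rough illustration of what automating such edits could look like, the sketch below treats the prompt as a list of `(id, text)` pairs and derives several variants from it. This is plain Python rather than SAMMO's API: the helpers `drop`, `insert_before` and `render`, and the `cot` id, are hypothetical names chosen for this example.

```python
from typing import List, Tuple

Part = Tuple[str, str]  # (reference id, text); an assumed toy representation

BASE_PARTS: List[Part] = [
    ("instructions", "Instructions: Does Speaker 2's answer mean yes or no?"),
    ("label_space", "Output labels: no, yes"),
    ("input", "{{input}}"),
    ("prefix", "Output: "),
]


def drop(parts: List[Part], ref_id: str) -> List[Part]:
    """Return a copy of the prompt without the part that has the given id."""
    return [part for part in parts if part[0] != ref_id]


def insert_before(parts: List[Part], ref_id: str, new_part: Part) -> List[Part]:
    """Return a copy with new_part placed directly before the part ref_id."""
    result = []
    for part in parts:
        if part[0] == ref_id:
            result.append(new_part)
        result.append(part)
    return result


def render(parts: List[Part]) -> str:
    """Join all parts into the final prompt text."""
    return "\n".join(text for _, text in parts)


# Each variant is a new list; the baseline is never modified in place.
variants = {
    "baseline": BASE_PARTS,
    "no_instructions": drop(BASE_PARTS, "instructions"),
    "chain_of_thought": insert_before(
        BASE_PARTS, "prefix", ("cot", "Let's think step-by-step.")
    ),
    "labels_first": [BASE_PARTS[1], BASE_PARTS[0], *BASE_PARTS[2:]],
}

for name, parts in variants.items():
    print(f"--- {name} ---")
    print(render(parts))
```

Because every edit returns a new list, the variants can be generated and compared side by side, which is the same idea as cloning an SPP before rebinding it.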
1.2.1. Querying & manipulating SPPs
There are several ways to manipulate and query SPPs. SAMMO provides the convenience functions .find_first() and .find_all(), which allow you to query the symbolic program tree using a CSS-like selector syntax.
# Query by id
spp.find_first("#instructions")
PyGloveMatch(
node=VerbatimText(
content = "Instructions: Does Spea...,
path=child.child.children[0]
)
# Query by id and attribute
spp.find_first("#instructions content")
PyGloveMatch(
node=Instructions: Does Speaker 2's answer mean yes or ...,
path=child.child.children[0].content
)
# Query by class
spp.find_all(".preamble")
[PyGloveMatch(
node=VerbatimText(
content = "Instructions: Does Spea...,
path=child.child.children[0]
),
PyGloveMatch(
node=VerbatimText(
content = 'Output labels: no, yes'...,
path=child.child.children[1]
)]
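For intuition about how id and class selectors map onto tree paths, here is a small self-contained sketch of a selector lookup over a toy tree. It is not SAMMO's implementation: the `Node` class and the `walk`, `matches`, `find_all` and `find_first` functions below are hypothetical, support only `#id` and `.class`, and use a made-up `root` path prefix.

```python
class Node:
    """Hypothetical tree node with an optional id and a set of classes."""

    def __init__(self, kind, ref_id=None, classes=(), children=()):
        self.kind = kind
        self.ref_id = ref_id
        self.classes = set(classes)
        self.children = list(children)


def walk(node, path="root"):
    """Yield (path, node) pairs in depth-first, document order."""
    yield path, node
    for index, child in enumerate(node.children):
        yield from walk(child, f"{path}.children[{index}]")


def matches(node, selector):
    """Support only the two simplest selector forms: '#id' and '.class'."""
    if selector.startswith("#"):
        return node.ref_id == selector[1:]
    if selector.startswith("."):
        return selector[1:] in node.classes
    raise ValueError(f"Unsupported selector in this sketch: {selector!r}")


def find_all(root, selector):
    """Return every (path, node) pair whose node matches the selector."""
    return [(path, node) for path, node in walk(root) if matches(node, selector)]


def find_first(root, selector):
    """Return the first match in document order, or None if nothing matches."""
    return next(
        ((path, node) for path, node in walk(root) if matches(node, selector)),
        None,
    )


tree = Node("JoinStrings", children=[
    Node("VerbatimText", ref_id="instructions", classes=["preamble"]),
    Node("VerbatimText", ref_id="label_space", classes=["preamble"]),
    Node("Template", ref_id="input"),
    Node("VerbatimText", ref_id="prefix"),
])

print(find_first(tree, "#instructions")[0])
print([path for path, _ in find_all(tree, ".preamble")])
```

Returning the path together with the node mirrors the match objects shown above, where the path is what a later edit uses to address the node.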