
9. SAMMO Express (beta)

One of the more time-consuming tasks is converting an existing prompt into a prompt program. SAMMO Express can now do this automatically, starting from a prompt written in Markdown.

# %load -r 3:25 _init.py
import pathlib
import sammo
from sammo.runners import OpenAIChat
from sammo.base import Template, EvaluationScore, Component
from sammo.components import Output, GenerateText, ForEach, Union
from sammo.extractors import ExtractRegex
from sammo.data import DataTable
import json
import requests
import os

if "OPENAI_API_KEY" not in os.environ:
    raise ValueError("Please set the environment variable 'OPENAI_API_KEY'.")

_ = sammo.setup_logger("WARNING")  # we're only interested in warnings for now

runner = OpenAIChat(
    model_id="gpt-3.5-turbo",
    api_config={"api_key": os.environ["OPENAI_API_KEY"]},
    cache=os.getenv("CACHE_FILE", "cache.tsv"),
    timeout=30,
)

We start with a prompt written in Markdown. SAMMO additionally recognizes:

  • CSS-like classes in the form of .classname

  • CSS-like identifiers in the form of #id

  • Native placeholders in Handlebars.js syntax for the input, like {{{input}}}
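To build intuition for what these annotations look like in a document, here is a simplified sketch of how id and class tokens inside HTML comments could be picked out with a regex. This is an illustration only, not SAMMO's actual parser; the `parse_annotations` helper is hypothetical.

```python
import re

def parse_annotations(markdown: str) -> dict:
    # Hypothetical helper: find headings annotated with an HTML comment,
    # e.g. "# Examples <!-- #examp .fewshot -->", and collect the
    # CSS-like #ids and .classes attached to each heading.
    annotations = {}
    for match in re.finditer(r"^#+\s*(.+?)\s*<!--\s*(.+?)\s*-->", markdown, re.M):
        heading, tokens = match.groups()
        annotations[heading] = {
            "ids": [t[1:] for t in tokens.split() if t.startswith("#")],
            "classes": [t[1:] for t in tokens.split() if t.startswith(".")],
        }
    return annotations

doc = "# Instructions <!-- #instr -->\nSome text.\n# Examples <!-- #examp .fewshot -->"
print(parse_annotations(doc))
```

The ids are what the optimization operators later use to address individual parts of the prompt.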

Here is an example:

PROMPT_IN_MARKDOWN = """
# Instructions <!-- #instr -->
Convert the following user queries into a SQL query.

# Table
Users:
- user_id (INTEGER, PRIMARY KEY)
- name (TEXT)
- age (INTEGER)
- city (TEXT)

# Examples <!-- #examp -->
Input: "Find all users who are older than 30."  
Output: `SELECT name FROM Users WHERE age > 30;`

Input: "List the names of users who live in 'New York'."  
Output: `SELECT name FROM Users WHERE city = 'New York';`
   
# Complete this
Input: {{{input}}}
Output:
"""

Using sammo.express, we can automatically map the structure implied by the Markdown into a structured symbolic prompt program:

from sammo.express import MarkdownParser
spp = MarkdownParser(PROMPT_IN_MARKDOWN).get_sammo_program()
spp.plot_program()

Let’s execute it on some data. For this small test, we skip DataTable and simply use a list of dicts.

Output(GenerateText(spp)).run(runner, [{"input": "No of users starting with J"}])
+------------------------------------------+-----------------------------------------------------+
| input                                    | output                                              |
+==========================================+=====================================================+
| {'input': 'No of users starting with J'} | SELECT COUNT(name) FROM Users WHERE name LIKE 'J%'; |
+------------------------------------------+-----------------------------------------------------+
Constants: None

9.1. Bonus: Optimizing the prompt program

d_train = DataTable.from_records(
    [
        {
            "input": "Get all users whose name starts with the letter 'J'",
            "output": "SELECT * FROM Users WHERE name LIKE 'J%';",
        },
        {
            "input": "Retrieve the youngest user's information",
            "output": "SELECT * FROM Users ORDER BY age ASC LIMIT 1;",
        },
        {
            "input": "Get all cities where users live",
            "output": "SELECT DISTINCT city FROM Users;",
        },
    ]
)

def accuracy(y_true: DataTable, y_pred: DataTable) -> EvaluationScore:
    y_true = y_true.outputs.normalized_values()
    y_pred = y_pred.outputs.normalized_values()
    n_correct = sum([y_p == y_t for y_p, y_t in zip(y_pred, y_true)])

    return EvaluationScore(n_correct / len(y_true))
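Stripped of the DataTable wrappers, this metric is plain exact-match accuracy over normalized outputs. On ordinary lists the same computation reduces to:

```python
# Exact-match accuracy on plain lists (one prediction differs from its target).
y_true = [
    "SELECT * FROM Users WHERE name LIKE 'J%';",
    "SELECT * FROM Users ORDER BY age ASC LIMIT 1;",
    "SELECT DISTINCT city FROM Users;",
]
y_pred = [
    "SELECT * FROM Users WHERE name LIKE 'J%';",
    "SELECT name FROM Users ORDER BY age LIMIT 1;",
    "SELECT DISTINCT city FROM Users;",
]
n_correct = sum(p == t for p, t in zip(y_pred, y_true))
print(n_correct / len(y_true))  # 0.6666666666666666
```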
from sammo.search import BeamSearch
from sammo.mutators import BagOfMutators, Paraphrase, Rewrite

mutation_operators = BagOfMutators(
    Output(GenerateText(spp)),
    Paraphrase("#instr"),
    Rewrite("#examp", "Repeat these examples and add two new ones.\n\n {{{{text}}}}")
)
prompt_optimizer = BeamSearch(
    runner,
    mutation_operators,
    accuracy,
    depth=1,
    mutations_per_beam=2,
    n_initial_candidates=2,
)
prompt_optimizer.fit(d_train)
prompt_optimizer.show_report()
search depth[############]1/1[00:00<00:00] >> eval[#################################]3/3 >> tasks[#######]9/9[00:00<00:00, 600.00it/s]

Fitting log (5 entries):
iteration    action      objective           costs                         parse_errors    prev_actions
-----------  ----------  ------------------  ----------------------------  --------------  ----------------------
-1           init        0.3333333333333333  {'input': 386, 'output': 33}  0.0             ['init']
-1           init        0.3333333333333333  {'input': 386, 'output': 33}  0.0             ['init']
0            Rewrite     0.6666666666666666  {'input': 437, 'output': 28}  0.0             ['Rewrite', 'init']
0            Paraphrase  0.6666666666666666  {'input': 380, 'output': 28}  0.0             ['Paraphrase', 'init']
0            Rewrite     0.6666666666666666  {'input': 437, 'output': 28}  0.0             ['Rewrite', 'init']
Action stats:
action      stats
----------  ----------------------------
Rewrite     {'chosen': 2, 'improved': 2}
Paraphrase  {'chosen': 1, 'improved': 1}
prompt_optimizer.best_prompt.plot_program()
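The search loop itself is generic: keep the best candidates, mutate each, re-score, and repeat. The self-contained sketch below shows the shape of beam search with a toy mutation and scoring function; it is an illustration of the algorithm, not SAMMO's implementation, and the parameter names mirror the ones used above only by analogy.

```python
def beam_search(initial, mutate, score, depth=1, beam_width=2, mutations_per_beam=2):
    # Keep the beam_width highest-scoring candidates, expand each with
    # mutations_per_beam mutations, and repeat for `depth` rounds.
    beam = sorted(initial, key=score, reverse=True)[:beam_width]
    for _ in range(depth):
        candidates = list(beam)
        for cand in beam:
            candidates += [mutate(cand) for _ in range(mutations_per_beam)]
        beam = sorted(candidates, key=score, reverse=True)[:beam_width]
    return beam[0]

# Toy problem: "mutation" appends a character, score prefers longer strings.
best = beam_search(["a", "bb"], mutate=lambda s: s + "!", score=len, depth=2)
print(best)  # bb!!
```

In the real optimizer, `mutate` is an LLM-backed operator (Paraphrase, Rewrite) and `score` is the accuracy metric evaluated on d_train, which is why each round costs LLM calls.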