
9. SAMMO Express (beta)

One of the more time-consuming tasks is converting an existing prompt into a prompt program. SAMMO Express can now do this automatically, starting from a prompt written in Markdown.

# %load -r 3:25 _init.py
import pathlib
import sammo
from sammo.runners import OpenAIChat
from sammo.base import Template, EvaluationScore, Component
from sammo.components import Output, GenerateText, ForEach, Union
from sammo.extractors import ExtractRegex
from sammo.data import DataTable
import json
import requests
import os

if "OPENAI_API_KEY" not in os.environ:
    raise ValueError("Please set the environment variable 'OPENAI_API_KEY'.")

_ = sammo.setup_logger("WARNING")  # we're only interested in warnings for now

runner = OpenAIChat(
    model_id="gpt-3.5-turbo",
    api_config={"api_key": os.environ["OPENAI_API_KEY"]},
    cache=os.getenv("CACHE_FILE", "cache.tsv"),
    timeout=30,
)

We start with a prompt written in Markdown. SAMMO additionally recognizes:

  • CSS-like classes in the form of .classname

  • CSS-like identifiers in the form of #id

  • Native placeholders in Handlebars.js syntax for the input, like {{{input}}}
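To build intuition for what these annotations look like in a document, here is a simplified sketch of how id and class tokens inside HTML comments could be picked out with a regex. This is an illustration only, not SAMMO's actual parser; the `parse_annotations` helper is hypothetical.

```python
import re

def parse_annotations(markdown: str) -> dict:
    # Hypothetical helper: find headings annotated with an HTML comment,
    # e.g. "# Examples <!-- #examp .fewshot -->", and collect the
    # CSS-like #ids and .classes attached to each heading.
    annotations = {}
    for match in re.finditer(r"^#+\s*(.+?)\s*<!--\s*(.+?)\s*-->", markdown, re.M):
        heading, tokens = match.groups()
        annotations[heading] = {
            "ids": [t[1:] for t in tokens.split() if t.startswith("#")],
            "classes": [t[1:] for t in tokens.split() if t.startswith(".")],
        }
    return annotations

doc = "# Instructions <!-- #instr -->\nSome text.\n# Examples <!-- #examp .fewshot -->"
print(parse_annotations(doc))
```

The ids are what the optimization operators later use to address individual parts of the prompt.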

Here is an example:

PROMPT_IN_MARKDOWN = """
# Instructions <!-- #instr -->
Convert the following user queries into a SQL query.

# Table
Users:
- user_id (INTEGER, PRIMARY KEY)
- name (TEXT)
- age (INTEGER)
- city (TEXT)

# Examples <!-- #examp -->
Input: "Find all users who are older than 30."  
Output: `SELECT name FROM Users WHERE age > 30;`

Input: "List the names of users who live in 'New York'."  
Output: `SELECT name FROM Users WHERE city = 'New York';`
   
# Complete this
Input: {{{input}}}
Output:
"""

Using sammo.express, we can automatically map the structure implied by the Markdown into a structured symbolic prompt program:

from sammo.express import MarkdownParser
spp = MarkdownParser(PROMPT_IN_MARKDOWN).get_sammo_program()
spp.plot_program()

Let’s execute it on some data. For this small test, we skip DataTable and simply use a list of dicts.

Output(GenerateText(spp)).run(runner, [{"input": "No of users starting with J"}])
+------------------------------------------+-----------------------------------------------------+
| input                                    | output                                              |
+==========================================+=====================================================+
| {'input': 'No of users starting with J'} | SELECT COUNT(name) FROM Users WHERE name LIKE 'J%'; |
+------------------------------------------+-----------------------------------------------------+
Constants: None

9.1. Bonus: Optimizing the prompt program

d_train = DataTable.from_records(
    [
        {
            "input": "Get all users whose name starts with the letter 'J'",
            "output": "SELECT * FROM Users WHERE name LIKE 'J%';",
        },
        {
            "input": "Retrieve the youngest user's information",
            "output": "SELECT * FROM Users ORDER BY age ASC LIMIT 1;",
        },
        {
            "input": "Get all cities where users live",
            "output": "SELECT DISTINCT city FROM Users;",
        },
    ]
)

def accuracy(y_true: DataTable, y_pred: DataTable) -> EvaluationScore:
    y_true = y_true.outputs.normalized_values()
    y_pred = y_pred.outputs.normalized_values()
    n_correct = sum([y_p == y_t for y_p, y_t in zip(y_pred, y_true)])

    return EvaluationScore(n_correct / len(y_true))
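Stripped of the DataTable wrappers, this metric is plain exact-match accuracy over normalized outputs. On ordinary lists the same computation reduces to:

```python
# Exact-match accuracy on plain lists (one prediction differs from its target).
y_true = [
    "SELECT * FROM Users WHERE name LIKE 'J%';",
    "SELECT * FROM Users ORDER BY age ASC LIMIT 1;",
    "SELECT DISTINCT city FROM Users;",
]
y_pred = [
    "SELECT * FROM Users WHERE name LIKE 'J%';",
    "SELECT name FROM Users ORDER BY age LIMIT 1;",
    "SELECT DISTINCT city FROM Users;",
]
n_correct = sum(p == t for p, t in zip(y_pred, y_true))
print(n_correct / len(y_true))  # 0.6666666666666666
```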
from sammo.search import BeamSearch
from sammo.mutators import BagOfMutators, Paraphrase, Rewrite

mutation_operators = BagOfMutators(
    Output(GenerateText(spp)),
    Paraphrase("#instr"),
    Rewrite("#examp", "Repeat these examples and add two new ones.\n\n {{{{text}}}}")
)
prompt_optimizer = BeamSearch(
    runner,
    mutation_operators,
    accuracy,
    depth=1,
    mutations_per_beam=2,
    n_initial_candidates=2,
)
prompt_optimizer.fit(d_train)
prompt_optimizer.show_report()
search depth[############]1/1[00:00<00:00] >> eval[#################################]3/3 >> tasks[#######]9/9[00:00<00:00, 600.00it/s]

Fitting log (5 entries):
iteration    action      objective           costs                         parse_errors    prev_actions
-----------  ----------  ------------------  ----------------------------  --------------  ----------------------
-1           init        0.3333333333333333  {'input': 386, 'output': 33}  0.0             ['init']
-1           init        0.3333333333333333  {'input': 386, 'output': 33}  0.0             ['init']
0            Rewrite     0.6666666666666666  {'input': 437, 'output': 28}  0.0             ['Rewrite', 'init']
0            Paraphrase  0.6666666666666666  {'input': 380, 'output': 28}  0.0             ['Paraphrase', 'init']
0            Rewrite     0.6666666666666666  {'input': 437, 'output': 28}  0.0             ['Rewrite', 'init']
Action stats:
action      stats
----------  ----------------------------
Rewrite     {'chosen': 2, 'improved': 2}
Paraphrase  {'chosen': 1, 'improved': 1}
prompt_optimizer.best_prompt.plot_program()
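The search loop itself is generic: keep the best candidates, mutate each, re-score, and repeat. The self-contained sketch below shows the shape of beam search with a toy mutation and scoring function; it is an illustration of the algorithm, not SAMMO's implementation, and the parameter names mirror the ones used above only by analogy.

```python
def beam_search(initial, mutate, score, depth=1, beam_width=2, mutations_per_beam=2):
    # Keep the beam_width highest-scoring candidates, expand each with
    # mutations_per_beam mutations, and repeat for `depth` rounds.
    beam = sorted(initial, key=score, reverse=True)[:beam_width]
    for _ in range(depth):
        candidates = list(beam)
        for cand in beam:
            candidates += [mutate(cand) for _ in range(mutations_per_beam)]
        beam = sorted(candidates, key=score, reverse=True)[:beam_width]
    return beam[0]

# Toy problem: "mutation" appends a character, score prefers longer strings.
best = beam_search(["a", "bb"], mutate=lambda s: s + "!", score=len, depth=2)
print(best)  # bb!!
```

In the real optimizer, `mutate` is an LLM-backed operator (Paraphrase, Rewrite) and `score` is the accuracy metric evaluated on d_train, which is why each round costs LLM calls.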