7. Instruction Optimization

We use the term instruction optimization to refer to the problem of finding the task instructions that maximize some target metric (e.g., accuracy).

Note

We will work with an extremely small number of data instances here to show the general flow. We recommend using 100+ examples each for train and test.

We start by initializing things as before.

# %load -r 3:25 _init.py
import pathlib
import sammo
from sammo.runners import OpenAIChat
from sammo.base import Template, EvaluationScore
from sammo.components import Output, GenerateText, ForEach, Union
from sammo.extractors import ExtractRegex
from sammo.data import DataTable
import json
import requests
import os

if "OPENAI_API_KEY" not in os.environ:
    raise ValueError("Please set the environment variable 'OPENAI_API_KEY'.")

_ = sammo.setup_logger("WARNING")  # we're only interested in warnings for now

runner = OpenAIChat(
    model_id="gpt-3.5-turbo-16k",
    api_config={"api_key": os.getenv("OPENAI_API_KEY")},
    cache=os.getenv("CACHE_FILE", "cache.tsv"),
    timeout=30,
)
# %load -s load_data,accuracy _init.py
def load_data(
    url="https://github.com/google/BIG-bench/raw/main/bigbench/benchmark_tasks/implicatures/task.json",
):
    task = json.loads(requests.get(url).content)
    # convert label to single string
    for x in task["examples"]:
        x["output"] = max(x["target_scores"], key=x["target_scores"].get)

    return DataTable.from_records(
        task["examples"],
        input_fields="input",
        constants={"instructions": task["task_prefix"]},
    )


def accuracy(y_true: DataTable, y_pred: DataTable) -> EvaluationScore:
    y_true = y_true.outputs.normalized_values()
    y_pred = y_pred.outputs.normalized_values()
    n_correct = sum([y_p == y_t for y_p, y_t in zip(y_pred, y_true)])

    return EvaluationScore(n_correct / len(y_true))
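To make the two helpers above concrete, here is a minimal sketch in plain Python that does not depend on SAMMO. The toy records and the names `top_label` and `plain_accuracy` are hypothetical, and the sketch compares strings exactly rather than reproducing whatever normalization `normalized_values()` applies.

```python
def top_label(target_scores):
    """Return the label with the highest score (the same max/key idiom as load_data)."""
    return max(target_scores, key=target_scores.get)


def plain_accuracy(y_true, y_pred):
    """Fraction of positions where the prediction equals the gold label."""
    n_correct = sum(p == t for p, t in zip(y_pred, y_true))
    return n_correct / len(y_true)


# Hypothetical records shaped like the "target_scores" field of the task file.
records = [
    {"yes": 1, "no": 0},
    {"yes": 0, "no": 1},
    {"yes": 1, "no": 0},
    {"yes": 0, "no": 1},
]

gold = [top_label(r) for r in records]   # one string label per record
pred = ["yes", "no", "no", "no"]         # made-up model predictions
score = plain_accuracy(gold, pred)       # 3 of 4 predictions match
```

In the tutorial itself the same ratio is wrapped in an `EvaluationScore` so that SAMMO can compare candidates.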

7.1. Step 1: Defining the set of initial candidates

Our plan is to use beam search with mutation operators to refine a set of initial candidates. As with grid search previously, we can use the same syntax to define a parametric set of initial candidates.

7.1.1. Using Callables to bind static values

A very common problem is that of having a set of static values, e.g., configuration or input datasets, that are needed in constructing a metaprompt.

To bind these static values, we recommend using callables. These are objects that behave like functions but can be initialized with the static values for the task. In essence, they behave like partially bound functions but offer a cleaner interface.
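As a rough illustration of the idea, independent of SAMMO, the sketch below binds the same static values once with a callable class and once with `functools.partial`. The names `PromptBuilder` and `build_prompt` and the example strings are hypothetical.

```python
from functools import partial


class PromptBuilder:
    """Callable object that stores static values at construction time."""

    def __init__(self, instructions, labels):
        self.instructions = instructions
        self.labels = labels

    def __call__(self):
        # No arguments needed: everything was bound in __init__.
        return f"{self.instructions}\nOutput labels: {', '.join(self.labels)}"


def build_prompt(instructions, labels):
    """Plain function taking the same values as explicit arguments."""
    return f"{instructions}\nOutput labels: {', '.join(labels)}"


builder = PromptBuilder("Does the reply mean yes or no?", ["yes", "no"])
bound = partial(build_prompt, "Does the reply mean yes or no?", ["yes", "no"])

prompt_from_callable = builder()  # call with no arguments
prompt_from_partial = bound()     # equivalent result via partial binding
```

Both produce the same string. The callable additionally exposes the bound values as named attributes (`builder.labels`), which is one way to read the "cleaner interface" remark above.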

Below, we show how we can bind the training dataset to the search space object so we can use its values during the construction of the initial candidate space.

from sammo.instructions import MetaPrompt, Section, Paragraph, InputData
from sammo.dataformatters import PlainFormatter
from sammo.search_op import one_of

class InititialCandidates:
    def __init__(self, dtrain):
        self.dtrain = dtrain

    def __call__(self):
        example_formatter = PlainFormatter(
            all_labels=self.dtrain.outputs.unique(), orient="item"
        )

        labels = self.dtrain.outputs.unique()
        instructions = MetaPrompt(
            [
                Paragraph("Instructions: "),
                Paragraph(
                    one_of(
                        [
                            self.dtrain.constants["instructions"],
                            "",
                            "Find the best output label given the input.",
                            self.dtrain.constants["instructions"] * 2,
                        ]
                    ),
                    reference_id="instructions",
                ),
                Paragraph("\n"),
                Paragraph(
                    f"Output labels: {', '.join(labels)}\n" if len(labels) <= 10 else ""
                ),
                Paragraph(InputData()),
                Paragraph("Output: "),
            ],
            render_as="raw",
            data_formatter=example_formatter,
        )

        return Output(
            instructions.with_extractor("raise"),
            minibatch_size=1,
            on_error="empty_result",
        )

7.2. Step 2: Define a set of mutation operators

In each step of the beam search, SAMMO will sample a set of mutation operators and apply them to the current set of active candidates (beams).
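The loop described above can be pictured with a toy sketch in plain Python. This is not SAMMO's implementation: the function `beam_search`, the two string mutators, the keyword-counting `toy_score`, and the choice to keep parents in the candidate pool are all assumptions made for illustration.

```python
import random


def add_hint(text):
    """Toy mutator: append a formatting hint."""
    return text + " Answer with a single label."


def add_context(text):
    """Toy mutator: append a reading hint."""
    return text + " Read the input carefully."


def toy_score(text):
    """Stand-in for a real metric: count how many keywords appear."""
    return sum(k in text for k in ("label", "input", "carefully"))


def beam_search(initial, mutators, score, beam_width=2, depth=2, seed=0):
    rng = random.Random(seed)
    # Start from the best-scoring initial candidates.
    beams = sorted(initial, key=score, reverse=True)[:beam_width]
    for _ in range(depth):
        # Sample one mutation operator per active candidate and apply it.
        children = [rng.choice(mutators)(cand) for cand in beams]
        # Assumption: parents stay in the pool, so the best score never drops.
        pool = list(dict.fromkeys(beams + children))
        beams = sorted(pool, key=score, reverse=True)[:beam_width]
    return beams


initial = ["", "Find the best output label given the input."]
final_beams = beam_search(initial, [add_hint, add_context], toy_score)
best = final_beams[0]
```

In the real setting, scoring a candidate means running it on the training data and computing the metric, so each step costs model calls rather than a keyword count.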

from sammo.mutators import BagOfMutators, InduceInstructions, Paraphrase

mydata = load_data()
d_train = mydata.sample(10, seed=42)

mutation_operators = BagOfMutators(
    InititialCandidates(d_train),
    InduceInstructions("#instructions", d_train),
    Paraphrase("#instructions"),
    sample_for_init_candidates=False,
)

Above, we defined the set of mutators to be applied. The first argument says that we want to initialize the search with our previously defined InititialCandidates callable. The remaining arguments are the two mutation operations available during the search: we can induce new instructions from labeled samples, or simply paraphrase existing ones. To tell a mutator which part of the metaprompt it should act on, we pass a path descriptor. Here, "#instructions" refers to the paragraph we tagged with reference_id="instructions" in Step 1.