2. Components#
In this tutorial, we’ll go into more depth on the building blocks of SPPs: components.
As mentioned before, symbolic prompt programs are essentially graphs of components that are evaluated lazily with .run* methods.
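The lazy-graph idea can be sketched in plain Python. The classes below are invented for illustration and are not SAMMO's API: each component holds references to its children, and nothing is computed until a .run()-style call walks the graph.

```python
# Illustrative sketch of lazy evaluation over a component graph.
# These classes are invented for illustration; they are not SAMMO's API.
class Verbatim:
    def __init__(self, text):
        self.text = text

    def run(self):
        return self.text


class Upper:
    def __init__(self, child):
        self.child = child  # constructing the graph does no work yet

    def run(self):
        # evaluation happens only when run() is called on the root
        return self.child.run().upper()


program = Upper(Verbatim("hello"))  # nothing has been evaluated yet
result = program.run()              # walking the graph does the work
```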
# %load -r 3:25 _init.py
import pathlib
import sammo
from sammo.runners import OpenAIChat
from sammo.base import Template, EvaluationScore
from sammo.components import Output, GenerateText, ForEach, Union
from sammo.extractors import ExtractRegex
from sammo.data import DataTable
import json
import requests
import os
if 'OPENAI_API_KEY' not in os.environ:
    raise ValueError("Please set the environment variable 'OPENAI_API_KEY'.")
_ = sammo.setup_logger("WARNING")  # we're only interested in warnings for now
runner = OpenAIChat(
    model_id="gpt-3.5-turbo",
    api_config={"api_key": os.environ['OPENAI_API_KEY']},
    cache=os.getenv("CACHE_FILE", "cache.tsv"),
    timeout=30,
)
2.1. What is a component, actually?#
A component is simply a lazily evaluated function that gets information from its parents, calls its children, performs some operation on the values, and returns a Result instance.
To better understand the details, let’s write our own component that sorts the output of its child node.
from sammo.base import Result, Component, Runner, VerbatimText
from frozendict import frozendict
class Sort(Component):
    async def _call(self, runner: Runner, context: dict, dynamic_context: frozendict | None) -> Result:
        intermediate_result = await self._child(runner, context, dynamic_context)
        return Result([sorted(intermediate_result.value)], parent=intermediate_result, op=self)
Pretty straightforward! We pass on the intermediate result as the parent, along with a reference to self as the op.
Let’s try it out!
sorted_text = Output(Sort(VerbatimText("zdadf"))).run(runner)
sorted_text
+---------+-----------------------------+
| input | output |
+=========+=============================+
| None | [['a', 'd', 'd', 'f', 'z']] |
+---------+-----------------------------+
Constants: None
Because we passed on a reference to self, we can easily follow the call trace.
sorted_text.outputs[0].plot_call_trace()
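Because each Result records its parent, walking the parent links recovers the chain of operations. A minimal sketch of that mechanism, with invented names rather than SAMMO's internals:

```python
# Hypothetical sketch of a call trace built from chained parent references.
# SAMMO's Result objects keep similar links; the names here are invented.
class TraceNode:
    def __init__(self, value, op, parent=None):
        self.value, self.op, self.parent = value, op, parent


leaf = TraceNode("zdadf", op="VerbatimText")
root = TraceNode([sorted(leaf.value)], op="Sort", parent=leaf)

trace = []
node = root
while node is not None:  # follow parent links back to the leaf
    trace.append(node.op)
    node = node.parent
```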
2.1.1. Shortcut: LambdaExtractor#
The most flexible way to implement a custom function is to write your own Component, as we did above. In many cases, however, it is enough to use a LambdaExtractor to apply a user-defined function (UDF).
from sammo.extractors import LambdaExtractor
sorted_text_udf = Output(LambdaExtractor(VerbatimText("zdadf"), "lambda x: sorted(x)")).run(runner)
sorted_text_udf
+---------+-----------------------------+
| input | output |
+=========+=============================+
| None | [['a', 'd', 'd', 'f', 'z']] |
+---------+-----------------------------+
Constants: None
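Note that the UDF is passed as a string, presumably so the program stays easy to serialize. Conceptually, the extractor turns the string back into a callable and applies it, roughly as in this sketch (which mirrors the idea, not SAMMO's actual code):

```python
# Sketch of the idea behind LambdaExtractor: the UDF arrives as a string
# and is turned back into a callable before being applied.
udf_source = "lambda x: sorted(x)"
udf = eval(udf_source)  # materialize the function from its source string
result = [udf("zdadf")]
```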
2.2. Calling an LLM#
Perhaps the most important component is GenerateText(), which calls an LLM to generate a response. Using it is fairly straightforward.
first = GenerateText(
    "Hello! My name is Peter and I like horses.",
    system_prompt="Talk like Shakespeare.",
)
Output(first).run(runner)
+---------+--------------------------------------------------------------+
| input | output |
+=========+==============================================================+
| None | Hark! Good morrow, fair Peter! Thy name doth ring sweetly in |
| | mine ears. Dost thou fancy the noble steeds that roam the |
| | fields? Verily, horses are a wondrous creature, full of |
| | grace and strength. Pray, tell me more of thy love for these |
| | majestic beasts. |
+---------+--------------------------------------------------------------+
Constants: None
To pass on history, you can use the history argument:
second = GenerateText("Write a four line poem about my favorite animal.", history=first)
poem = Output(second).run(runner)
poem
+---------+------------------------------------------------------------+
| input | output |
+=========+============================================================+
| None | In fields of green, the horse doth roam, With flowing mane |
| | and spirit bold. A creature fair, a sight to behold, In my |
| | heart, its beauty finds a home. |
+---------+------------------------------------------------------------+
Constants: None
When we plot the call trace, we can see that adding history is reflected in the dependencies:
poem.outputs[0].plot_call_trace()
2.3. Loops#
Loops are only needed if you want to iterate over something other than all of the inputs. More on this in the next section on parallelization.
2.3.1. Static loops#
A common use case for this is when we want to repeat certain operations for a known number of times, e.g., sample LLM responses N times.
N = 5
fruits = [
    GenerateText("Generate the name of a random fruit.", randomness=0.9, seed=i)
    for i in range(N)
]
static_loop = Output(Union(*fruits))
static_loop.run(runner)
+---------+-----------------------------------------------------------+
| input | output |
+=========+===========================================================+
| None | ['Honeydewberry', 'Starfruit', 'Lemonberry', 'Starfruit', |
| | 'Mangosteen.'] |
+---------+-----------------------------------------------------------+
Constants: None
Starfruit wins.
Note
We had to set seed to a different value in each GenerateText instance to disable local caching. Otherwise, we would get the same answer 5 times.
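The caching behavior can be mimicked in plain Python. This is a hypothetical sketch keyed on (prompt, seed), not SAMMO's actual cache implementation; it shows why identical calls collapse into one request while distinct seeds force separate ones.

```python
# Hypothetical sketch of a local cache keyed on (prompt, seed);
# not SAMMO's actual cache implementation.
calls = 0
cache = {}


def generate(prompt, seed=0):
    global calls
    key = (prompt, seed)
    if key not in cache:
        calls += 1  # stands in for a real LLM request
        cache[key] = f"<response {calls}>"
    return cache[key]


same = [generate("fruit") for _ in range(5)]            # only 1 real request
varied = [generate("fruit", seed=i) for i in range(5)]  # distinct requests
```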
static_loop.plot_program()
2.3.2. Dynamic loops#
If you want to loop over all results of a previous layer, you can do this with a ForEach component.
fruits = ExtractRegex(
    GenerateText(
        "Generate a list of 5 fruits. Wrap each fruit with <item> and </item>."
    ),
    r"<item>(.*?)<.?item>",
)
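The regex can be checked in isolation with the standard re module; the response string below is made up for illustration.

```python
import re

# Apply the same pattern as above to a made-up LLM response string.
response = "<item>apple</item><item>banana</item>"
extracted = re.findall(r"<item>(.*?)<.?item>", response)
```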
fruit_blurbs = Output(ForEach(
    "fruit",
    fruits,
    GenerateText(Template("Why is {{fruit}} a good fruit in less than 25 words?")),
))
fruit_desc = fruit_blurbs.run(runner)
fruit_desc.outputs[0].plot_call_trace()
As an aside, this is a case where the static program graph looks different from the call trace. This is because we don’t know how many fruits will actually be generated until the LLM is called.
fruit_blurbs.plot_program()