4. Parallelization

SAMMO automatically parallelizes runs across all rows of input data.

# %load -r 3:18 _init.py
import pathlib
import sammo
from sammo.runners import OpenAIChat
from sammo.base import Template, EvaluationScore
from sammo.components import Output, GenerateText, ForEach, Union
from sammo.extractors import ExtractRegex
from sammo.data import DataTable
import json
import requests
import os

if "OPENAI_API_KEY" not in os.environ:
    raise ValueError("Please set the environment variable 'OPENAI_API_KEY'.")

_ = sammo.setup_logger("WARNING")  # we're only interested in warnings for now
runner = OpenAIChat(
    model_id="gpt-3.5-turbo",
    api_config={"api_key": os.environ["OPENAI_API_KEY"]},
    cache=os.getenv("CACHE_FILE", "cache.tsv"),
)
numbers = list(range(1, 6))  # inputs 1 through 5
spp = Output(GenerateText(Template("Output as a latin numeral: {{input}}")))
spp.run(runner, DataTable(numbers))  # one query per input row
minibatches[#################################################################################]5/5[00:00<00:00, 333.33it/s]
+---------+----------+
| input   | output   |
+=========+==========+
| 1       | I        |
+---------+----------+
| 2       | II       |
+---------+----------+
| 3       | III      |
+---------+----------+
| 4       | IV       |
+---------+----------+
| 5       | V        |
+---------+----------+
Constants: None

Here, SAMMO automatically runs the queries for the five inputs in parallel while adhering to the runner's rate limit (by default, 2 queries per second). We can change this limit when constructing the runner. We can also skip constructing the DataTable and pass the list directly.

runner = OpenAIChat(
    model_id="gpt-3.5-turbo",
    api_config={"api_key": os.environ["OPENAI_API_KEY"]},
    cache=os.getenv("CACHE_FILE", "cache.tsv"),
    rate_limit=6,  # raise the limit from the default of 2 queries per second
)
spp.run(runner, numbers)  # pass the list directly instead of wrapping it in a DataTable
minibatches[###################################################################################]5/5[00:00<??:??, 0.00it/s]
+---------+----------+
| input   | output   |
+=========+==========+
| 1       | I        |
+---------+----------+
| 2       | II       |
+---------+----------+
| 3       | III      |
+---------+----------+
| 4       | IV       |
+---------+----------+
| 5       | V        |
+---------+----------+
Constants: None
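
To picture what rate-limited parallelism means in general, here is a minimal standalone sketch using only the Python standard library. It is not SAMMO's implementation: `fake_query` and `run_throttled` are hypothetical names invented for this illustration, and the simple staggered-start strategy is an assumption chosen for brevity. All queries are scheduled at once, but their start times are spaced so that no more than `rate_limit` of them begin per second.

```python
import asyncio
import time


async def fake_query(x: int) -> str:
    """Hypothetical stand-in for an LLM call; returns after a short delay."""
    await asyncio.sleep(0.01)
    return f"result-{x}"


async def run_throttled(inputs, rate_limit: float):
    """Run all queries concurrently, starting at most `rate_limit` per second."""
    interval = 1.0 / rate_limit  # minimum spacing between query starts
    start_times = []

    async def worker(i: int, x: int) -> str:
        # Delay the i-th query so that starts are `interval` seconds apart.
        await asyncio.sleep(i * interval)
        start_times.append(time.monotonic())
        return await fake_query(x)

    # gather() returns results in input order, regardless of completion order.
    results = await asyncio.gather(*(worker(i, x) for i, x in enumerate(inputs)))
    return results, start_times


# A high limit keeps the demo fast; lower it to see the starts spread out.
results, start_times = asyncio.run(run_throttled(range(1, 6), rate_limit=50))
print(results)
```

Inside a notebook, where an event loop is already running, call `await run_throttled(...)` instead of `asyncio.run(...)`. A production-grade limiter would typically also cap the number of in-flight requests and handle retries, which this sketch leaves out.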

That’s it! More complex throttling options are covered under special topics.