Examples Catalog
Want to Contribute?
We welcome contributions to the examples catalog! Please refer to the Contributing guide for more details.
- **APO room selector**: Prompt-optimize a room-booking agent with the built-in APO algorithm, then contrast it with the write-your-own-algorithm and debugging workflows in the tutorials. Pairs well with the Train the First Agent how-to and the Write the First Algorithm guide.
- **Azure OpenAI SFT**: Run a supervised fine-tuning loop against Azure OpenAI: roll out the capital-lookup agent, turn traces into JSONL (see the sketch after this catalog), launch fine-tunes, and redeploy the resulting checkpoints through the Azure CLI.
- **Calc-X VERL math**: VERL-based reinforcement learning setup for a math-reasoning agent that uses AutoGen plus an MCP calculator tool (a sketch of such a tool server follows the catalog) to solve Calc-X problems end to end.
- **Claude Code SWE-bench**: Instrumented driver that runs Anthropic's Claude Code workflow on SWE-bench instances while streaming traces through Agent-lightning; it supports hosted vLLM, the official Anthropic API, or any OpenAI-compatible backend, and emits datasets for downstream tuning.
- **Minimal building blocks**: Bite-sized scripts that isolate Agent-lightning primitives (e.g., LightningStore usage, LLM proxying, a minimal vLLM host) so you can study each part before composing larger workflows; a client-side sketch for the vLLM host follows the catalog.
- **RAG (MuSiQue)**: Retrieval-Augmented Generation pipeline that preps a Wikipedia retriever via MCP and trains a MuSiQue QA agent with GRPO. Documented for historical reference (verified on Agent-lightning v0.1.x).
- **Search-R1 RL**: Reproduction of the Search-R1 workflow that prepares its own retrieval backend, runs the rollout script, and coordinates GRPO-style training without extra orchestration layers (last validated on v0.1.x).
- **Spider SQL agent**: LangGraph-powered text-to-SQL workflow for the Spider benchmark, combining LangChain tooling with Agent-lightning rollouts (a graph skeleton follows the catalog); follow along with the how-to for training SQL agents.
- **Tinker integration**: Adapter package (agl_tinker) plus sample CrewAI/OpenAI agents that feed Agent-lightning traces into Tinker's reinforcement-learning backend for both toy and 20-Questions-style workflows.
- **Unsloth SFT**: Supervised fine-tuning loop that ranks math-agent rollouts, fine-tunes with Unsloth's 4-bit LoRA stack, and mirrors the Fine-tune with Unsloth recipe; a condensed sketch of the LoRA setup follows the catalog.
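
The trace-to-JSONL step in the Azure OpenAI SFT example boils down to writing chat-format records, one per line. A minimal sketch, assuming a hypothetical rollout structure with a `messages` list and a scalar `reward` (the example's real trace schema may differ):

```python
import json

# Hypothetical rollout structure: a reward plus the chat turns it produced.
rollouts = [
    {
        "reward": 1.0,
        "messages": [
            {"role": "user", "content": "What is the capital of France?"},
            {"role": "assistant", "content": "Paris."},
        ],
    },
]

# Keep only rewarded rollouts and emit the chat-format JSONL that
# Azure OpenAI fine-tuning expects: one {"messages": [...]} object per line.
with open("train.jsonl", "w", encoding="utf-8") as f:
    for rollout in rollouts:
        if rollout["reward"] > 0:
            f.write(json.dumps({"messages": rollout["messages"]}) + "\n")
```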
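
An MCP calculator tool like the one the Calc-X example attaches can be sketched with the MCP Python SDK's `FastMCP` helper; the server name and tool signatures here are illustrative, not the example's actual ones:

```python
# pip install mcp
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("calculator")  # illustrative server name

@mcp.tool()
def add(a: float, b: float) -> float:
    """Return the sum of two numbers."""
    return a + b

@mcp.tool()
def multiply(a: float, b: float) -> float:
    """Return the product of two numbers."""
    return a * b

if __name__ == "__main__":
    # Serves over stdio by default, so an agent framework can
    # spawn it as a subprocess tool server.
    mcp.run()
```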
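
Talking to a minimal vLLM host from the building-blocks scripts works like talking to any OpenAI-compatible endpoint. A client-side sketch, assuming a server on localhost port 8000 serving an arbitrary model (both are assumptions for illustration):

```python
from openai import OpenAI

# Assumes a vLLM OpenAI-compatible server was started separately,
# e.g. `vllm serve Qwen/Qwen2.5-1.5B-Instruct --port 8000`.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="Qwen/Qwen2.5-1.5B-Instruct",  # must match the served model
    messages=[{"role": "user", "content": "Say hello in one word."}],
)
print(response.choices[0].message.content)
```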
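
The Spider agent's LangGraph wiring follows a write-then-execute shape. A stubbed skeleton of that shape (node names and stub logic are hypothetical; the real example prompts an LLM and executes against a Spider database):

```python
from typing import TypedDict

from langgraph.graph import StateGraph, START, END


class SQLState(TypedDict):
    question: str
    sql: str
    result: str


def write_sql(state: SQLState) -> dict:
    # The real example prompts an LLM here; stubbed to stay self-contained.
    return {"sql": f"SELECT count(*) FROM singer  -- for: {state['question']}"}


def run_sql(state: SQLState) -> dict:
    # The real example runs the query against a Spider database; stubbed here.
    return {"result": f"(rows returned by: {state['sql']})"}


graph = StateGraph(SQLState)
graph.add_node("write_sql", write_sql)
graph.add_node("run_sql", run_sql)
graph.add_edge(START, "write_sql")
graph.add_edge("write_sql", "run_sql")
graph.add_edge("run_sql", END)
app = graph.compile()

print(app.invoke({"question": "How many singers are there?", "sql": "", "result": ""}))
```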
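
The Unsloth fine-tune step follows Unsloth's standard 4-bit LoRA pattern. A condensed sketch in that pattern; the model name, hyperparameters, and data file are illustrative, and the `SFTTrainer` keywords shown match older `trl` releases (newer ones move them into `SFTConfig`):

```python
from unsloth import FastLanguageModel  # import unsloth before transformers/trl

from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer

# Load a 4-bit base model and wrap it with LoRA adapters (illustrative choices).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/llama-3-8b-bnb-4bit",
    max_seq_length=2048,
    load_in_4bit=True,
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)

# Expects a JSONL file with a "text" field per record.
dataset = load_dataset("json", data_files="train.jsonl", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        output_dir="outputs",
        per_device_train_batch_size=2,
        learning_rate=2e-4,
        max_steps=60,
    ),
)
trainer.train()
```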