Blog

LLM Agents

GenAIScript defines an agent as a tool that runs an inline prompt to accomplish a task. The agent LLM is typically augmented with additional tools.

In this blog post, we’ll walk through building a user interaction agent: an agent that lets the LLM ask the user questions.

script({
    tools: ["agent_user_input"],
})
$`
Imagine a funny question and ask the user to answer it.
From the answer, generate 3 possible answers and ask the user to select the correct one.
Ask the user if the answer is correct.
`

Let’s look at how this “agent that can ask questions to the user” is built.

You can find the full script on GitHub right here.

Metadata

The script is written in JavaScript. It starts by declaring the metadata to make the script available as a system script, which can be reused in other scripts.

system.agent_user_input.genai.mjs
system({
    title: "Agent that can ask questions to the user.",
})

This call sets the title of the system script, making it clear that it’s intended to interact with the user by asking questions.

title and description

The defAgent function defines the behavior of our agent. Its first two arguments are an agent identifier and a description. Both matter, because the “host” LLM relies on them to decide when to call this agent.

defAgent(
    "user_input",
    "Ask user for input to confirm, select or answer a question.",
    ...

GenAIScript automatically appends a description of every tool used by the agent prompt, so you don’t need to list them in the description yourself.

prompt

The third argument is a string or a function to craft prompt instructions for the agent LLM call. The agent implementation already contains generic prompting to make the prompt behave like an agent, but you can add more to specify a role, tone, and dos and don’ts.

defAgent(
...,
`You are an agent that can ask questions to the user and receive answers. Use the tools to interact with the user.
- the message should be very clear. Add context from the conversation as needed.`,
...

model configuration

The last argument is a set of model options, similar to runPrompt, to configure the LLM call made by the agent. In particular, this is where you list the tools that the agent can use.

defAgent(
    ..., {
        tools: ["user_input"],
    }
)
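
Putting the fragments from the previous sections side by side gives a sketch of the whole call. It is assembled only from the snippets in this post, so the full script on GitHub may differ in wording or options. The constant names (agentId, agentDescription, agentInstructions, agentOptions) and the typeof guard are illustrative additions, not part of GenAIScript; the guard is only there so the snippet also loads outside the GenAIScript runtime, where defAgent is not defined.

```javascript
// Assembled from the fragments above; the constant names are illustrative.
const agentId = "user_input"
const agentDescription =
    "Ask user for input to confirm, select or answer a question."
const agentInstructions = `You are an agent that can ask questions to the user and receive answers. Use the tools to interact with the user.
- the message should be very clear. Add context from the conversation as needed.`
const agentOptions = {
    // tools available to the agent LLM call
    tools: ["user_input"],
}

// defAgent is a global provided by the GenAIScript runtime.
// The guard (an addition for this sketch) skips the call anywhere else.
if (typeof defAgent === "function") {
    defAgent(agentId, agentDescription, agentInstructions, agentOptions)
}
```

Inside a real system script the guard is unnecessary and the four values would normally be passed inline, as in the fragments above.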

How to use the agent

The agent is used like any other tool by referencing it in the script options.

script({
    tools: ["agent_user_input"]
})
...

Let’s try it!

Let’s try the agent with:

script({
    tools: ["agent_user_input"],
})
$`Imagine a funny question and ask the user to answer it.
From the answer, generate 3 possible answers and ask the user to select the correct one.
Ask the user if the answer is correct.`

and let’s look at the results…

prompting openai:gpt-4o (~150 tokens)
agent user_input: What would be the most unexpected thing to find inside a refrigerator?
run prompt agent user_input
prompting openai:gpt-4o (~234 tokens)
user input text: What would be the most unexpected thing to find inside a refrigerator?

✔ What would be the most unexpected thing to find inside a refrigerator? toaster

prompting openai:gpt-4o (~240 tokens)
toaster
prompting openai:gpt-4o (~156 tokens)
agent user_input: Based on your answer, which of the following would also be unexpected to find inside a refrigerator?
1. A television
2. A penguin
3. A snowman
Please select the correct answer.
run prompt agent user_input
prompting openai:gpt-4o (~263 tokens)
user input select: Based on your answer, which of the following would also be unexpected to find inside a refrigerator?

✔ Based on your answer, which of the following would also be unexpected to find inside a refrigerator? A television

prompting openai:gpt-4o (~269 tokens)
A television
prompting openai:gpt-4o (~162 tokens)
agent user_input: Is your selection of 'A television' the correct unexpected item to find inside a refrigerator?
run prompt agent user_input
prompting openai:gpt-4o (~239 tokens)
user input confirm: Is your selection of 'A television' the correct unexpected item to find inside a refrigerator?

✔ Is your selection of 'A television' the correct unexpected item to find inside a refrigerator? yes

prompting openai:gpt-4o (~244 tokens)
true
prompting openai:gpt-4o (~167 tokens)
Great choice! A television inside a refrigerator would indeed be quite unexpected.
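
The log alternates between two levels: the host LLM calls the agent, and the agent’s own LLM call then picks a user input tool. That nesting can be modeled in plain JavaScript. This is a conceptual sketch, not GenAIScript’s implementation: registerTool, registerAgent, fakeAgentLlm, the user_input_text tool name, and the canned answer are all hypothetical, and the agent_ prefix is inferred from the snippets above, where the agent defined as user_input is referenced as agent_user_input.

```javascript
// Conceptual model only; every name here is hypothetical.
const toolRegistry = new Map()
const trace = []

function registerTool(name, handler) {
    toolRegistry.set(name, handler)
}

// An agent is itself a tool: calling it runs a nested "LLM" step
// that may call the tools listed for the agent.
function registerAgent(id, { tools, llm }) {
    registerTool(`agent_${id}`, (query) => {
        trace.push(`agent ${id}: ${query}`)
        const { tool, args } = llm(query, tools)
        trace.push(`${tool}: ${args}`)
        return toolRegistry.get(tool)(args)
    })
}

// Stand-in for the user typing an answer at the terminal.
registerTool("user_input_text", (question) => "toaster")

// Stand-in for the agent LLM: always forwards the query to the first tool.
const fakeAgentLlm = (query, tools) => ({ tool: tools[0], args: query })

registerAgent("user_input", { tools: ["user_input_text"], llm: fakeAgentLlm })

// The host LLM would issue this tool call after reading the agent description.
const answer = toolRegistry.get("agent_user_input")(
    "What would be the most unexpected thing to find inside a refrigerator?"
)
console.log(answer)
console.log(trace.join("\n"))
```

In the real run, each of these steps is an LLM request, which is why the log shows a “prompting openai:gpt-4o” line before and after every tool call.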

Search and Transform

Have you ever found yourself in a situation where you need to search through multiple files in your project, find a specific pattern, and then apply a transformation to it? It can be a tedious task, but fear not! In this blog post, I’ll walk you through a GenAIScript that does just that, automating the process and saving you time. 🕒💡

For example, when GenAIScript added the ability to pass a single command string to the exec command, we needed to convert all scripts using

host.exec("cmd", ["arg0", "arg1", "arg2"])

to

host.exec(`cmd arg0 arg1 arg2`)

The Search And Transform guide covers the details of this new approach…
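
To make the before/after shape of this particular rewrite concrete, the simplest case can be expressed as a regular expression in plain JavaScript. This is only an illustration and not a substitute for the approach the guide describes: the function name toCommandString is made up for this example, and the pattern assumes double-quoted string literals with no spaces, escapes, or nested brackets.

```javascript
// Illustrative only: rewrites host.exec("cmd", ["a", "b"]) into host.exec(`cmd a b`).
// Assumes plain double-quoted literals; anything else is left untouched.
function toCommandString(source) {
    const callPattern = /host\.exec\(\s*"([^"]+)"\s*,\s*\[([^\]]*)\]\s*\)/g
    return source.replace(callPattern, (match, cmd, argList) => {
        // pull each quoted argument out of the array literal
        const args = [...argList.matchAll(/"([^"]*)"/g)].map((m) => m[1])
        return "host.exec(`" + [cmd, ...args].join(" ") + "`)"
    })
}

const before = 'await host.exec("cmd", ["arg0", "arg1", "arg2"])'
console.log(toCommandString(before))
```

Arguments that contain spaces or quotes would need shell quoting in the command string, which this sketch does not attempt.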

Automatic Web Page Content Analysis

In this blog post, we’ll dive into a practical example showcasing how to leverage GenAIScript for automatic web page content analysis. GenAIScript uses the Playwright browser automation library, which lets scripts load, interact with, and inspect web pages.

Step-by-Step Explanation of the Code

The following snippet provides a concise and effective way to analyze a web page’s content using GenAIScript:

const page = await host.browse("https://bing.com")
const screenshot = await page.screenshot()
defImages(screenshot, { maxWidth: 800 })
const text = parsers.HTMLtoMarkdown(await page.content())
def("PAGE_TEXT", text)
$`Analyze the content of the page and provide insights.`

Let’s break down what each line of this script does:

1. Navigating to a Web Page

const page = await host.browse("https://bing.com")

This line navigates to the specified URL (https://bing.com). The host.browse function starts a browser session and returns a page object for further interactions.

2. Taking a Screenshot

const screenshot = await page.screenshot()

Here, the script captures a screenshot of the current view of the page. This is particularly useful for archiving or visual analysis.

3. Defining Images for Analysis

defImages(screenshot, { maxWidth: 800 })

After capturing the screenshot, this line adds the image to the prompt so that the LLM can look at it. The maxWidth option limits the image width to 800 pixels, which keeps the image small before it is sent to the model.

4. Extracting Text Content

const text = parsers.HTMLtoMarkdown(await page.content())

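This line retrieves the page’s HTML with page.content() and converts it to Markdown with parsers.HTMLtoMarkdown. The result is a text version of the page that is easier for an LLM to read than raw HTML, which is useful for content audits or textual analysis.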

5. Storing Text for Further Use

def("PAGE_TEXT", text)

The extracted text is then added to the prompt under the identifier PAGE_TEXT, so the instructions sent to the LLM can refer to it by name.

6. Analyzing the Content

$`Analyze the content of the page and provide insights.`

Finally, this line adds the instruction text to the prompt. The LLM receives the instruction together with the screenshot and PAGE_TEXT and returns its analysis. This is where the combination of automation and AI in GenAIScript pays off, enabling detailed analysis without manual intervention.
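
It can help to picture def and $ as functions that only collect pieces of a prompt, with the model request happening once the script has finished. The plain-JavaScript model below illustrates that idea. It is not GenAIScript’s implementation: createPromptSketch, render, and the rendered layout are hypothetical, and images are left out.

```javascript
// Hypothetical model of prompt assembly; not GenAIScript's real code.
function createPromptSketch() {
    const parts = []
    return {
        // def: add a named block that the instructions can refer to
        def(name, content) {
            parts.push(`${name}:\n${content}`)
        },
        // $: tagged template that appends instruction text
        $(strings, ...values) {
            const text = strings.reduce(
                (acc, s, i) =>
                    acc + s + (i < values.length ? String(values[i]) : ""),
                ""
            )
            parts.push(text)
        },
        // render: the text that would be sent to the model in one request
        render() {
            return parts.join("\n\n")
        },
    }
}

const sketch = createPromptSketch()
sketch.def("PAGE_TEXT", "# Example page\nSome extracted text.")
sketch.$`Analyze the content of the page and provide insights.`
console.log(sketch.render())
```

Seen this way, the six lines of the script do not each call a model; they prepare a single request whose answer is the analysis.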

Conclusion

With a simple yet powerful script like the one discussed, GenAIScript makes it feasible to automate the process of web page content analysis. Whether you’re conducting competitive analysis, performing content audits, or simply archiving web pages, GenAIScript offers a scalable and efficient solution.