Search And Transform
This script is an evolution of the “search and replace” feature found in text editors, where the “replace” step has been replaced by an LLM transformation.
It is useful for batch-applying text transformations that are not easily expressed with regular expressions.
For example, when GenAIScript added the ability to pass a single command string to the exec command, we needed to convert all scripts using
host.exec("cmd", ["arg0", "arg1", "arg2"])
to
host.exec(`cmd arg0 arg1 arg2`)
While it’s possible to match this function call with a regular expression
host\.exec\s*\([^,]+,\s*\[[^\]]+\]\s*\)
it’s not easy to formulate the replacement string… unless you can describe it in natural language:
Convert the call to a single string command shell in TypeScript
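As a quick sanity check of the pattern above, the following sketch runs it against a few sample lines. The sample text and the helper name `findExecCalls` are made up for illustration, and the snippet is plain TypeScript that does not depend on GenAIScript.

```typescript
// The pattern from the text: host.exec(<first arg>, [<array of args>])
const execCallRx = /host\.exec\s*\([^,]+,\s*\[[^\]]+\]\s*\)/g

// Hypothetical helper: list every array-form call found in a source string.
function findExecCalls(source: string): string[] {
    // matchAll requires the "g" flag and yields one entry per match
    return Array.from(source.matchAll(execCallRx), (m) => m[0])
}

// Only the first two lines use the array form; the last one is already converted.
const sampleSource = [
    'await host.exec("git", ["diff"])',
    'await host.exec("git", ["log", "--author", author])',
    "await host.exec(`git status`)",
].join("\n")

const execCalls = findExecCalls(sampleSource)
console.log(execCalls) // the two array-form calls
```

Note that `[^,]+` also matches newlines, so on real code the pattern can occasionally span more than one line before it reaches a comma.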
Here are some examples of transformations where the LLM correctly handled variables.
- concatenate the arguments of a function call into a single string
const { stdout } = await host.exec("git", ["diff"])
becomes
const { stdout } = await host.exec(`git diff`)
- concatenate the arguments and use the ${} syntax to interpolate variables
const { stdout: commits } = await host.exec("git", [
    "log",
    "--author",
    author,
    "--until",
    until,
    "--format=oneline",
])
becomes
const { stdout: commits } = await host.exec(`git log --author ${author} --until ${until} --format=oneline`)
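For comparison, here is a rough sketch of what a hand-written conversion of these two cases might look like. The helper `toCommandString` is hypothetical and assumes the arguments have already been split into source-text tokens; it only handles double-quoted literals and bare expressions. It is this kind of case-by-case logic that the natural-language instruction replaces.

```typescript
// Hypothetical helper: build the template literal for host.exec from the
// command name and the source text of each array element.
function toCommandString(cmd: string, args: string[]): string {
    const parts = args.map((arg) => {
        const trimmed = arg.trim()
        // a double-quoted literal contributes its raw text...
        const literal = /^"([^"]*)"$/.exec(trimmed)?.[1]
        // ...anything else is treated as an expression to interpolate
        return literal !== undefined ? literal : "${" + trimmed + "}"
    })
    return "`" + [cmd, ...parts].join(" ") + "`"
}

console.log(toCommandString("git", ['"log"', '"--author"', "author"]))
```

Escaped quotes, single-quoted strings, spread arguments and nested calls are all left out of this sketch.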
Search
The search step uses workspace.grep, which efficiently searches files for a pattern (it is the same search engine that powers the Visual Studio Code search).
const { pattern, glob } = env.vars
const patternRx = new RegExp(pattern, "g")
const { files } = await workspace.grep(patternRx, { glob })
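To make the shape of the search step concrete, here is a minimal in-memory stand-in. `FileEntry` and `grepInMemory` are hypothetical names, not part of GenAIScript, and the sketch skips glob filtering entirely. It only assumes, as the snippet above suggests, that the search returns a list of files carrying a `filename`.

```typescript
interface FileEntry {
    filename: string
    content: string
}

// Hypothetical stand-in for a workspace search: keep the files whose
// content contains at least one match of the pattern.
function grepInMemory(
    entries: FileEntry[],
    pattern: RegExp
): { files: FileEntry[] } {
    // String.prototype.search ignores lastIndex, so reusing one "g" regex
    // across many files does not skip matches (unlike repeated pattern.test)
    return { files: entries.filter((f) => f.content.search(pattern) >= 0) }
}

const demoEntries: FileEntry[] = [
    { filename: "a.mjs", content: 'await host.exec("git", ["diff"])' },
    { filename: "b.mjs", content: "await host.exec(`git status`)" },
]
const demoSearch = grepInMemory(
    demoEntries,
    /host\.exec\s*\([^,]+,\s*\[[^\]]+\]\s*\)/g
)
console.log(demoSearch.files.map((f) => f.filename))
```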
Compute Transforms
The second step is to apply the regular expression to the file content and pre-compute the LLM transformation of each match using an inline prompt.
const { transform } = env.vars
...
const patches = {} // map of match -> transformed
for (const file of files) {
    const { content } = await workspace.readText(file.filename)
    for (const match of content.matchAll(patternRx)) {
        const res = await runPrompt(
            (ctx) => {
                ctx.$`
            ## Task

            Your task is to transform the MATCH with the following TRANSFORM.
            Return the transformed text.
            - do NOT add enclosing quotes.

            ## Context
            `
                ctx.def("MATCHED", match[0])
                ctx.def("TRANSFORM", transform)
            },
            { label: match[0], system: [], cache: "search-and-transform" }
        )
        ...
Since the LLM sometimes decides to wrap the answer in a code fence, we take the content of the first fence when there is one and fall back to the raw text otherwise.
        ...
        // fences may be empty, so guard the first element as well
        const transformed = res.fences?.[0]?.content ?? res.text
        patches[match[0]] = transformed
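The unwrapping step can be sketched on plain strings, without any GenAIScript types. `unwrapAnswer` is a hypothetical helper that approximates the "first fence, else raw text" rule; how GenAIScript itself parses fences is not shown in this article, so treat the regular expression below as an assumption.

```typescript
// Three backticks, built at runtime so the sample is easy to embed in docs.
const FENCE_MARK = "`".repeat(3)
// opening fence + optional language tag, body (lazy), closing fence
const FENCE_RX = new RegExp(
    FENCE_MARK + "[^\\n]*\\n([\\s\\S]*?)\\n?" + FENCE_MARK
)

// Hypothetical helper: return the body of the first fence, or the trimmed
// raw text when the answer contains no fence.
function unwrapAnswer(text: string): string {
    return FENCE_RX.exec(text)?.[1] ?? text.trim()
}

const fencedAnswer = FENCE_MARK + "ts\nhost.exec(`git diff`)\n" + FENCE_MARK
console.log(unwrapAnswer(fencedAnswer))
console.log(unwrapAnswer("  host.exec(`git diff`)\n"))
```

Both calls print the same bare `host.exec` expression, which is what gets stored in the patch map.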
Transform
Finally, with the transforms pre-computed, we apply a final regex replace to patch the old file content with the transformed strings.
    const newContent = content.replace(
        patternRx,
        (match) => patches[match] ?? match
    )
    await workspace.writeText(file.filename, newContent)
}
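The patching step is easy to try in isolation. In this sketch `applyPatches` is a hypothetical wrapper around the same `replace` call, using a `Map` in place of the plain object and a hand-filled patch standing in for the LLM output.

```typescript
// Replace every match that has a pre-computed patch; leave the rest as-is.
function applyPatches(
    content: string,
    pattern: RegExp,
    patches: Map<string, string>
): string {
    // The callback's return value is inserted literally, so "$" sequences
    // in the transformed text are not read as replacement patterns.
    return content.replace(pattern, (match) => patches.get(match) ?? match)
}

const patchRx = /host\.exec\s*\([^,]+,\s*\[[^\]]+\]\s*\)/g
const demoPatches = new Map<string, string>([
    ['host.exec("git", ["diff"])', "host.exec(`git diff`)"],
])
const patched = applyPatches(
    'a = await host.exec("git", ["diff"])\nb = await host.exec("ls", ["-l"])',
    patchRx,
    demoPatches
)
console.log(patched)
```

The second call has no entry in the map, so it is left unchanged; this is the same fallback the script relies on when a transformation is missing.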
Parameters
The script takes three parameters: a file glob, a pattern to search for, and an LLM transformation to apply.
We declare these parameters in the script metadata and extract them from the env.vars object.
script({
    ...,
    parameters: {
        glob: {
            type: "string",
            description: "The glob pattern to filter files",
            default: "*",
        },
        pattern: {
            type: "string",
            description: "The text pattern (regular expression) to search for",
        },
        transform: {
            type: "string",
            description: "The LLM transformation to apply to the match",
        },
    },
})
const { pattern, glob, transform } = env.vars
Full source
script({
    title: "Search and transform",
    description:
        "Search for a pattern in files and apply an LLM transformation to the match",
    parameters: {
        glob: {
            type: "string",
            description: "The glob pattern to filter files",
        },
        pattern: {
            type: "string",
            description: "The text pattern (regular expression) to search for",
        },
        transform: {
            type: "string",
            description: "The LLM transformation to apply to the match",
        },
    },
})

let { pattern, glob, transform } = env.vars
if (!glob)
    glob =
        (await host.input(
            "Enter the glob pattern to filter files (default: *)"
        )) || "*"
if (!pattern)
    pattern = await host.input(
        "Enter the pattern to search for (regular expression)"
    )
if (!pattern) cancel("pattern is missing")
const patternRx = new RegExp(pattern, "g")

if (!transform)
    transform = await host.input(
        "Enter the LLM transformation to apply to the match"
    )
if (!transform) cancel("transform is missing")

const { files } = await workspace.grep(patternRx, { glob })
// cached computed transformations
const patches = {}
for (const file of files) {
    console.log(file.filename)
    const { content } = await workspace.readText(file.filename)

    // skip binary files
    if (!content) continue

    // compute transforms
    for (const match of content.matchAll(patternRx)) {
        console.log(`  ${match[0]}`)
        // reuse the transformation if this exact text was already seen
        if (patches[match[0]]) continue

        const res = await runPrompt(
            (_) => {
                _.$`
            ## Task

            Your task is to transform the MATCH with the following TRANSFORM.
            Return the transformed text.
            - do NOT add enclosing quotes.

            ## Context
            `
                _.def("MATCHED", match[0])
                _.def("TRANSFORM", transform, {
                    detectPromptInjection: "available",
                })
            },
            {
                label: match[0],
                system: [
                    "system.assistant",
                    "system.safety_jailbreak",
                    "system.safety_harmful_content",
                ],
                cache: "search-and-transform",
            }
        )

        // fences may be empty, so guard the first element as well
        const transformed = res.fences?.[0]?.content ?? res.text
        if (transformed) patches[match[0]] = transformed
        console.log(`  ${match[0]} -> ${transformed ?? "?"}`)
    }

    // apply transforms
    const newContent = content.replace(
        patternRx,
        (match) => patches[match] ?? match
    )

    // save results if file content is modified
    if (content !== newContent)
        await workspace.writeText(file.filename, newContent)
}
To run this script, you can use the --vars option to pass the pattern and the transform.
genaiscript st --vars 'pattern=host\.exec\s*\([^,]+,\s*\[[^\]]+\]\s*\)' 'transform=Convert the call to a single string command shell in TypeScript'