Search and transform

Search And Replace is a powerful tool in the developer toolbelt that can save you time and effort… if you can formulate the right regular expression.

Search and Transform is a twist on the same concept, but an LLM performs the transformation instead of a simple string replacement.

👩‍💻 Understanding the Script Code

script({
    title: "Search and transform",
    description:
        "Search for a pattern in files and apply an LLM transformation to the match",
    parameters: {
        glob: {
            type: "string",
            description: "The glob pattern to filter files",
            default: "*",
        },
        pattern: {
            type: "string",
            description: "The text pattern (regular expression) to search for",
        },
        transform: {
            type: "string",
            description: "The LLM transformation to apply to the match",
        },
    },
})

The script starts by defining its purpose and parameters using the script function. Here, we define the title, description, and the three parameters the script will need: glob to specify the files, pattern for the text to search for, and transform for the desired transformation.

Extracting and Validating Parameters

const { pattern, glob, transform } = env.vars
if (!pattern) cancel("pattern is missing")
const patternRx = new RegExp(pattern, "g")

if (!transform) cancel("transform is missing")

Next, we extract the pattern, glob, and transform parameters from env.vars (the script variables) and validate them. If pattern or transform is missing, the script cancels execution. We also compile the pattern into a regular expression object with the global ("g") flag for later use.
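The validation step can be sketched in plain JavaScript, outside of GenAIScript. `compilePattern` is a hypothetical helper introduced here for illustration (it throws instead of calling `cancel`, which is only available inside a script). It also shows why the `g` flag matters: `String.prototype.matchAll` rejects a non-global regular expression with a `TypeError`.

```javascript
// Hypothetical helper (not part of GenAIScript): validate and compile a
// user-supplied pattern in the same way the script does.
function compilePattern(pattern) {
    if (!pattern) throw new Error("pattern is missing")
    // The "g" flag is required by String.prototype.matchAll and makes
    // String.prototype.replace substitute every occurrence.
    return new RegExp(pattern, "g")
}

const rx = compilePattern("TODO\\((\\w+)\\)")
const text = "TODO(alice) fix this; TODO(bob) and that"
const found = [...text.matchAll(rx)].map((m) => m[0])
console.log(found) // [ 'TODO(alice)', 'TODO(bob)' ]
```

Note that the pattern arrives as a string, so backslashes need to be escaped according to the rules of whatever shell or prompt supplies it.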

Searching for Files and Matches

const { files } = await workspace.grep(patternRx, glob)

Here, we use the grep function from the workspace API to search for files that match the glob pattern and contain the regex pattern.

Transforming Matches

// cached computed transformations
const patches = {}
for (const file of files) {
    console.log(file.filename)
    const { content } = await workspace.readText(file.filename)
    // skip binary files
    if (!content) continue
    // compute transforms
    for (const match of content.matchAll(patternRx)) {
        console.log(`  ${match[0]}`)
        if (patches[match[0]]) continue

We initialize an object called patches to store the transformations. Then, we loop through each file, read its content, and skip binary files. For each match found in the file’s content, we check if we’ve already computed a transformation for this match to avoid redundant work.
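The effect of the cache can be seen in a small standalone sketch. `fakeTransform` is a hypothetical, synchronous stand-in for the LLM call (the real call is asynchronous), and the sketch uses a `Map` rather than a plain object; that is a design choice of this example, not of the original script, made so that a match such as `constructor` cannot collide with an inherited object property.

```javascript
// Hypothetical stand-in for the LLM call; counts how often it is invoked.
let calls = 0
function fakeTransform(text) {
    calls++
    return text.toUpperCase()
}

// Compute one transformation per unique match.
function computePatches(content, patternRx) {
    const patches = new Map()
    for (const match of content.matchAll(patternRx)) {
        if (patches.has(match[0])) continue // already transformed, skip
        patches.set(match[0], fakeTransform(match[0]))
    }
    return patches
}

const cached = computePatches("foo bar foo baz foo", /foo|bar/g)
console.log([...cached]) // [ [ 'foo', 'FOO' ], [ 'bar', 'BAR' ] ]
console.log(calls) // 2
```

Four matches are found in the sample text, but the transform runs only twice because repeated matches reuse the cached result.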

Generating Prompts for Transformations

const res = await runPrompt(
    (_) => {
        _.$`
            ## Task

            Your task is to transform the MATCHED text using the following TRANSFORM.
            Return the transformed text.
            - do NOT add enclosing quotes.

            ## Context
            `
        _.def("MATCHED", match[0])
        _.def("TRANSFORM", transform)
    },
    { label: match[0], system: [], cache: "search-and-transform" }
)

For each unique match, we generate a prompt using the runPrompt function. In the prompt, we define the task and context for the transformation, specifying that the transformed text should be returned without enclosing quotes. We also define the matched text and the transformation to apply.

Applying the Transformation

        // prefer the first fenced block; fall back to the raw text when there is none
        const transformed = res.fences?.[0]?.content ?? res.text
        if (transformed) patches[match[0]] = transformed
        console.log(`  ${match[0]} -> ${transformed ?? "?"}`)
    }
    // apply transforms
    const newContent = content.replace(
        patternRx,
        (match) => patches[match] ?? match
    )

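We then extract the transformed text from the prompt result, preferring the content of the first fenced block and falling back to the raw response text, and store it in the patches object. Finally, we apply the transformations to the file content using String.prototype.replace with a callback; any match without a computed patch is left unchanged.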
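Both steps can be tried in isolation with plain JavaScript. In this sketch, `pickTransformed` is a hypothetical helper, and the result objects are hand-written stand-ins that only mimic the `fences` and `text` fields read by the script. The helper uses `?.` after `[0]` as well, so that an empty `fences` array falls back to `text` instead of throwing.

```javascript
// Hypothetical helper: prefer the first fenced block, else the raw text.
function pickTransformed(res) {
    return res.fences?.[0]?.content ?? res.text
}

console.log(pickTransformed({ fences: [{ content: "color" }], text: "raw" })) // color
console.log(pickTransformed({ fences: [], text: "color" })) // color

// Apply the cached patches; matches without a patch are left unchanged.
const samplePatches = { colour: "color" }
const sampleRx = /colou?r|flavou?r/g
const sampleContent = "colour, flavour, colour"
const sampleResult = sampleContent.replace(
    sampleRx,
    (m) => samplePatches[m] ?? m
)
console.log(sampleResult) // color, flavour, color
```

The callback form of `replace` receives each matched substring, so the lookup key is the same string that was used when the patch was stored.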

Saving the Changes

    if (content !== newContent)
        await workspace.writeText(file.filename, newContent)
}

If the file content has changed after applying the transformations, we save the updated content back to the file.

Running the Script

To run this script, you’ll need the GenAIScript CLI. Check out the installation guide if you need to set it up. Once you have the CLI, run the script by executing:

genaiscript run st

Full source (GitHub)

script({
    title: "Search and transform",
    description:
        "Search for a pattern in files and apply an LLM transformation to the match",
    parameters: {
        glob: {
            type: "string",
            description: "The glob pattern to filter files",
        },
        pattern: {
            type: "string",
            description: "The text pattern (regular expression) to search for",
        },
        transform: {
            type: "string",
            description: "The LLM transformation to apply to the match",
        },
    },
})

let { pattern, glob, transform } = env.vars
if (!glob)
    glob =
        (await host.input(
            "Enter the glob pattern to filter files (default: *)"
        )) || "*"
if (!pattern)
    pattern = await host.input(
        "Enter the pattern to search for (regular expression)"
    )
if (!pattern) cancel("pattern is missing")
const patternRx = new RegExp(pattern, "g")

if (!transform)
    transform = await host.input(
        "Enter the LLM transformation to apply to the match"
    )
if (!transform) cancel("transform is missing")

const { files } = await workspace.grep(patternRx, { glob })
// cached computed transformations
const patches = {}
for (const file of files) {
    console.log(file.filename)
    const { content } = await workspace.readText(file.filename)

    // skip binary files
    if (!content) continue

    // compute transforms
    for (const match of content.matchAll(patternRx)) {
        console.log(`  ${match[0]}`)
        if (patches[match[0]]) continue

        const res = await runPrompt(
            (_) => {
                _.$`
            ## Task

            Your task is to transform the MATCHED text with the following TRANSFORM.
            Return the transformed text.
            - do NOT add enclosing quotes.

            ## Context
            `
                _.def("MATCHED", match[0])
                _.def("TRANSFORM", transform, {
                    detectPromptInjection: "available",
                })
            },
            {
                label: match[0],
                system: [
                    "system.assistant",
                    "system.safety_jailbreak",
                    "system.safety_harmful_content",
                ],
                cache: "search-and-transform",
            }
        )

        // prefer the first fenced block; fall back to the raw text when there is none
        const transformed = res.fences?.[0]?.content ?? res.text
        if (transformed) patches[match[0]] = transformed
        console.log(`  ${match[0]} -> ${transformed ?? "?"}`)
    }

    // apply transforms
    const newContent = content.replace(
        patternRx,
        (match) => patches[match] ?? match
    )

    // save results if file content is modified
    if (content !== newContent)
        await workspace.writeText(file.filename, newContent)
}

Content Safety

The following measures are taken to ensure the safety of the generated content.

This script includes system prompts to prevent prompt injection and harmful content generation:
- system.safety_jailbreak
- system.safety_harmful_content

The transformed content is written back to the matched files in the workspace, which allows for a manual review before committing the changes.

Additional measures to further enhance safety would be to run a model with a safety filter or validate the message with a content safety service.

Refer to the Transparency Note for more information on content safety.