Browser Automation
GenAIScript provides a simplified API to interact with a headless browser using Playwright. This allows you to interact with web pages, scrape data, and automate tasks.
const page = await host.browse(
    "https://github.com/microsoft/genaiscript/blob/main/packages/sample/src/penguins.csv"
)
const table = page.locator('table[data-testid="csv-table"]')
const csv = parsers.HTMLToMarkdown(await table.innerHTML())
def("DATA", csv)
$`Analyze DATA.`
Installation
Playwright needs to install the browsers and dependencies before execution. GenAIScript will automatically try to install them if it fails to load the browser. However, you can also do it manually using the following command:
npx playwright install --with-deps chromium
If you see the following error message, you might have to install the dependencies manually.
╔═════════════════════════════════════════════════════════════════════════╗
║ Looks like Playwright Test or Playwright was just installed or updated. ║
║ Please run the following command to download new browsers:              ║
║                                                                         ║
║     yarn playwright install                                             ║
║                                                                         ║
║ <3 Playwright Team                                                      ║
╚═════════════════════════════════════════════════════════════════════════╝
host.browse
This function launches a new browser instance and optionally navigates to a page. The pages are automatically closed when the script ends.
const page = await host.browse(url)
incognito
Setting incognito: true creates an isolated, non-persistent browser context. Non-persistent browser contexts don’t write any browsing data to disk.
const page = await host.browse(url, { incognito: true })
recordVideo
Playwright can record a video of each page in the browser session. You can enable it by passing the recordVideo option.
Recording video also implies incognito mode as it requires creating a new browsing context.
const page = await host.browse(url, { recordVideo: true })
By default, the video size is 800x600, but you can change it by passing the width and height as the recordVideo option.
const page = await host.browse(url, {
    recordVideo: { width: 500, height: 500 },
})
The video will be saved in a temporary directory under .genaiscript/videos/<timestamp>/ once the page is closed.
You need to close the page before accessing the video file.
await page.close()
const videoPath = await page.video().path()
The video file can be further processed using video tools.
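The recording workflow described in this section can be sketched as a single helper. This is a minimal sketch, not part of GenAIScript: recordSession and its interact callback are hypothetical names, and host is passed in as a parameter (inside a script it would be the global host) so that the order of operations is explicit.

```javascript
// Hypothetical helper: record a browsing session and return the video file path.
// `host` is the GenAIScript host object; `interact` is any async work done on the page.
async function recordSession(host, url, interact) {
    const page = await host.browse(url, { recordVideo: true })
    try {
        await interact(page)
    } finally {
        // The video is only saved once the page is closed.
        await page.close()
    }
    // page.video() returns null when no video is attached to the page.
    const video = page.video()
    return video ? await video.path() : undefined
}
```

Closing the page in a finally block means the video is still written to disk when the interaction throws.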
connectOverCDP
You can connect to an endpoint that speaks the Chrome DevTools Protocol by passing the connectOverCDP option.
const page = await host.browse(url, { connectOverCDP: "endpointurl" })
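As an illustration, Chromium-based browsers expose a DevTools endpoint when launched with the --remote-debugging-port flag. The sketch below assumes such a browser is already listening on localhost; browseOverCDP is a hypothetical helper, and the default port 9222 is an assumption to adjust to your setup.

```javascript
// Hypothetical helper: open a page in an already running browser over CDP.
// Assumption: the browser was started with --remote-debugging-port=<port>.
async function browseOverCDP(host, url, port = 9222) {
    const endpoint = `http://localhost:${port}`
    return await host.browse(url, { connectOverCDP: endpoint })
}
```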
Locators
You can select elements on the page using the page.get... or page.locator methods.
// select by Aria roles
const button = page.getByRole("button")
// select by test-id
const table = page.getByTestId("csv-table")
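A few more locator styles, as a hedged sketch. getByRole with a name filter, getByText, and CSS selectors through page.locator are standard Playwright calls; the button label, heading text, and selector below are made-up values for illustration, and buildLocators is a hypothetical helper. Creating a locator does not query the page yet, so no await is needed here.

```javascript
// Hypothetical helper that builds a few locators for a page.
// The "Submit" label, "Penguins" text and "table tr" selector are illustrative only.
function buildLocators(page) {
    return {
        // ARIA role narrowed by accessible name
        submit: page.getByRole("button", { name: "Submit" }),
        // visible text
        heading: page.getByText("Penguins"),
        // CSS selector
        rows: page.locator("table tr"),
    }
}
```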
Element contents
You can access the innerHTML, innerText, value, and textContent of an element.
const table = page.getByTestId("csv-table")
const html = await table.innerHTML() // without the outer <table> tags!
const text = await table.innerText()
// Playwright reads the value of an input element with inputValue()
const value = await page.getByRole("textbox").inputValue()
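Playwright locator accessors are asynchronous and return promises, so each call needs an await. A minimal sketch that reads several representations of one element in parallel (readElement is a hypothetical helper name):

```javascript
// Hypothetical helper: read the HTML and text of an element in one go.
// `locator` is a Playwright Locator, e.g. page.getByTestId("csv-table").
async function readElement(locator) {
    const [html, text, content] = await Promise.all([
        locator.innerHTML(), // markup inside the element
        locator.innerText(), // rendered text
        locator.textContent(), // raw text content, may be null
    ])
    return { html, text, content }
}
```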
You can use the HTML parsers to convert the HTML to Markdown or plain text, or to extract tables as JSON.
const md = await HTML.convertToMarkdown(html)
const text = await HTML.convertToText(html)
const tables = await HTML.convertTablesToJSON(html)
Screenshot
You can take a screenshot of the current page or of a locator and pass it to a vision-enabled LLM (like gpt-4o) using defImages.
const screenshot = await page.screenshot() // returns a node.js Buffer
defImages(screenshot)
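To capture a single element rather than the whole page, a Playwright locator exposes the same screenshot method. A minimal sketch: captureElement is a hypothetical helper, and defImages is passed in as a parameter here only to keep the sketch self-contained (in a script it is a global).

```javascript
// Hypothetical helper: screenshot one element and register it as an image.
// `testId` identifies the element, e.g. "csv-table".
async function captureElement(page, testId, defImages) {
    // locator.screenshot() resolves to a Buffer with the image bytes
    const image = await page.getByTestId(testId).screenshot()
    defImages(image)
    return image
}
```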
(Advanced) Native Playwright APIs
The page instance returned is a native Playwright Page object. You can import playwright and cast the instance back to the native Playwright object.
import { Page } from "playwright"
const page = await host.browse(url) as Page
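Because the object is a regular Playwright Page, the rest of the Playwright API is available too. The sketch below uses a few standard Page and Locator calls (waitForLoadState, title, url, count); summarizePage is a hypothetical helper, written in plain JavaScript so no cast is needed.

```javascript
// Hypothetical helper: collect a few facts about a page using native Playwright calls.
async function summarizePage(page) {
    // wait until the DOM has been parsed
    await page.waitForLoadState("domcontentloaded")
    const title = await page.title()
    // count() resolves to the number of elements matching the locator
    const linkCount = await page.locator("a").count()
    // url() is synchronous
    return { title, linkCount, url: page.url() }
}
```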