Business card scanner
This guide shows how to use vision and image variables to scan business card information in a structured format.
Vision model
You will need access to a deployment of the OpenAI vision model. In this example, it is identifier by gpt-4o
.
Also set the maxTokens
to 4000 to ensure the model can process the entire business card.
script({ ... model: "openai:gpt-4o", maxTokens: 4000,})
defImage
The defImage function can be used to input multiple files to the script.
The non-image files will automatically be ignored, so you can typically pass env.files directly to defImages
.
defImages(env.files)
Producing CSV
All together the script looks like the following:
script({ description: "Given an image of business card, extract the details to a csv file", group: "vision", model: "vision", maxTokens: 4000,})defImages(env.files)
const outputName = path.join(path.dirname(env.files[0].filename), "card.csv")
$`You are a helpful assistant. You are given an image of a businesscard. Extract the following information in ${outputName}:
Name, Address, Phone, Email, Company, Title, Website, Category of Business
If you can't infer the category, mark it as "Unknown"`
Using a schema
We can add data format validation by adding a schema for the business data rows.
const schema = defSchema("EXPENSE", { type: "array", items: { type: "object", properties: { Date: { type: "string" }, Location: { type: "string" }, Total: { type: "number" }, Tax: { type: "number" }, Item: { type: "string" }, ExpenseCategory: { type: "string" }, Quantity: { type: "number" }, }, required: ["Date", "Location", "Total", "Tax", "Item", "Quantity"], },})
And the script above is adapter to use the schema instead of the CSV description.
script({ description: "Given an image of a receipt, extract a csv of the receipt data", group: "vision", model: "vision", maxTokens: 4000,})defImages(env.files)const schema = defSchema("EXPENSE", { type: "array", items: { type: "object", properties: { Date: { type: "string" }, Location: { type: "string" }, Total: { type: "number" }, Tax: { type: "number" }, Item: { type: "string" }, ExpenseCategory: { type: "string" }, Quantity: { type: "number" }, }, required: ["Date", "Location", "Total", "Tax", "Item", "Quantity"], },})
const outputName = path.join(path.dirname(env.files[0].filename), "items.csv")
$`You are a helpful assistant that is an expert in filing expense reports.You have information from a receipt in RECEIPT and you need to put the datain ${outputName} using the ${schema} schema.`