Skip to content

Business card scanner

This guide shows how to use vision and image variables to scan business card information in a structured format.

Vision model

You will need access to a deployment of the OpenAI vision model. In this example, it is identifier by gpt-4o. Also set the maxTokens to 4000 to ensure the model can process the entire business card.

script({
...
model: "openai:gpt-4o",
maxTokens: 4000,
})

defImage

The defImage function can be used to input multiple files to the script. The non-image files will automatically be ignored, so you can typically pass env.files directly to defImages.

defImages(env.files)

Producing CSV

All together the script looks like the following:

scan-business-card.genai.mjs
script({
description: "Given an image of business card, extract the details to a csv file",
group: "vision",
model: "vision",
maxTokens: 4000,
})
defImages(env.files)
const outputName = path.join(path.dirname(env.files[0].filename), "card.csv")
$`You are a helpful assistant. You are given an image of a business
card. Extract the following information in ${outputName}:
Name, Address, Phone, Email, Company, Title, Website, Category of Business
If you can't infer the category, mark it as "Unknown"`

Using a schema

We can add data format validation by adding a schema for the business data rows.

const schema = defSchema("EXPENSE", {
type: "array",
items: {
type: "object",
properties: {
Date: { type: "string" },
Location: { type: "string" },
Total: { type: "number" },
Tax: { type: "number" },
Item: { type: "string" },
ExpenseCategory: { type: "string" },
Quantity: { type: "number" },
},
required: ["Date", "Location", "Total", "Tax", "Item", "Quantity"],
},
})

And the script above is adapter to use the schema instead of the CSV description.

scan-business-card.genai.mjs
script({
description:
"Given an image of a receipt, extract a csv of the receipt data",
group: "vision",
model: "vision",
maxTokens: 4000,
})
defImages(env.files)
const schema = defSchema("EXPENSE", {
type: "array",
items: {
type: "object",
properties: {
Date: { type: "string" },
Location: { type: "string" },
Total: { type: "number" },
Tax: { type: "number" },
Item: { type: "string" },
ExpenseCategory: { type: "string" },
Quantity: { type: "number" },
},
required: ["Date", "Location", "Total", "Tax", "Item", "Quantity"],
},
})
const outputName = path.join(path.dirname(env.files[0].filename), "items.csv")
$`You are a helpful assistant that is an expert in filing expense reports.
You have information from a receipt in RECEIPT and you need to put the data
in ${outputName} using the ${schema} schema.`