Business card scanner

This guide shows how to use vision and image variables to extract business card information into a structured format.

Vision model

You will need access to a deployment of an OpenAI vision model. In this example, it is identified by gpt-4o. Also set maxTokens to 4000 so the model has enough room to return all of the details extracted from the business card.

script({
    ...
    model: "openai:gpt-4o",
    maxTokens: 4000,
})

`defImages`

The defImages function can be used to input multiple files to the script. Non-image files are automatically ignored, so you can typically pass env.files directly to defImages.

defImages(env.files)
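To make the "non-image files are ignored" behaviour concrete, here is a minimal standalone sketch of extension-based filtering. It is only an illustration: the helper names (isImageFile, selectImages) and the extension list are assumptions made for this example, not the actual defImages implementation.

```javascript
// Hypothetical helpers, for illustration only.
// The extension list is an assumption, not the list used by defImages.
const IMAGE_EXTENSIONS = [".png", ".jpg", ".jpeg", ".gif", ".webp"]

// True when the filename ends with a known image extension (case-insensitive).
function isImageFile(filename) {
    const lower = filename.toLowerCase()
    return IMAGE_EXTENSIONS.some((ext) => lower.endsWith(ext))
}

// Keep only the entries that look like images; everything else is dropped.
function selectImages(files) {
    return files.filter((file) => isImageFile(file.filename))
}

// Sample input shaped like env.files entries (only the filename is used here).
const sampleFiles = [
    { filename: "cards/alice.png" },
    { filename: "cards/notes.txt" },
    { filename: "cards/BOB.JPG" },
]
console.log(
    selectImages(sampleFiles)
        .map((file) => file.filename)
        .join(", ")
)
// prints: cards/alice.png, cards/BOB.JPG
```

In a real script you do not need this filter, since passing env.files to defImages is enough.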

Producing CSV

All together the script looks like the following:

script({
    description: "Given an image of business card, extract the details to a csv file",
    group: "vision",
    model: "vision",
    maxTokens: 4000,
})
defImages(env.files)

const outputName = path.join(path.dirname(env.files[0].filename), "card.csv")

$`You are a helpful assistant.  You are given an image of a business
card.  Extract the following information in ${outputName}:

   Name, Address, Phone, Email, Company, Title, Website, Category of Business

If you can't infer the category, mark it as "Unknown"`
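Business card fields such as addresses often contain commas, so it helps to know what a well-formed row in card.csv should look like. The following standalone sketch builds CSV text from extracted card records, using the column list from the prompt above. The function names (escapeCsvField, toCsv) and the sample record are made up for illustration; the model writes the file itself, and this code is not part of the script.

```javascript
// Column order taken from the prompt above.
const COLUMNS = [
    "Name",
    "Address",
    "Phone",
    "Email",
    "Company",
    "Title",
    "Website",
    "Category of Business",
]

// Quote a field when it contains a comma, quote, or line break,
// doubling any embedded quotes (the usual CSV convention).
function escapeCsvField(value) {
    const text = String(value ?? "")
    return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text
}

// Hypothetical helper: header line plus one line per card.
// A missing category becomes "Unknown", as the prompt requests;
// other missing fields are left empty.
function toCsv(cards) {
    const header = COLUMNS.join(",")
    const rows = cards.map((card) =>
        COLUMNS.map((column) => {
            const fallback = column === "Category of Business" ? "Unknown" : ""
            return escapeCsvField(card[column] ?? fallback)
        }).join(",")
    )
    return [header, ...rows].join("\n")
}

// Made-up sample record, not real data.
console.log(
    toCsv([{ Name: "Jane Doe", Address: "1 Main St, Springfield", Company: "Acme" }])
)
```

In the printed row, the address is wrapped in double quotes because it contains a comma, and the last column reads Unknown.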

Using a schema

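We can add data format validation by adding a schema for the data rows. The example below defines a schema for expense rows extracted from a receipt; the same approach applies to business card fields.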

const schema = defSchema("EXPENSE", {
    type: "array",
    items: {
        type: "object",
        properties: {
            Date: { type: "string" },
            Location: { type: "string" },
            Total: { type: "number" },
            Tax: { type: "number" },
            Item: { type: "string" },
            ExpenseCategory: { type: "string" },
            Quantity: { type: "number" },
        },
        required: ["Date", "Location", "Total", "Tax", "Item", "Quantity"],
    },
})

The script above is then adapted to use the schema instead of the CSV description.

script({
    description:
        "Given an image of a receipt, extract a csv of the receipt data",
    group: "vision",
    model: "vision",
    maxTokens: 4000,
})
defImages(env.files)
const schema = defSchema("EXPENSE", {
    type: "array",
    items: {
        type: "object",
        properties: {
            Date: { type: "string" },
            Location: { type: "string" },
            Total: { type: "number" },
            Tax: { type: "number" },
            Item: { type: "string" },
            ExpenseCategory: { type: "string" },
            Quantity: { type: "number" },
        },
        required: ["Date", "Location", "Total", "Tax", "Item", "Quantity"],
    },
})

const outputName = path.join(path.dirname(env.files[0].filename), "items.csv")

$`You are a helpful assistant that is an expert in filing expense reports.
You have information from a receipt in RECEIPT and you need to put the data
in ${outputName} using the ${schema} schema.`
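To show what the schema's required and type constraints mean for a single row, here is a minimal hand-rolled check in plain JavaScript. It is a sketch under simplifying assumptions: validateRow is a hypothetical helper written for this example, it only handles the string and number types used above, and it is neither a full JSON Schema validator nor the validation that defSchema performs.

```javascript
// The "items" part of the EXPENSE schema shown above.
const itemSchema = {
    type: "object",
    properties: {
        Date: { type: "string" },
        Location: { type: "string" },
        Total: { type: "number" },
        Tax: { type: "number" },
        Item: { type: "string" },
        ExpenseCategory: { type: "string" },
        Quantity: { type: "number" },
    },
    required: ["Date", "Location", "Total", "Tax", "Item", "Quantity"],
}

// Hypothetical helper: returns a list of problems, empty when the row is valid.
// It relies on typeof, which happens to match the "string" and "number" type names.
function validateRow(row, schema) {
    const errors = []
    for (const key of schema.required) {
        if (!(key in row)) errors.push(`missing required field "${key}"`)
    }
    for (const [key, value] of Object.entries(row)) {
        const expected = schema.properties[key]?.type
        if (expected && typeof value !== expected)
            errors.push(`field "${key}" should be a ${expected}`)
    }
    return errors
}

// Made-up sample rows, not real receipt data.
const goodRow = { Date: "2024-01-15", Location: "Cafe", Total: 12.5, Tax: 1.1, Item: "Lunch", Quantity: 1 }
const badRow = { Date: "2024-01-15", Location: "Cafe", Total: "12.50", Tax: 1.1, Item: "Lunch" }

console.log(validateRow(goodRow, itemSchema)) // no problems: empty array
console.log(validateRow(badRow, itemSchema)) // two problems: Quantity is missing, Total is a string
```

ExpenseCategory is absent from both sample rows without causing an error, because it is not listed in required.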