Business card scanner
This guide shows how to use vision and image variables to scan business cards and extract their information in a structured format.
Vision model
You will need access to a deployment of the OpenAI vision model. In this example, it is identified by gpt-4o.
Also set maxTokens to 4000 to ensure the model can process the entire business card.
defImages
The defImages function can be used to input multiple files to the script.
Non-image files are automatically ignored, so you can typically pass env.files directly to defImages.
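To make the filtering behaviour concrete, here is a rough sketch of what "ignoring non-image files" amounts to. It is only an illustration, not GenAIScript's implementation: the isImageFile helper and the extension list are hypothetical, and the sample entries merely imitate the shape of env.files (objects with a filename).

```javascript
// Illustration only: approximates "non-image files are ignored" with an
// extension check. The extension list and helper name are assumptions,
// not GenAIScript's actual rules.
const IMAGE_EXTENSIONS = [".png", ".jpg", ".jpeg", ".gif", ".webp"]

function isImageFile(filename) {
    const lower = filename.toLowerCase()
    return IMAGE_EXTENSIONS.some((ext) => lower.endsWith(ext))
}

// Sample input shaped like env.files entries (objects with a filename).
const files = [
    { filename: "cards/alice.png" },
    { filename: "cards/notes.txt" },
    { filename: "cards/BOB.JPG" },
]

// Keep only the entries that look like images.
const images = files.filter((file) => isImageFile(file.filename))
console.log(images.map((file) => file.filename))
```

In a real script none of this is needed, since the filtering happens inside the function itself; the sketch only shows why a mixed list of files is safe to pass.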
Producing CSV
All together the script looks like the following:
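The listing is not reproduced here, so the following is a minimal sketch of such a script rather than the original. Several details are assumptions: the column list, the card_details.csv output name, the openai:gpt-4o model identifier format, and the buildCsvPrompt helper (hypothetical). script, defImages, env and $ are globals provided by the GenAIScript runtime.

```javascript
// Columns to extract from each card (an assumed list; adjust as needed).
const COLUMNS = ["Name", "Address", "Phone", "Email", "Company", "Title", "Website"]

// Hypothetical helper: builds the instruction text sent along with the images.
function buildCsvPrompt(columns, outputName) {
    return [
        "You are given an image of a business card.",
        `Extract the following information as CSV in ${outputName}:`,
        columns.join(", "),
        'If a field cannot be read, write "Unknown".',
    ].join("\n")
}

const prompt = buildCsvPrompt(COLUMNS, "card_details.csv")

// script, defImages, env and $ exist only inside the GenAIScript runtime.
// The guard is there solely so this sketch also loads in plain Node.js.
if (typeof script === "function") {
    script({
        description: "Extract business card details to a CSV file",
        model: "openai:gpt-4o", // vision-capable deployment (assumed identifier)
        maxTokens: 4000,
    })
    defImages(env.files) // non-image files are ignored
    $`${prompt}`
}
```

Keeping the prompt text in a plain function makes it easy to inspect or adjust the column list without running the model.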
Using a schema
We can add data format validation by defining a schema for the business card data rows.
The script above is then adapted to use the schema instead of the CSV description.
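As a hedged sketch of what this could look like, the block below declares a JSON schema for the rows and references it from the prompt. The field names, the required list, the CARDS schema name, and the missingFields helper are all assumptions made for illustration; defSchema, defImages, env and $ are globals provided by the GenAIScript runtime.

```javascript
// JSON schema describing the extracted rows. The field names and the
// required list are assumptions chosen for illustration.
const cardRowsSchema = {
    type: "array",
    items: {
        type: "object",
        properties: {
            Name: { type: "string" },
            Phone: { type: "string" },
            Email: { type: "string" },
            Company: { type: "string" },
            Title: { type: "string" },
            Website: { type: "string" },
        },
        required: ["Name", "Email", "Company"],
    },
}

// Hypothetical local check: lists the required fields a row is missing.
function missingFields(row) {
    return cardRowsSchema.items.required.filter((key) => !(key in row))
}

// defSchema, defImages, env and $ exist only inside the GenAIScript runtime.
// The guard is there solely so this sketch also loads in plain Node.js.
if (typeof defSchema === "function") {
    const schema = defSchema("CARDS", cardRowsSchema)
    defImages(env.files)
    $`Extract the business card details from the images using the ${schema} schema.`
}
```

The missingFields helper is not part of the script's flow; it only shows how a row can be checked against the required fields locally.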