This tool compares the performance of several OCR/PDF reader services and uses Anthropic's Claude and OpenAI's Assistants API to parse the extracted text and structure it according to a provided schema.
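As a rough illustration of what comparing services can look like, the sketch below runs several extractors over the same file and records how long each takes. The `Extractor` type, the `BenchmarkResult` shape, and `compareExtractors` are hypothetical names made up for this example; they are not taken from this project's code, and each real service (Azure, Adobe, Google, AWS) would need its own wrapper.

```typescript
// Hypothetical shape: each OCR service is wrapped as a function from file bytes to text.
type Extractor = (file: Uint8Array) => Promise<string>;

interface BenchmarkResult {
  service: string;
  ok: boolean;
  elapsedMs: number;
  textLength: number;
  error?: string;
}

// Runs every extractor concurrently on the same file and times each one.
// A failing service is reported in its result instead of aborting the whole run.
async function compareExtractors(
  extractors: Record<string, Extractor>,
  file: Uint8Array
): Promise<BenchmarkResult[]> {
  const runs = Object.entries(extractors).map(
    async ([service, extract]): Promise<BenchmarkResult> => {
      const startedAt = Date.now();
      try {
        const text = await extract(file);
        return {
          service,
          ok: true,
          elapsedMs: Date.now() - startedAt,
          textLength: text.length,
        };
      } catch (err) {
        return {
          service,
          ok: false,
          elapsedMs: Date.now() - startedAt,
          textLength: 0,
          error: err instanceof Error ? err.message : String(err),
        };
      }
    }
  );
  return Promise.all(runs);
}
```

Elapsed time and text length are only crude proxies for quality; comparing the extracted text itself still requires reading the output.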
Before you begin, create a .env file in the root directory with the following environment variables:
AZURE_READ_ENDPOINT=your_azure_endpoint
AZURE_READ_KEY=your_azure_key
PDF_SERVICES_CLIENT_ID=your_adobe_client_id
PDF_SERVICES_CLIENT_SECRET=your_adobe_client_secret
GOOGLE_PROJECT_ID=your_project_id
GOOGLE_LOCATION=your_location
GOOGLE_PROCESSOR_ID=your_processor_id
GOOGLE_STATEMENT_PROCESSOR_ID=your_statement_processor_id
GOOGLE_KEY_FILE_PATH=path_to_your_service_account_key_file
AWS_REGION=your_aws_region
AWS_ACCESS_KEY_ID=your_aws_access_key
AWS_SECRET_ACCESS_KEY=your_aws_secret_key
OPENAI_API_KEY=your_openai_key
OPENAI_ASSISTANT_ID=your_assistant_id
ANTHROPIC_API_KEY=your_anthropic_key
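Because a missing key usually surfaces only as a failed request to one of the services, it can help to check the variables up front. The following is a minimal sketch, not part of this project: the variable names come from the list above, while `findMissingEnvVars` is a hypothetical helper name.

```typescript
// Variable names copied from the .env listing above.
const REQUIRED_ENV_VARS = [
  "AZURE_READ_ENDPOINT",
  "AZURE_READ_KEY",
  "PDF_SERVICES_CLIENT_ID",
  "PDF_SERVICES_CLIENT_SECRET",
  "GOOGLE_PROJECT_ID",
  "GOOGLE_LOCATION",
  "GOOGLE_PROCESSOR_ID",
  "GOOGLE_STATEMENT_PROCESSOR_ID",
  "GOOGLE_KEY_FILE_PATH",
  "AWS_REGION",
  "AWS_ACCESS_KEY_ID",
  "AWS_SECRET_ACCESS_KEY",
  "OPENAI_API_KEY",
  "OPENAI_ASSISTANT_ID",
  "ANTHROPIC_API_KEY",
] as const;

type EnvSource = Record<string, string | undefined>;

// Hypothetical helper: returns the names of required variables
// that are unset or contain only whitespace.
function findMissingEnvVars(env: EnvSource): string[] {
  return REQUIRED_ENV_VARS.filter((key) => {
    const value = env[key];
    return value === undefined || value.trim() === "";
  });
}

// Possible usage on the server side, where Next.js has loaded .env into process.env:
//   const missing = findMissingEnvVars(process.env);
//   if (missing.length > 0) throw new Error(`Missing env vars: ${missing.join(", ")}`);
```

If you only plan to try some of the services, trim the list to the variables those services need.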
- Create an Azure account.
- Create an Azure AI Document Intelligence resource.
- Retrieve the Endpoint and Key from the Keys and Endpoint section in the Azure portal.
- Register for Adobe PDF Services.
- Create a new project and retrieve the Client ID and Client Secret from your project credentials.
- Create a Google Cloud account.
- Enable the Document AI API.
- Create two processors: one using the general Document AI model and one using the specialized bank statement model. I tried the latter for parsing transaction statements, and the results were not very good. You can use any other processor for different types of documents.
- Set up a service account, download the credentials JSON, and place it in your project root.
- Create an AWS account.
- Create an IAM user with Textract access (see the AWS documentation).
- Generate the Access Key and Secret Key.
- Choose your preferred AWS Region.
- Sign up for OpenAI.
- Generate an API key.
- Create an Assistant:
  - Visit the OpenAI Platform.
  - Navigate to Assistants.
  - Create a new assistant and configure it for parsing OCR output.
  - Copy the Assistant ID.
- Sign up for Anthropic.
- Generate an API key.
- Note: This implementation uses the Claude 3 Haiku model, which keeps costs low while maintaining high accuracy for structured data extraction.
- Also note: An example of the response format I prompted Claude to return is in the app/api/services/claudeService.ts file. Change it to fit your needs.
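To give a sense of what adapting the response format involves, here is a minimal sketch of validating a model reply against a schema. The `Transaction` and `ParsedStatement` shapes and the `parseStatement` function are assumptions invented for this example; they are not the contents of claudeService.ts, so check that file for the format actually used.

```typescript
// Hypothetical schema for a bank statement; adjust the fields to your documents.
interface Transaction {
  date: string;
  description: string;
  amount: number;
}

interface ParsedStatement {
  transactions: Transaction[];
}

// Type guard that checks a single parsed entry against the hypothetical schema.
function isTransaction(value: unknown): value is Transaction {
  if (typeof value !== "object" || value === null) return false;
  const entry = value as Record<string, unknown>;
  return (
    typeof entry.date === "string" &&
    typeof entry.description === "string" &&
    typeof entry.amount === "number"
  );
}

// Extracts the outermost {...} span from the model's reply (models sometimes
// wrap JSON in extra prose), parses it, and validates it against the schema.
function parseStatement(modelText: string): ParsedStatement {
  const start = modelText.indexOf("{");
  const end = modelText.lastIndexOf("}");
  if (start === -1 || end <= start) {
    throw new Error("No JSON object found in model output");
  }
  const parsed: unknown = JSON.parse(modelText.slice(start, end + 1));
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("Model output is not a JSON object");
  }
  const transactions = (parsed as Record<string, unknown>).transactions;
  if (!Array.isArray(transactions) || !transactions.every(isTransaction)) {
    throw new Error("Model output does not match the expected schema");
  }
  return { transactions };
}
```

Validating the reply before using it means a malformed or partial response fails loudly instead of flowing into the rest of the app.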
This is a Next.js project, bootstrapped with create-next-app.
To run the development server, execute one of the following commands:
npm run dev
# or
yarn dev
# or
pnpm dev
# or
bun dev
Once the server is running, open http://localhost:3000 in your browser to view the app.
You can edit the page by modifying app/page.tsx. The page will auto-update as you make changes.
This project uses next/font to automatically optimize and load the Geist font.
To dive deeper into Next.js, explore the following resources:
- Next.js Documentation – Learn about Next.js features and APIs.
- Learn Next.js – An interactive tutorial to master Next.js.
Check out the Next.js GitHub repository for contributions, issues, and the community.
The easiest way to deploy your Next.js app is via the Vercel Platform.
For detailed deployment instructions, refer to the Next.js Deployment Documentation.
