GitHub - getomni-ai/zerox: Zero shot pdf OCR with gpt-4o-mini
- Zerox OCR
A dead simple way of OCR-ing a document for AI ingestion. Documents are meant to be a visual representation after all, with weird layouts, tables, charts, etc., so vision models just make sense!
The general logic:
- Pass in a PDF (URL or file buffer)
- Turn the PDF into a series of images
- Pass each image to GPT and ask nicely for Markdown
- Aggregat
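
The steps above can be sketched roughly as follows. This is not the zerox API: `ocr_pdf_to_markdown`, `render_pages`, and `image_to_markdown` are hypothetical names, and the PDF renderer and the vision-model call are passed in as plain callables so the sketch makes no claims about any real library. The last step is cut off in the excerpt; the sketch assumes it means combining the per-page Markdown into one document.

```python
from typing import Callable, Iterable, List


def ocr_pdf_to_markdown(
    pdf: bytes,
    render_pages: Callable[[bytes], Iterable[bytes]],
    image_to_markdown: Callable[[bytes], str],
    separator: str = "\n\n",
) -> str:
    """Hypothetical pipeline: PDF bytes -> page images -> Markdown per page -> one string.

    render_pages:      assumed helper that turns a PDF into one image per page.
    image_to_markdown: assumed helper that sends one image to a vision model
                       and returns the Markdown it produces.
    """
    pages: List[str] = []
    for image in render_pages(pdf):
        # One model call per page; strip stray whitespace around each result.
        pages.append(image_to_markdown(image).strip())
    # Assumed final step: join the per-page Markdown in page order.
    return separator.join(pages)


if __name__ == "__main__":
    # Stand-in stubs so the sketch runs without a PDF library or a model.
    def fake_render(pdf: bytes) -> List[bytes]:
        return [b"page-1", b"page-2"]

    def fake_model(image: bytes) -> str:
        return f"## {image.decode()}\n"

    print(ocr_pdf_to_markdown(b"%PDF-", fake_render, fake_model))
```

Passing the two stages in as callables keeps the control flow visible and lets either stage be swapped, for example a different renderer or a different vision model, without touching the loop.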
from GitHub - getomni-ai/zerox: Zero shot pdf OCR with gpt-4o-mini by Tyler Maran