GitHub - getomni-ai/zerox: Zero shot pdf OCR with gpt-4o-mini

Extract reliable PDF text at 1/32 GPT-4o cost using a 7B VLM (Fully open-source)
The Allen Institute for AI introduced olmOCR, an open-source OCR toolkit that extracts plain text from PDFs at over 3000 tokens/s for about 190 USD per million pages, or roughly 1/32 the cost of GPT-4o, which is significant for large-scale document...
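To make the quoted cost figures concrete, here is a minimal arithmetic sketch. It uses only the two numbers from the post (about 190 USD per million pages, and about 1/32 the cost of GPT-4o). The GPT-4o price it prints is merely what those two figures imply, not a measured or published price, and the 250,000-page corpus is a hypothetical example.

```python
# Figures quoted in the post; both are approximate.
OLMOCR_USD_PER_MILLION_PAGES = 190.0
GPT4O_COST_RATIO = 32  # olmOCR is described as costing about 1/32 as much as GPT-4o


def cost_for_pages(pages: int, usd_per_million: float) -> float:
    """Scale a per-million-pages price to an arbitrary page count."""
    return pages * usd_per_million / 1_000_000


# Implied GPT-4o price, derived only from the two quoted figures (an inference).
implied_gpt4o_usd_per_million = OLMOCR_USD_PER_MILLION_PAGES * GPT4O_COST_RATIO

pages = 250_000  # hypothetical corpus size, chosen for illustration
olmocr_cost = cost_for_pages(pages, OLMOCR_USD_PER_MILLION_PAGES)
gpt4o_cost = cost_for_pages(pages, implied_gpt4o_usd_per_million)

print(f"olmOCR: {olmocr_cost:.2f} USD for {pages:,} pages")
print(f"GPT-4o (implied): {gpt4o_cost:.2f} USD for {pages:,} pages")
```

Under these assumptions the gap scales linearly with page count, so the 32x ratio holds at any corpus size.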
PDF parsing is still painful because LLMs reorder text in complex layouts, break tables across pages, and fail on graphs or images.
💡 Testing the new open-source OCRFlux model, and the results are really good for a change.
OCRFlux is a multimodal, LLM-based toolkit for converting PDFs...
Rohan Paul, x.com