Data Loading

Bill Mill notes.billmill.org

Nicolay Gerold added 4mo

GitHub - VikParuchuri/surya: OCR, layout analysis, reading order, line detection in 90+ languages

Nicolay Gerold added 5mo

Stability and scalability for search

Nicolay Gerold added 6mo

tensorlakeai GitHub - tensorlakeai/indexify: A scalable realtime and continuous indexing engine for Unstructured Data to build Generative AI Applications

Nicolay Gerold added 6mo

GitHub - Stirling-Tools/Stirling-PDF: #1 Locally hosted web application that allows you to perform various operations on PDF files

Nicolay Gerold added 7mo

Bap Our 5 favourite open-source customer data platforms

Nicolay Gerold added 7mo

jina-ai jina-ai/reader: Convert any URL to an LLM-friendly input ... - GitHub

Nicolay Gerold added 7mo

Filimoa GitHub - Filimoa/open-parse: Improved file parsing for LLM’s

Nicolay Gerold added 7mo

CambioML GitHub - CambioML/uniflow-llm-based-pdf-extraction-text-cleaning-data-clustering: LLM-based text extraction from unstructured data like PDFs, Words and HTMLs. Transform and cluster the text into your desired format. Less information loss, more interpretation, and faster...

Nicolay Gerold added 8mo