LLMs
Zerox OCR
A dead simple way of OCR-ing a document for AI ingestion. Documents are meant to be a visual representation after all. With weird layouts, tables, charts, etc. The vision models just make sense!
The general logic:
- Pass in a PDF (URL or file buffer)
- Turn the PDF into a series of images
- Pass each image to GPT and ask nicely for Markdown
- Aggregate the responses and return Markdown
Tyler Maran • GitHub - getomni-ai/zerox: Zero shot pdf OCR with gpt-4o-mini
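A minimal sketch of that loop in Python, assuming the OpenAI SDK and pdf2image are available; the prompt wording and page handling here are illustrative, not Zerox's actual implementation:

```python
import base64
from io import BytesIO

from openai import OpenAI                 # official OpenAI SDK
from pdf2image import convert_from_path   # requires poppler installed

client = OpenAI()

def page_to_data_url(page) -> str:
    """Encode a rendered PDF page (PIL image) as a base64 data URL."""
    buf = BytesIO()
    page.save(buf, format="PNG")
    return "data:image/png;base64," + base64.b64encode(buf.getvalue()).decode()

def pdf_to_markdown(path: str, model: str = "gpt-4o-mini") -> str:
    pages = convert_from_path(path)        # 1. PDF -> list of page images
    chunks = []
    for page in pages:                     # 2. one vision call per page
        resp = client.chat.completions.create(
            model=model,
            messages=[{
                "role": "user",
                "content": [
                    {"type": "text", "text": "Convert this page to Markdown. Return only Markdown."},
                    {"type": "image_url", "image_url": {"url": page_to_data_url(page)}},
                ],
            }],
        )
        chunks.append(resp.choices[0].message.content)
    return "\n\n".join(chunks)             # 3. aggregate per-page Markdown
```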
Features
- 🤖 Multiple model integrations: OpenAI, transformers, llama.cpp, exllama2, mamba
- 🖍️ Simple and powerful prompting primitives based on the Jinja templating engine
- 🚄 Multiple choices, type constraints and dynamic stopping
- ⚡ Fast regex-structured generation
- 🔥 Fast JSON generation following a JSON schema
outlines-dev • GitHub - outlines-dev/outlines: Neuro Symbolic Text Generation
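A short sketch of the JSON-schema feature, based on the 0.x-era outlines API (`outlines.models.transformers` plus `outlines.generate.json`); exact names vary between releases, and the model and schema are just examples:

```python
from enum import Enum
from pydantic import BaseModel
import outlines

class Armor(str, Enum):
    leather = "leather"
    chainmail = "chainmail"
    plate = "plate"

class Character(BaseModel):
    name: str
    age: int
    armor: Armor

# Load a Hugging Face model through outlines' transformers integration.
model = outlines.models.transformers("mistralai/Mistral-7B-Instruct-v0.2")

# Decoding is constrained so the output always parses into Character.
generator = outlines.generate.json(model, Character)
character = generator("Describe a fantasy character in JSON.")
print(character)  # a validated Character instance
```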
Deploying a Generative AI model requires more than a VM with a GPU. It normally includes:
- Container Service: Most often Kubernetes to run LLM serving solutions like Hugging Face Text Generation Inference or vLLM.
- Compute Resources: GPUs for running models, CPUs for management services
- Networking and DNS: Routing traffic to the appropriate
Understanding the Cost of Generative AI Models in Production
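Once vLLM or TGI is running behind that stack, clients typically talk to it over an OpenAI-compatible HTTP route; a sketch assuming a vLLM server is already exposed behind internal DNS (the URL and model name are placeholders):

```python
from openai import OpenAI

# vLLM (and recent TGI versions) expose an OpenAI-compatible endpoint,
# so the stock client works once DNS routes to the serving pods.
client = OpenAI(
    base_url="http://llm.internal.example.com/v1",  # placeholder service URL
    api_key="not-needed-for-internal-service",
)

resp = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3-8B-Instruct",  # whatever the server loaded
    messages=[{"role": "user", "content": "Summarize our GPU utilization options."}],
    max_tokens=200,
)
print(resp.choices[0].message.content)
```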
The quality of the dataset is 95% of everything. The other 5% is not ruining it with bad parameters.
After 500+ LoRAs made, here is the secret
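In practice that remaining 5% is mostly a handful of adapter hyperparameters; a sketch of a conservative starting point using Hugging Face peft (the rank, alpha, and target modules are illustrative assumptions, not the post's recipe):

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")

# Illustrative defaults: rank, alpha, and dropout are the usual knobs that
# can quietly ruin a run if set carelessly.
config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, config)
model.print_trainable_parameters()  # sanity-check how small the adapter is
```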
- Query the RAG anyway and let the LLM itself choose whether to use the RAG context or its built-in knowledge
- Query the RAG but only provide the result to the LLM if it meets some level of relevancy (i.e. embedding distance) to the question
- Run the LLM both on its own and with the RAG response, then use a heuristic (or another LLM) to pick the best answer
r/LocalLLaMA - Reddit
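A minimal sketch of the second option (relevance-gated context) using sentence-transformers for the embedding distance; the threshold value and the prompt wording are placeholder assumptions:

```python
from sentence_transformers import SentenceTransformer, util

embedder = SentenceTransformer("all-MiniLM-L6-v2")
SIMILARITY_THRESHOLD = 0.45  # placeholder; tune on your own data

def build_prompt(question: str, retrieved_chunks: list[str]) -> str:
    """Attach RAG context only when it is actually relevant to the question."""
    if not retrieved_chunks:
        return question

    q_emb = embedder.encode(question, convert_to_tensor=True)
    c_embs = embedder.encode(retrieved_chunks, convert_to_tensor=True)
    scores = util.cos_sim(q_emb, c_embs)[0]

    relevant = [
        chunk for chunk, score in zip(retrieved_chunks, scores)
        if score.item() >= SIMILARITY_THRESHOLD
    ]
    if not relevant:
        # Fall back to the model's built-in knowledge.
        return question

    context = "\n\n".join(relevant)
    return f"Use the context if it helps.\n\nContext:\n{context}\n\nQuestion: {question}"
```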
The xAI PromptIDE is an integrated development environment for prompt engineering and interpretability research. It accelerates prompt engineering through an SDK that allows implementing complex prompting techniques and rich analytics that visualize the network's outputs. We use it heavily in our continuous development of Grok.
PromptIDE
What’s the best way for an end user to organize and explore millions of latent space features?
I’ve found tens of thousands of interpretable features in my experiments, and frontier labs have demonstrated results with a thousand times more features in production-scale models. No doubt, as interpretability techniques advance, we’ll see feature maps...
Shortwave — rajhesh.panchanadhan@gmail.com [Gmail alternative]
Study finds RLHF reduces LLM creativity and output variety: A new research paper posted in /r/LocalLLaMA shows that while alignment techniques like RLHF reduce toxic and biased content, they also limit the creativity of large language models, even in contexts unrelated to safety.