• from Tips for probabilistic software - jxnl.co by Jason Liu

    Nicolay Gerold added 5d ago

  • Large variety of ready-to-use LLM evaluation metrics (all with explanations) powered by ANY LLM of your choice, statistical methods, or NLP models that run locally on your machine:

    • G-Eval

    • Summarization

    • Answer Relevancy

    • Faithfulness

    • Contextual Recall

    • Contextual Precision

    • RAGAS

    • Hallucination

    • Toxicity

    • Bias

    • etc.
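To make the "statistical methods … that run locally" part of the list above concrete, here is a minimal sketch of a local, LLM-free metric. This is NOT deepeval's implementation, just a hypothetical illustration: it scores faithfulness as the fraction of answer tokens that also appear in the retrieval context.

```python
# Minimal sketch of a local statistical metric -- NOT deepeval's code.
# Faithfulness here is approximated as lexical overlap: the fraction
# of answer tokens that are grounded in the retrieval context.
# Runs entirely on your machine, no LLM calls needed.
import re


def tokenize(text: str) -> set[str]:
    """Lowercase word tokens, ignoring punctuation."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def faithfulness_score(answer: str, context: str) -> float:
    """Fraction of answer tokens found in the context (0.0 to 1.0)."""
    answer_tokens = tokenize(answer)
    if not answer_tokens:
        return 1.0  # an empty answer cannot contradict the context
    context_tokens = tokenize(context)
    return len(answer_tokens & context_tokens) / len(answer_tokens)


context = "The Eiffel Tower is in Paris and was completed in 1889."
grounded = faithfulness_score("The Eiffel Tower is in Paris.", context)
ungrounded = faithfulness_score("The tower is in Berlin.", context)
print(grounded > ungrounded)  # a grounded answer scores higher
```

Real frameworks like deepeval layer LLM judges or NLP models on top of heuristics like this; the point is only that some metrics need no model at all.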

from GitHub - confident-ai/deepeval: The LLM Evaluation Framework

Nicolay Gerold added 17d ago

  • from Beyond customization: build tools that grow with us | thesephist.com

    Nicolay Gerold added 21d ago

  • from Legal documents are pushing text interfaces forward | thesephist.com

    Nicolay Gerold added 21d ago

  • from Navigate, don't search | thesephist.com by Linus Lee

    Nicolay Gerold added 21d ago

  • from GitHub - sqrkl/lm-evaluation-harness: A framework for few-shot evaluation of language models.

    Nicolay Gerold added 23d ago

  • from Shortwave — rajhesh.panchanadhan@gmail.com [Gmail alternative]

    Nicolay Gerold added 25d ago

  • from Shortwave — rajhesh.panchanadhan@gmail.com [Gmail alternative]

    Nicolay Gerold added 1mo ago
