Sublime
An inspiration engine for ideas
PDFs are satan’s file format.
Almost everyone that builds RAG needs to deal with them - and it sucks.
Solutions on the market are either too slow, too expensive or not OSS.
It should be easier. Which is why we’re open sourcing https://t.co/0gCZxzbkWu
Ishaan Kapoor • x.com
DataTrove
DataTrove is a library to process, filter and deduplicate text data at a very large scale. It provides a set of prebuilt commonly used processing blocks with a framework to easily add custom functionality.
DataTrove processing pipelines are platform-agnostic, running out of the box locally or on a slurm cluster. Its (relatively) low memory...
huggingface • GitHub - huggingface/datatrove: Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.
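To make the "processing blocks" idea concrete, here is a minimal sketch in plain Python. It does not use the DataTrove API: the names `normalize`, `drop_short`, `dedupe_exact`, and `run_pipeline` are hypothetical and exist only for illustration. It assumes each block is a function that takes a stream of documents and yields a stream, so a custom step is just one more function in the list.

```python
import hashlib
from typing import Callable, Iterable, Iterator, List

# Assumption for this sketch: a "block" maps a stream of documents to a stream.
Block = Callable[[Iterable[str]], Iterator[str]]


def normalize(docs: Iterable[str]) -> Iterator[str]:
    """Processing block: collapse runs of whitespace in each document."""
    for doc in docs:
        yield " ".join(doc.split())


def drop_short(min_words: int) -> Block:
    """Filter block factory: keep documents with at least min_words words."""
    def block(docs: Iterable[str]) -> Iterator[str]:
        for doc in docs:
            if len(doc.split()) >= min_words:
                yield doc
    return block


def dedupe_exact(docs: Iterable[str]) -> Iterator[str]:
    """Deduplication block: drop case-insensitive exact duplicates."""
    seen = set()
    for doc in docs:
        key = hashlib.sha1(doc.lower().encode("utf-8")).hexdigest()
        if key not in seen:
            seen.add(key)
            yield doc


def run_pipeline(docs: Iterable[str], blocks: List[Block]) -> List[str]:
    """Chain the blocks lazily, then materialize the final stream."""
    stream: Iterable[str] = docs
    for block in blocks:
        stream = block(stream)
    return list(stream)


if __name__ == "__main__":
    raw = [
        "Hello   world again",
        "hello world again",
        "too short",
        "A  custom block fits here",
    ]
    print(run_pipeline(raw, [normalize, drop_short(3), dedupe_exact]))
```

Because the blocks are generators, documents flow through one at a time; only the deduplication step keeps state (its set of hashes). The real library's readers, filters, and executors are documented in the repository linked above.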
It is forbidden to mix a small amount of [inferior] produce into a large amount of superior produce so that [the inferior produce] will not be recognized and [then] sell the entire quantity under the presumption that it is [all] of higher quality.
Sichos In English • Shulchan Aruch of Rabbi Shneur Zalman of Liadi, Volume 12: Choshen Mishpat
QUESO BUBI
The recipe for a delicious and unique cheese called Queso Bubi, made with cream cheese, goat cheese, marmalade, nuts, and other ingredients.
The soupy mass of crushed grapes, juice, skins, pulp, seeds, and possibly stems is called the must.