Rostlab/prot_bert · Hugging Face
Model Card for Zephyr 7B β
Zephyr is a series of language models trained to act as helpful assistants. Zephyr-7B-β is the second model in the series, a fine-tuned version of mistralai/Mistral-7B-v0.1 that was trained on a mix of publicly available, synthetic datasets using Direct Preference Optimization (DPO). We found that removi... See more
HuggingFaceH4/zephyr-7b-beta · Hugging Face
Nicolay Gerold added
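For readers unfamiliar with the DPO training mentioned in the card, the objective it optimizes (the standard form from the DPO paper, not quoted from the model card) is:

```latex
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta; \pi_{\mathrm{ref}})
= -\,\mathbb{E}_{(x,\,y_w,\,y_l)\sim\mathcal{D}}\left[
\log \sigma\left(
\beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
\;-\;
\beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
\right)\right]
```

where \(y_w\) and \(y_l\) are the preferred and rejected completions for prompt \(x\), \(\pi_{\mathrm{ref}}\) is the frozen reference model (here, the SFT'd Mistral-7B), and \(\beta\) controls how far the tuned policy \(\pi_\theta\) may drift from the reference.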
koboldcpp
🤗 Transformers
Hugging Face is an open-source platform and community for deep learning models for language, vision, audio, and multimodal tasks. They develop and maintain the transformers library, which simplifies downloading and training state-of-the-art deep learning models.
This is the best library if you have a background in m... See more
Moyi • 10 Ways To Run LLMs Locally And Which One Works Best For You
Nicolay Gerold added
Text embeddings are a critical piece of many pipelines, from search, to RAG, to vector databases and more. Most embedding models are BERT/Transformer-based and typically have short context lengths (e.g., 512 tokens). That's only about two pages of text, but documents can be very long: books, legal cases, TV screenplays, code repositories, etc. can be tens... See more
Long-Context Retrieval Models with Monarch Mixer
Nicolay Gerold added
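The 512-token ceiling mentioned above is why long documents are usually chunked before embedding, with the chunk vectors pooled afterwards. A minimal sketch of that workaround; the `embed_chunk` function here is a deterministic toy stand-in for a real BERT-style encoder, not any library's API:

```python
import numpy as np

def chunk_tokens(tokens, max_len=512):
    """Split a token sequence into windows no longer than max_len."""
    return [tokens[i:i + max_len] for i in range(0, len(tokens), max_len)]

def embed_chunk(chunk, dim=8):
    """Toy stand-in for a BERT-style encoder: deterministic unit vector."""
    rng = np.random.default_rng(abs(hash(tuple(chunk))) % (2**32))
    v = rng.standard_normal(dim)
    return v / np.linalg.norm(v)

def embed_document(tokens, max_len=512):
    """Embed each chunk separately, then mean-pool into one unit vector."""
    vecs = np.stack([embed_chunk(c) for c in chunk_tokens(tokens, max_len)])
    pooled = vecs.mean(axis=0)
    return pooled / np.linalg.norm(pooled)

doc = list(range(1300))       # a "document" of 1300 token ids
chunks = chunk_tokens(doc)    # 3 windows: 512 + 512 + 276 tokens
vec = embed_document(doc)     # one pooled, normalized document vector
```

Mean pooling is the crudest option; long-context models like the Monarch Mixer retriever linked above avoid the chunking step entirely by encoding the whole document at once.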
Protein
protein.xyz
Mo Shafieeha added
Although there are already many methods available for keyword generation (e.g., Rake, YAKE!, TF-IDF, etc.), I wanted to create a very basic, but powerful method for extracting keywords and keyphrases. This is where KeyBERT comes in, which uses BERT embeddings and simple cosine similarity to find the sub-phrases in a document that are the most simila... See more
MaartenGr • GitHub - MaartenGr/KeyBERT: Minimal keyword extraction with BERT
Nicolay Gerold added
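The idea KeyBERT implements fits in a few lines: embed the document and its candidate phrases, then rank phrases by cosine similarity to the document. Below is a sketch of that principle only; the toy bag-of-words vectors stand in for the BERT sentence embeddings KeyBERT actually uses, and `extract_keywords` here is an illustrative function, not KeyBERT's API:

```python
import numpy as np
from collections import Counter

def bow_vector(text, vocab):
    """Toy embedding: bag-of-words counts over a fixed vocabulary.
    In KeyBERT this role is played by BERT sentence embeddings."""
    counts = Counter(text.lower().split())
    return np.array([counts[w] for w in vocab], dtype=float)

def cosine(a, b):
    """Cosine similarity, returning 0.0 for zero vectors."""
    n = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / n) if n else 0.0

def extract_keywords(doc, candidates, top_n=2):
    """Rank candidate phrases by cosine similarity to the document."""
    vocab = sorted(set(doc.lower().split()))
    doc_vec = bow_vector(doc, vocab)
    scored = [(c, cosine(bow_vector(c, vocab), doc_vec)) for c in candidates]
    return sorted(scored, key=lambda kv: kv[1], reverse=True)[:top_n]

doc = "supervised learning trains models on labeled data"
candidates = ["labeled data", "supervised learning", "banana"]
top = extract_keywords(doc, candidates)  # off-topic "banana" scores 0
```

KeyBERT adds candidate generation (n-gram extraction), better embeddings, and diversification (MMR), but the ranking step is exactly this similarity comparison.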
Zephyr is a series of language models trained to act as helpful assistants. Zephyr-7B-α is the first model in the series, a fine-tuned version of mistralai/Mistral-7B-v0.1 that was trained on a mix of publicly available, synthetic datasets using Direct Preference Optimization (DPO). We found that removing the in-built alignment o... See more
HuggingFaceH4/zephyr-7b-alpha · Hugging Face
Nicolay Gerold added
Welcome to RAGatouille
Easily use and train state-of-the-art retrieval methods in any RAG pipeline. Designed for modularity and ease of use, backed by research.
The main motivation of RAGatouille is simple: bridging the gap between state-of-the-art research and alchemical RAG pipeline practices. RAG is complex, and there are many moving parts. To g... See more
GitHub - bclavie/RAGatouille: Easily use and train state of the art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-of-use, backed by research.
Nicolay Gerold added
ColBERT is a fast and accurate retrieval model, enabling scalable BERT-based search over large text collections in tens of milliseconds.
Figure 1: ColBERT's late interaction, efficiently scoring the fine-grained similarity between a query and a passage.
As Figure 1 illustrates, ColBERT relies on fine-grained contextual late interaction: it encod... See more
stanford-futuredata • GitHub - stanford-futuredata/ColBERT: Stanford ColBERT: state-of-the-art neural search (SIGIR'20, TACL'21, NeurIPS'21, NAACL'22, CIKM'22)
Nicolay Gerold added
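The "late interaction" the card describes can be written down directly: each query token embedding is matched against its best-matching passage token embedding (the MaxSim operator), and the per-token maxima are summed into the passage score. A minimal numpy sketch, with toy unit vectors standing in for ColBERT's BERT-derived token embeddings:

```python
import numpy as np

def normalize(rows):
    """L2-normalize each row so dot products are cosine similarities."""
    rows = np.asarray(rows, dtype=float)
    return rows / np.linalg.norm(rows, axis=1, keepdims=True)

def maxsim_score(query_emb, passage_emb):
    """ColBERT-style late interaction: for each query token, take the
    maximum similarity over all passage tokens, then sum the maxima."""
    sims = query_emb @ passage_emb.T        # (n_q, n_p) similarity matrix
    return float(sims.max(axis=1).sum())    # MaxSim per query token, summed

# Toy 2-token query and two candidate passages (4-dim embeddings).
q = normalize([[1, 0, 0, 0], [0, 1, 0, 0]])
p_good = normalize([[1, 0.1, 0, 0], [0, 1, 0.1, 0], [0, 0, 1, 0]])
p_bad  = normalize([[0, 0, 1, 0], [0, 0, 0, 1]])

score_good = maxsim_score(q, p_good)  # both query tokens find close matches
score_bad = maxsim_score(q, p_bad)    # no passage token matches the query
```

Because the passage encodings do not depend on the query, they can be computed and indexed offline; only the cheap MaxSim step runs at query time, which is what makes the tens-of-milliseconds latency possible.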