Models
Text embeddings are a critical piece of many pipelines, from search, to RAG, to vector databases and more. Most embedding models are BERT/Transformer-based and typically have short context lengths (e.g., 512 tokens). That's only about two pages of text, but documents can be very long – books, legal cases, TV screenplays, code repositories, etc. can be tens…
Long-Context Retrieval Models with Monarch Mixer
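To make the context-length constraint concrete, here is a minimal retrieval sketch: with a typical 512-token embedding model, a long document has to be chunked before it can be embedded, and retrieval then ranks chunks by cosine similarity against the query. The model name, file path, and chunk size are illustrative assumptions, not recommendations from the post.

```python
# Minimal chunk-and-retrieve sketch for a short-context embedding model.
# The model name, input file, and chunk size are placeholder assumptions.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # typical short-context encoder

document = open("long_document.txt").read()
# Naive chunking: a 512-token limit forces splitting long documents.
chunks = [document[i:i + 2000] for i in range(0, len(document), 2000)]

query_emb = model.encode("What did the court decide about liability?")
chunk_embs = model.encode(chunks)

scores = util.cos_sim(query_emb, chunk_embs)[0]   # one score per chunk
best = int(scores.argmax())
print(f"best chunk #{best}, score {scores[best]:.3f}")
```

A long-context retrieval model removes the need for this chunking step, since whole documents fit in a single forward pass.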
Stable Beluga 2
Use Stable Chat (Research Preview) to test Stability AI's best language models for free
Model Description
Stable Beluga 2 is a Llama 2 70B model fine-tuned on an Orca-style dataset.
stabilityai/StableBeluga2 · Hugging Face
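For context, a minimal generation sketch using the standard Hugging Face transformers API. The "### System / ### User / ### Assistant" prompt layout follows the Orca-style template described on the model card, but verify the exact template there; the sampling settings are arbitrary.

```python
# Minimal Stable Beluga 2 generation sketch (assumes enough GPU memory for a
# 70B model in fp16). Prompt layout follows the card's Orca-style template.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("stabilityai/StableBeluga2", use_fast=False)
model = AutoModelForCausalLM.from_pretrained(
    "stabilityai/StableBeluga2", torch_dtype=torch.float16, device_map="auto"
)

prompt = (
    "### System:\nYou are a helpful, harmless assistant.\n\n"
    "### User:\nExplain what an Orca-style dataset is in one sentence.\n\n"
    "### Assistant:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128, do_sample=True, top_p=0.95)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```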
DeepSeek Coder comprises a series of code language models trained from scratch on a mix of 87% code and 13% natural language in both English and Chinese, with each model pre-trained on 2T tokens. We provide various sizes of the code model, ranging from 1B to 33B versions. Each model is pre-trained on a repo-level code corpus by employing a window size of 16K…
DeepSeek Coder
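A hedged completion sketch, assuming the standard transformers API; the checkpoint name below is just one of the published sizes, so pick whichever fits your hardware.

```python
# Minimal code-completion sketch with one of the smaller published checkpoints.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "deepseek-ai/deepseek-coder-1.3b-base"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(
    name, torch_dtype=torch.bfloat16, device_map="auto"
)

prefix = "# Python: in-place quicksort\ndef quicksort(arr, lo=0, hi=None):\n"
inputs = tokenizer(prefix, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```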
LLaVA v1.5 is a new open-source multimodal model stepping onto the scene as a contender against GPT-4 with multimodal capabilities. It uses a simple projection matrix to connect the pre-trained CLIP ViT-L/14 vision encoder with the Vicuna LLM, resulting in a robust model that can handle images and text. The model is trained in two stages: first, the projection matrix is updated…
This AI newsletter is all you need #68
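To illustrate the "simple projection matrix" idea, here is a toy sketch: a learned linear layer maps frozen CLIP patch features into the LLM's token-embedding space so the image can be consumed as soft tokens. The dimensions are assumptions for illustration, not the released LLaVA configuration.

```python
# Illustrative vision-to-LLM projection: CLIP patch features are mapped into
# the language model's embedding space. Dimensions are assumed for the sketch.
import torch
import torch.nn as nn

class VisionProjector(nn.Module):
    def __init__(self, vision_dim: int = 1024, llm_dim: int = 4096):
        super().__init__()
        self.proj = nn.Linear(vision_dim, llm_dim)

    def forward(self, patch_features: torch.Tensor) -> torch.Tensor:
        # patch_features: (batch, num_patches, vision_dim) from CLIP ViT-L/14
        return self.proj(patch_features)  # (batch, num_patches, llm_dim)

projector = VisionProjector()
fake_clip_features = torch.randn(1, 576, 1024)   # 24x24 patches at 336px
image_tokens = projector(fake_clip_features)
print(image_tokens.shape)  # torch.Size([1, 576, 4096])
```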
We are excited to release the first version of our multimodal assistant Yasa-1, a language assistant with visual and auditory sensors that can take actions via code execution.
We trained Yasa-1 from scratch, including pretraining the base models from the ground up, aligning them, and heavily optimizing both our training and serving infrastructure.
Announcing our Multimodal AI Assistant - Reka AI
What is Pandalyst
Pandalyst is a general large language model specifically trained to process and analyze data using the pandas library.
How is Pandalyst
Pandalyst has strong generalization capabilities for data tables in different fields and different data analysis needs.
Why is Pandalyst
Pandalyst is open source and free to use, and its small parameter…
pipizhao/Pandalyst-7B-V1.2 · Hugging Face
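A hedged sketch of how a pandas code-generation model like this is typically used: describe the table schema and the analysis request, let the model write pandas code, then run it against the DataFrame. The prompt wording below is an assumption for illustration; the model card documents the exact template.

```python
# Hedged usage sketch for a pandas-codegen model. Prompt wording is assumed;
# consult the model card for the real template before relying on it.
import pandas as pd
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "pipizhao/Pandalyst-7B-V1.2"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name, device_map="auto")

df = pd.DataFrame({"region": ["EU", "US", "EU"], "sales": [120, 340, 90]})
prompt = (
    f"You are a data analyst. The DataFrame `df` has columns {list(df.columns)}.\n"
    "Write pandas code that computes total sales per region.\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```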
The text embedding set trained by Jina AI's Finetuner team.
Intended Usage & Model Info
jina-embeddings-v2-base-en is an English, monolingual embedding model supporting an 8192-token sequence length.
It is based on a BERT architecture (JinaBERT) that supports the symmetric bidirectional variant of ALiBi to allow longer sequence lengths.
The backbone jina-bert-v2-…
jinaai/jina-embeddings-v2-base-en · Hugging Face
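A minimal long-context embedding sketch, following the usage shown on the model card: the encode() helper comes from the model's custom code loaded via trust_remote_code, and max_length can go up to 8192 tokens.

```python
# Minimal long-context embedding sketch; encode() is provided by the model's
# custom code (hence trust_remote_code=True).
from numpy.linalg import norm
from transformers import AutoModel

model = AutoModel.from_pretrained(
    "jinaai/jina-embeddings-v2-base-en", trust_remote_code=True
)

docs = [
    "A very long legal brief ... " * 500,   # thousands of tokens in one pass
    "A short product description.",
]
embs = model.encode(docs, max_length=8192)

cos = embs[0] @ embs[1] / (norm(embs[0]) * norm(embs[1]))
print(f"cosine similarity: {cos:.3f}")
```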
Glaive-coder-7b
Glaive-coder-7b is a 7B-parameter code model trained on a dataset of ~140k programming-related problems and solutions generated from Glaive's synthetic data generation platform.
It is fine-tuned from the CodeLlama-7b model.
Usage:
The model is trained to act as a code assistant, and can do both single instruction following and multi-turn…
glaiveai/glaive-coder-7b · Hugging Face
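A hedged multi-turn sketch, assuming the standard transformers API; the plain-text turn formatting below is an assumption for illustration, and the model card documents the actual instruction format the model was fine-tuned with.

```python
# Hedged multi-turn sketch: ask for a function, then follow up with a change.
# The "User:/Assistant:" formatting is assumed; use the card's real template.
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "glaiveai/glaive-coder-7b"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name, device_map="auto")

def ask(conversation: str) -> str:
    inputs = tokenizer(conversation, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=256)
    # Return only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)

turn1 = "User: Write a Python function that checks if a string is a palindrome.\nAssistant:"
reply1 = ask(turn1)

turn2 = f"{turn1} {reply1}\nUser: Now make it ignore punctuation and spaces.\nAssistant:"
print(ask(turn2))
```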
pair-preference-model-LLaMA3-8B by RLHFlow: a really strong reward model, trained to take in two inputs at once; it is the top open reward model on RewardBench (beating one of Cohere's).
DeepSeek-V2 by deepseek-ai (21B active, 236B total params): another strong MoE base model from the DeepSeek team. Some people are questioning the very high MMLU score…
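A conceptual sketch of how a "two inputs at once" preference model can be queried: both candidate responses are packed into one prompt and the next-token logits for the labels "A" and "B" are compared. The prompt wording and label tokens here are assumptions for illustration; the RLHFlow model card documents the actual template.

```python
# Conceptual pairwise-preference sketch: compare next-token logits for the
# labels "A" and "B". Prompt wording and label tokens are assumed, not the
# model's documented template.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "RLHFlow/pair-preference-model-LLaMA3-8B"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(
    name, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = (
    "Question: What is the capital of France?\n"
    "Response A: Paris.\n"
    "Response B: I think it might be Lyon.\n"
    "Which response is better? Answer A or B:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
logits = model(**inputs).logits[0, -1]
id_a = tokenizer.convert_tokens_to_ids("A")
id_b = tokenizer.convert_tokens_to_ids("B")
prob_a = torch.softmax(logits[[id_a, id_b]], dim=-1)[0].item()
print(f"P(Response A preferred) ~= {prob_a:.3f}")
```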