Supported Models

docs.perplexity.ai


Llama 2 - Resource Overview - Meta AI

ai.meta.com

What We Learned From a Year of Building With LLMs

Bryan Bischof oreilly.com

Advances in data processing techniques. You can increase context length in two ways. First, you can train the model with longer context lengths. That’s difficult because it’s much more computationally expensive, and it’s hard to find datasets with long context lengths (most documents in CommonCrawl have fewer than 2,000 tokens).

The second, more com...

Shortwave — rajhesh.panchanadhan@gmail.com [Gmail alternative]

Model Description

NSQL is a family of autoregressive open-source large foundation models (FMs) designed specifically for SQL generation tasks.

In this repository we are introducing a new member of NSQL, NSQL-Llama-2-7B. It's based on Meta's original Llama-2 7B model and further pre-trained on a dataset of general SQL queries and then fine-tuned on a ...

NumbersStation/nsql-llama-2-7B · Hugging Face

Stable Beluga 2

Use Stable Chat (Research Preview) to test Stability AI's best language models for free

Model Description

Stable Beluga 2 is a Llama2 70B model fine-tuned on an Orca-style dataset.

stabilityai/StableBeluga2 · Hugging Face

General-purpose models

1.1B: TinyDolphin 2.8 1.1B. Takes about 700MB RAM and tested on my Pi 4 with 2 gigs of RAM. Hallucinates a lot, but works for basic conversation.

2.7B: Dolphin 2.6 Phi-2. Takes over 2GB RAM and tested on my 3GB 32-bit phone via llama.cpp on Termux.

7B: Nous Hermes Mistral 7B DPO. Takes about 4-5GB RAM depending on contex

r/LocalLLaMA - Reddit
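
The RAM figures quoted in the post above can be sanity-checked with back-of-the-envelope arithmetic: weight memory is roughly parameter count times bits per weight. The sketch below is only an illustration. The function name `estimate_weight_ram_gb` is hypothetical, and the bits-per-weight values are assumptions (the post does not say which quantization was used). It also ignores the context cache and runtime overhead, which is why real usage grows with context.

```python
def estimate_weight_ram_gb(n_params_billion: float, bits_per_weight: float) -> float:
    """Approximate RAM for the model weights alone, in GB (10^9 bytes).

    Assumption: every weight is stored at the same average bit width.
    Context cache and runtime overhead are NOT included.
    """
    # (params * 1e9) * (bits / 8) bytes, divided by 1e9 bytes per GB
    return n_params_billion * bits_per_weight / 8


if __name__ == "__main__":
    # Bit widths below are illustrative guesses, not taken from the post.
    for name, params_b, bits in [("1.1B", 1.1, 5), ("2.7B", 2.7, 6), ("7B", 7.0, 4.5)]:
        print(f"{name}: ~{estimate_weight_ram_gb(params_b, bits):.2f} GB for weights")
```

Under these assumed bit widths the estimates land in the same range as the numbers reported above, with the remainder plausibly going to context and overhead.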

TL;DR

LLMLingua utilizes a compact, well-trained language model (e.g., GPT2-small, LLaMA-7B) to identify and remove non-essential tokens in prompts. This approach enables efficient inference with large language models (LLMs), achieving up to 20x compression with minimal performance loss.

LLMLingua: Compressing Prompts for Accelerated Inference of La

microsoft • GitHub - microsoft/LLMLingua: To speed up LLMs' inference and enhance LLM's perceive of key information, compress the prompt and KV-Cache, which achieves up to 20x compression with minimal performance loss.
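
To make the idea of "removing non-essential tokens" concrete, here is a toy sketch in Python. It is not LLMLingua's algorithm or API: LLMLingua scores tokens with a small language model, whereas this stand-in uses a smoothed unigram frequency table built from a reference text. The names `compress_prompt`, `reference_corpus`, and `keep_ratio` are hypothetical. The shared principle is that tokens the model finds predictable carry little information and can be dropped first.

```python
import math
from collections import Counter


def compress_prompt(prompt: str, reference_corpus: str, keep_ratio: float = 0.5) -> str:
    """Keep the most surprising tokens of `prompt`, preserving their order.

    Assumption: a unigram model over `reference_corpus` stands in for the
    small language model that a real prompt compressor would use.
    """
    corpus_tokens = reference_corpus.lower().split()
    counts = Counter(corpus_tokens)
    total = len(corpus_tokens)
    vocab = len(counts) + 1  # +1 reserves mass for unseen tokens

    def surprisal(token: str) -> float:
        # Add-one smoothed negative log-probability: rare tokens score high.
        p = (counts[token.lower()] + 1) / (total + vocab)
        return -math.log(p)

    tokens = prompt.split()
    n_keep = max(1, round(len(tokens) * keep_ratio))
    # Rank token positions by surprisal, highest first (ties keep original order).
    ranked = sorted(range(len(tokens)), key=lambda i: surprisal(tokens[i]), reverse=True)
    kept_positions = sorted(ranked[:n_keep])
    return " ".join(tokens[i] for i in kept_positions)


if __name__ == "__main__":
    corpus = "the the the the of of of a a a is is to to"
    print(compress_prompt("the capital of France is Paris", corpus, keep_ratio=0.5))
```

With the tiny corpus above, the frequent function words score lowest and are removed, leaving the content words. A real compressor conditions each token's score on its context rather than scoring tokens independently.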

LLM Pro/Serious Use Comparison/Test: From 7B to 70B vs. ChatGPT! Winner: Synthia-70B-v1.2b

LLM Chat/RP Comparison/Test: Dolphin-Mistral, Mistral-OpenOrca, Synthia 7B Winner: Mistral-7B-OpenOrca

LLM Chat/RP Comparison/Test: Mistral 7B Base + Instruct

LLM Chat/RP Comparison/Test (Euryale, FashionGPT, MXLewd, Synthia, Xwin) Winner: Xwin-LM-70B-V0.1

New Mo