r/LocalLLaMA - Reddit
General-purpose models
- 1.1B: TinyDolphin 2.8 1.1B. Takes about 700MB of RAM; tested on my Pi 4 with 2GB of RAM. Hallucinates a lot, but works for basic conversation.
- 2.7B: Dolphin 2.6 Phi-2. Takes a bit over 2GB of RAM; tested on my 3GB 32-bit phone via llama.cpp on Termux.
- 7B: Nous Hermes Mistral 7B DPO. Takes about 4-5GB of RAM depending on context
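The RAM figures above roughly track quantized weight size plus runtime overhead; a back-of-the-envelope sketch (the bits-per-weight and overhead constants are assumptions for Q4_K_M-style GGUF quants, not measurements):

```python
def estimate_ram_gb(n_params_b, bits_per_weight=4.5, overhead_gb=0.5):
    """Rough RAM estimate for running a quantized GGUF model.

    n_params_b: parameter count in billions (e.g. 7 for a 7B model)
    bits_per_weight: ~4.5 for Q4_K_M-style quants (assumption)
    overhead_gb: KV cache and runtime overhead (rough guess)
    """
    weights_gb = n_params_b * bits_per_weight / 8  # billions of params * bytes/param
    return weights_gb + overhead_gb

# A 7B model at ~4.5 bits/weight lands in the 4-5GB range quoted above.
print(estimate_ram_gb(7))
```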
r/LocalLLaMA - Reddit
Nicolay Gerold added
Supported Models
Where possible, we try to match the Hugging Face implementation. We are open to adjusting the API, so please reach out with feedback regarding these details.

Model                   Context Length  Model Type
codellama-34b-instruct  16384           Chat Completion
llama-2-70b-chat        4096            Chat Completion
mistral-7b-instruct     4096 [1]        Chat Completion
pplx-7b-c…
Supported Models
Nicolay Gerold added
Models by Perplexity, including, among others, their online models with access to the internet.
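These chat-completion models are served through an OpenAI-compatible HTTP endpoint; a minimal sketch that builds (but does not send) such a request — the endpoint path follows Perplexity's published API, the key is a placeholder:

```python
import json
import urllib.request

API_URL = "https://api.perplexity.ai/chat/completions"

def build_request(model, user_message, api_key):
    """Build an OpenAI-style chat-completion request (nothing is sent yet)."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

req = build_request("mistral-7b-instruct", "Hello!", api_key="YOUR_KEY")
# urllib.request.urlopen(req) would return the JSON completion
```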
1. Meta Quietly Unveils Llama 2 Long AI That Beats GPT-3.5 Turbo and Claude 2 on Some Tasks
Meta is releasing Llama 2 Long, an enhanced version of Llama 2 that underwent continual pretraining with longer training sequences and upsampled long texts. By adding 400 billion tokens and making minor changes to the Rotary Positional Embedding (RoPE), Llama... See more
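The RoPE change described here is mainly an increase of the base frequency used to build the rotary angles; a minimal sketch of how that base affects per-position rotation (the base values 10000 and 500000 are from the Llama 2 Long paper; the helper name is ours):

```python
def rope_angles(position, dim=8, base=10000.0):
    """Rotation angle of each frequency pair at a given token position.

    Angle for pair i is position / base**(2*i/dim); a larger base slows
    the rotation of all but the first pair, so very distant positions
    stay distinguishable instead of wrapping around early.
    """
    return [position / base ** (2 * i / dim) for i in range(dim // 2)]

# Llama 2 uses base 10000; Llama 2 Long raises it to 500000.
short = rope_angles(4096, base=10000.0)
long_ = rope_angles(4096, base=500000.0)
```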
This AI newsletter is all you need #68
Nicolay Gerold added
Ollama
Get up and running with large language models locally.
macOS
Download
Windows
Coming soon!
Linux & WSL2
curl https://ollama.ai/install.sh | sh
Manual install instructions
Docker
The official Ollama Docker image ollama/ollama is available on Docker Hub.
Quickstart
To run and chat with Llama 2:
ollama run llama2
Model library
Ollama supports a list…
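Besides the CLI, the local Ollama server exposes an HTTP API on port 11434; a minimal sketch that builds a request for its /api/generate route (nothing is sent until you actually open the request against a running server):

```python
import json
import urllib.request

def build_generate_request(model, prompt, host="http://localhost:11434"):
    """Build a request for Ollama's /api/generate endpoint (not sent yet)."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )

req = build_generate_request("llama2", "Why is the sky blue?")
# With the server running, urllib.request.urlopen(req) returns JSON
# whose "response" field holds the model's answer.
```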
jmorganca • GitHub - jmorganca/ollama: Get up and running with Llama 2 and other large language models locally
Nicolay Gerold added
https://github.com/huggingface/chat-ui - Amazing clean UI with very good web search, my go to currently. (they added the ability to do it all locally very recently!)
https://github.com/oobabooga/text-generation-webui - Best overall, supports any model format and has many extensions
https://github.com/ParisNeo/lollms-webui/ - Has PDF, stable diffusion…
r/LocalLLaMA - Reddit
Nicolay Gerold added
promptfoo is a tool for testing and evaluating LLM output quality…
With promptfoo, you can:
Systematically test prompts & models against predefined test cases
Evaluate quality and catch regressions by comparing LLM outputs side-by-side
Speed up evaluations with caching and concurrency
Score outputs automatically by defining test cases
Use as a…
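The test cases above are typically declared in a promptfooconfig.yaml; a small sketch (the provider id and assertion values are illustrative, not a recommendation):

```yaml
prompts:
  - "Translate to French: {{text}}"
providers:
  - ollama:llama2   # any provider promptfoo supports works here
tests:
  - vars:
      text: "Hello, world"
    assert:
      - type: contains
        value: "Bonjour"
```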
Testing framework for LLM Part
Nicolay Gerold added
Llama 2 - Resource Overview - Meta AI
ai.meta.com
Israel and others added
Dynamically route every prompt to the best LLM. Highest performance, lowest costs, incredibly easy to use…
There are over 250,000 LLMs today. Some are good at coding. Some are good at holding conversations. Some are up to 300x cheaper than others. You could hire an ML engineering team to test every single one — or you can switch to the best one fo…
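Conceptually, such a router is just a classifier mapping each prompt to a model; a toy sketch with a keyword heuristic (the model names and routing rules are purely illustrative placeholders, not how any real router works):

```python
# Toy prompt router: pick a model per prompt with a keyword heuristic.
# Model names and rules are illustrative placeholders.
CODE_HINTS = ("def ", "class ", "bug", "compile", "stack trace", "```")

def route(prompt: str) -> str:
    text = prompt.lower()
    if any(hint in text for hint in CODE_HINTS):
        return "codellama-34b-instruct"   # stronger on code
    if len(prompt) > 2000:
        return "llama-2-70b-chat"         # longer/harder prompts
    return "mistral-7b-instruct"          # cheap default

print(route("Why does this def foo() raise a TypeError?"))
```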
Testing framework for LLM Part
Nicolay Gerold added