r/LocalLLaMA - Reddit
General-purpose models
- 1.1B: TinyDolphin 2.8 1.1B. Takes about 700MB of RAM; tested on my Pi 4 with 2GB of RAM. Hallucinates a lot, but works for basic conversation.
- 2.7B: Dolphin 2.6 Phi-2. Takes a bit over 2GB of RAM; tested on my 3GB 32-bit phone via llama.cpp on Termux.
- 7B: Nous Hermes Mistral 7B DPO. Takes about 4-5GB of RAM depending on context
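The RAM figures above roughly track quantized weight size plus runtime overhead; a back-of-the-envelope sketch (the bits-per-weight and overhead constants are assumptions for Q4_K_M-style GGUF quants, not measurements):

```python
def estimate_ram_gb(n_params_b, bits_per_weight=4.5, overhead_gb=0.5):
    """Rough RAM estimate for running a quantized GGUF model.

    n_params_b: parameter count in billions (e.g. 7 for a 7B model)
    bits_per_weight: ~4.5 for Q4_K_M-style quants (assumption)
    overhead_gb: KV cache and runtime overhead (rough guess)
    """
    weights_gb = n_params_b * bits_per_weight / 8  # billions of params * bytes/param
    return weights_gb + overhead_gb

# A 7B model at ~4.5 bits/weight lands in the 4-5GB range quoted above.
print(estimate_ram_gb(7))
```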
r/LocalLLaMA - Reddit
Nicolay Gerold added
Supported Models
Where possible, we try to match the Hugging Face implementation. We are open to adjusting the API, so please reach out with feedback regarding these details.

Model                   Context Length  Model Type
codellama-34b-instruct  16384           Chat Completion
llama-2-70b-chat        4096            Chat Completion
mistral-7b-instruct     4096 [1]        Chat Completion
pplx-7b-c…
Supported Models
Nicolay Gerold added
Models by Perplexity, including, among others, their online models with access to the internet.
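These chat-completion models are served through an OpenAI-compatible HTTP endpoint; a minimal sketch that builds (but does not send) such a request — the endpoint path follows Perplexity's published API, the key is a placeholder:

```python
import json
import urllib.request

API_URL = "https://api.perplexity.ai/chat/completions"

def build_request(model, user_message, api_key):
    """Build an OpenAI-style chat-completion request (nothing is sent yet)."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

req = build_request("mistral-7b-instruct", "Hello!", api_key="YOUR_KEY")
# urllib.request.urlopen(req) would return the JSON completion
```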
1. Meta Quietly Unveils Llama 2 Long AI That Beats GPT-3.5 Turbo and Claude 2 on Some Tasks
Meta is releasing Llama 2 Long, an enhanced version of Llama 2 that underwent continual pretraining with longer training sequences and upsampled long texts. By adding 400 billion tokens and making minor changes to the Rotary Positional Embedding (RoPE), Llama... See more
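The RoPE change described here is mainly an increase of the base frequency used to build the rotary angles; a minimal sketch of how that base affects per-position rotation (the base values 10000 and 500000 are from the Llama 2 Long paper; the helper name is ours):

```python
def rope_angles(position, dim=8, base=10000.0):
    """Rotation angle of each frequency pair at a given token position.

    Angle for pair i is position / base**(2*i/dim); a larger base slows
    the rotation of all but the first pair, so very distant positions
    stay distinguishable instead of wrapping around early.
    """
    return [position / base ** (2 * i / dim) for i in range(dim // 2)]

# Llama 2 uses base 10000; Llama 2 Long raises it to 500000.
short = rope_angles(4096, base=10000.0)
long_ = rope_angles(4096, base=500000.0)
```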
This AI newsletter is all you need #68
Nicolay Gerold added
Ollama
Get up and running with large language models locally.
macOS
Download
Windows
Coming soon!
Linux & WSL2
curl https://ollama.ai/install.sh | sh
Manual install instructions
Docker
The official Ollama Docker image ollama/ollama is available on Docker Hub.
Quickstart
To run and chat with Llama 2:
ollama run llama2
Model library
Ollama supports a list…
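Besides the CLI, the local Ollama server exposes an HTTP API on port 11434; a minimal sketch that builds a request for its /api/generate route (nothing is sent until you actually open the request against a running server):

```python
import json
import urllib.request

def build_generate_request(model, prompt, host="http://localhost:11434"):
    """Build a request for Ollama's /api/generate endpoint (not sent yet)."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )

req = build_generate_request("llama2", "Why is the sky blue?")
# With the server running, urllib.request.urlopen(req) returns JSON
# whose "response" field holds the model's answer.
```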
jmorganca • GitHub - jmorganca/ollama: Get up and running with Llama 2 and other large language models locally
Nicolay Gerold added
https://github.com/huggingface/chat-ui - Amazing clean UI with very good web search, my go to currently. (they added the ability to do it all locally very recently!)
https://github.com/oobabooga/text-generation-webui - Best overall, supports any model format and has many extensions
https://github.com/ParisNeo/lollms-webui/ - Has PDF, stable diffusion…
r/LocalLLaMA - Reddit
Nicolay Gerold added
promptfoo is a tool for testing and evaluating LLM output quality…
With promptfoo, you can:
Systematically test prompts & models against predefined test cases
Evaluate quality and catch regressions by comparing LLM outputs side-by-side
Speed up evaluations with caching and concurrency
Score outputs automatically by defining test cases
Use as a…
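The test cases above are typically declared in a promptfooconfig.yaml; a small sketch (the provider id and assertion values are illustrative, not a recommendation):

```yaml
prompts:
  - "Translate to French: {{text}}"
providers:
  - ollama:llama2   # any provider promptfoo supports works here
tests:
  - vars:
      text: "Hello, world"
    assert:
      - type: contains
        value: "Bonjour"
```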
Testing framework for LLM Part
Nicolay Gerold added
Llama 2 - Resource Overview - Meta AI
ai.meta.com
Israel and others added
Dynamically route every prompt to the best LLM. Highest performance, lowest costs, incredibly easy to use…
There are over 250,000 LLMs today. Some are good at coding. Some are good at holding conversations. Some are up to 300x cheaper than others. You could hire an ML engineering team to test every single one — or you can switch to the best one fo…
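Conceptually, such a router is just a classifier mapping each prompt to a model; a toy sketch with a keyword heuristic (the model names and routing rules are purely illustrative placeholders, not how any real router works):

```python
# Toy prompt router: pick a model per prompt with a keyword heuristic.
# Model names and rules are illustrative placeholders.
CODE_HINTS = ("def ", "class ", "bug", "compile", "stack trace", "```")

def route(prompt: str) -> str:
    text = prompt.lower()
    if any(hint in text for hint in CODE_HINTS):
        return "codellama-34b-instruct"   # stronger on code
    if len(prompt) > 2000:
        return "llama-2-70b-chat"         # longer/harder prompts
    return "mistral-7b-instruct"          # cheap default

print(route("Why does this def foo() raise a TypeError?"))
```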
Testing framework for LLM Part
Nicolay Gerold added