GitHub - predibase/lorax: Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs

predibase (github.com)

GitHub - mit-han-lab/streaming-llm: Efficient Streaming Language Models with Attention Sinks

mit-han-lab (github.com)

Self-Host DeepSeek with Ollama and Open WebUI

Jeremy (noted.lol)

Discover, Download, and Run Local LLMs

lmstudio.ai

Technology Radar

An opinionated guide to technology trends and tools, including techniques, platforms, and languages, with recommendations for adoption, trials, assessments, and holds, for enhanced software development practices.

thoughtworks.com

Scaling: The State of Play in AI

Ethan Mollick (oneusefulthing.org)