GitHub - predibase/lorax: Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs

predibase github.com

Related Highlights

GitHub - mit-han-lab/streaming-llm: Efficient Streaming Language Models with Attention Sinks

mit-han-lab github.com

Self-Host DeepSeek with Ollama and Open WebUI

Jeremy noted.lol

Discover, Download, and Run Local LLMs

lmstudio.ai lmstudio.ai

Technology Radar

An opinionated guide to technology trends and tools, including techniques, platforms, and languages, with recommendations for adoption, trials, assessments, and holds, for enhanced software development practices.

thoughtworks.com

Scaling: The State of Play in AI

Ethan Mollick oneusefulthing.org