GitHub - predibase/lorax: Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
GitHub - mit-han-lab/streaming-llm: Efficient Streaming Language Models with Attention Sinks
mit-han-lab github.com
Technology Radar
An opinionated guide to technology trends and tools, covering techniques, platforms, and languages, with a recommendation to adopt, trial, assess, or hold each one.
thoughtworks.com