GitHub - varunshenoy/super-json-mode: Low latency JSON generation using LLMs ⚡️

John (x.com)
LlamaIndex 🦙 (x.com)

GitHub - mit-han-lab/streaming-llm: Efficient Streaming Language Models with Attention Sinks

mit-han-lab (github.com)