DeepSpeed-FastGen

GitHub - mit-han-lab/streaming-llm: Efficient Streaming Language Models with Attention Sinks

mit-han-labgithub.com
Thumbnail of GitHub - mit-han-lab/streaming-llm: Efficient Streaming Language Models with Attention Sinks
Jim Fanx.com

GitHub - infiniflow/ragflow: RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.

infiniflowgithub.com
Thumbnail of GitHub - infiniflow/ragflow: RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.