DeepSpeed-FastGen

google GitHub - google/maxtext: A simple, performant and scalable Jax LLM!

Ben Auffarth Generative AI with LangChain: Build large language model (LLM) apps with Python, ChatGPT, and other LLMs

young-geng GitHub - young-geng/EasyLM: Large language models (LLMs) made easy, EasyLM is a one stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Flax.

microsoft GitHub - microsoft/LLMLingua: To speed up LLMs' inference and enhance LLMs' perception of key information, compress the prompt and KV-Cache, which achieves up to 20x compression with minimal performance loss.

mit-han-lab GitHub - mit-han-lab/streaming-llm: Efficient Streaming Language Models with Attention Sinks

ghimiresunil GitHub - ghimiresunil/LLM-PowerHouse-A-Curated-Guide-for-Large-Language-Models-with-Custom-Training-and-Inferencing: LLM-PowerHouse: Unleash LLMs' potential through curated tutorials, best practices, and ready-to-use code for custom training and inferencing.

Developing Rapidly with Generative AI

nomic-ai GitHub - nomic-ai/gpt4all: gpt4all: open-source LLM chatbots that you can run anywhere