GitHub - okuvshynov/slowllama: Finetune llama2-70b and codellama on MacBook Air without quantization
Ollama
ollama.com
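As a quick illustration of what Ollama is for, a minimal sketch that queries a locally running Ollama server over its documented REST API. It assumes `ollama serve` is running on the default port and that a model such as `llama2` has already been pulled; the model name and prompt are placeholders:

```python
import requests

# Minimal sketch: request a completion from a local Ollama server.
# Assumes the default port (11434) and that `ollama pull llama2` was run.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama2", "prompt": "Why is the sky blue?", "stream": False},
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"])
```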
Llama 2 - Resource Overview - Meta AI
ai.meta.com

ExLlamaV2
ExLlamaV2 is an inference library for running local LLMs on modern consumer GPUs.
Overview of differences compared to V1 (a usage sketch appears below):
- Faster, better kernels
- Cleaner and more versatile codebase
- Support for a new quant format (see below)
turboderp • GitHub - turboderp/exllamav2: A fast inference library for running LLMs locally on modern consumer-class GPUs
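For context, a minimal generation sketch in the spirit of the repo's documented Python examples. The model directory is a placeholder, and the class and method names are taken from exllamav2's examples, so they may shift between versions:

```python
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

config = ExLlamaV2Config()
config.model_dir = "/path/to/exl2-quantized-model"  # placeholder path
config.prepare()

model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)  # allocate cache as layers load
model.load_autosplit(cache)               # split weights across available GPUs
tokenizer = ExLlamaV2Tokenizer(config)

generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)
settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.8
settings.top_p = 0.9

# Generate 128 new tokens from a toy prompt.
print(generator.generate_simple("Once upon a time,", settings, 128))
```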
GitHub - mit-han-lab/streaming-llm: Efficient Streaming Language Models with Attention Sinks
mit-han-lab • github.com
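As a rough sketch of the attention-sink idea behind streaming-llm: keep the first few "sink" tokens plus a sliding window of recent tokens in the KV cache, evicting everything in between. This is a toy eviction policy written for this note, not the repo's actual API:

```python
from collections import deque

class SinkKVCache:
    """Toy cache-eviction policy in the spirit of attention sinks:
    pin the first `n_sink` token positions, keep a sliding window of
    the most recent `window` positions, evict everything in between."""

    def __init__(self, n_sink: int = 4, window: int = 1024):
        self.n_sink = n_sink
        self.window = window
        self.sink: list[int] = []          # positions pinned forever
        self.recent: deque[int] = deque()  # sliding window of recent positions

    def append(self, pos: int) -> None:
        if len(self.sink) < self.n_sink:
            self.sink.append(pos)
            return
        self.recent.append(pos)
        if len(self.recent) > self.window:
            self.recent.popleft()  # evict the oldest non-sink position

    def visible(self) -> list[int]:
        # Positions the next token may attend to.
        return self.sink + list(self.recent)

cache = SinkKVCache(n_sink=4, window=8)
for pos in range(20):
    cache.append(pos)
print(cache.visible())  # [0, 1, 2, 3, 12, 13, 14, 15, 16, 17, 18, 19]
```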
GitHub - transformerlab/transformerlab-app: Open Source Application for Advanced LLM Engineering: interact, train, fine-tune, and evaluate large language models on your own computer.
github.com