updated 10mo ago
GitHub - unslothai/unsloth: 5X faster 50% less memory LLM finetuning
- slowllama
Fine-tune Llama2 and CodeLlama models, including 70B/35B, on Apple M1/M2 devices (for example, MacBook Air or Mac Mini) or consumer NVIDIA GPUs.
slowllama does not use any quantization. Instead, it offloads parts of the model to SSD or main memory on both the forward and backward passes. In contrast with training large models from scratch (unattainable...
from GitHub - okuvshynov/slowllama: Finetune llama2-70b and codellama on MacBook Air without quantization, by okuvshynov
Nicolay Gerold added
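The offloading idea in the slowllama description can be illustrated with a toy sketch. This is not slowllama's code: the `save_blocks` and `forward_offloaded` helpers, the `(scale, shift)` "blocks", and the pickle file layout are all hypothetical stand-ins, and only the forward pass is shown. The point is simply that each block lives on disk and only one is held in memory at a time.

```python
import os
import pickle
import tempfile

def save_blocks(blocks, directory):
    """Write each block's weights to its own file and return the paths."""
    paths = []
    for i, weights in enumerate(blocks):
        path = os.path.join(directory, f"block_{i}.pkl")
        with open(path, "wb") as f:
            pickle.dump(weights, f)
        paths.append(path)
    return paths

def forward_offloaded(paths, x):
    """Run the forward pass while holding only one block in memory at a time."""
    for path in paths:
        with open(path, "rb") as f:
            scale, shift = pickle.load(f)   # load this block from disk
        x = [scale * v + shift for v in x]  # toy "layer": elementwise affine
        del scale, shift                    # drop the block before loading the next
    return x

with tempfile.TemporaryDirectory() as tmp:
    # Three toy blocks, each a (scale, shift) pair standing in for real weights.
    paths = save_blocks([(2.0, 1.0), (0.5, 0.0), (1.0, -1.0)], tmp)
    result = forward_offloaded(paths, [1.0, 2.0])
    print(result)  # [0.5, 1.5]
```

Peak memory in this sketch is bounded by the largest single block rather than the whole model, at the cost of reading every block from disk on each pass.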
- Mistral-finetune
mistral-finetune is a lightweight codebase that enables memory-efficient and performant finetuning of Mistral's models. It is based on LoRA, a training paradigm where most weights are frozen and only 1-2% additional weights, in the form of low-rank matrix perturbations, are trained.
For maximum efficiency it is recommended to use a A...
from GitHub - mistralai/mistral-finetune
Nicolay Gerold added
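The LoRA idea described for mistral-finetune can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not mistral-finetune's implementation: the `LoRALinear` class, the `lora_fraction` helper, the layer sizes, and the rank are hypothetical, and the `alpha / rank` scaling follows the common LoRA convention. The frozen weight `W` is left untouched while only the small factors `A` and `B` would be trained.

```python
import random

def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

class LoRALinear:
    """Frozen weight W plus a trainable low-rank update (alpha / rank) * B @ A."""

    def __init__(self, d_in, d_out, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        # Frozen pretrained weight (random here, standing in for real weights).
        self.W = [[rng.gauss(0, 0.02) for _ in range(d_in)] for _ in range(d_out)]
        # Trainable factors: A is small random, B starts at zero so the
        # adapted layer initially matches the frozen layer exactly.
        self.A = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]
        self.scale = alpha / rank

    def forward(self, x):
        base = matvec(self.W, x)
        update = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * u for b, u in zip(base, update)]

def lora_fraction(d_in, d_out, rank):
    """Trainable LoRA parameters relative to the frozen matrix size."""
    return rank * (d_in + d_out) / (d_in * d_out)

layer = LoRALinear(d_in=16, d_out=16, rank=2)
x = [1.0] * 16
print(layer.forward(x) == matvec(layer.W, x))  # True: B is zero at init
print(f"{lora_fraction(4096, 4096, 32):.1%}")  # 1.6%
```

For a hypothetical 4096 x 4096 projection with rank 32, the two factors add about 1.6% as many parameters as the frozen matrix, which falls inside the 1-2% range quoted above.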
- ⚡ LitGPT
Pretrain, finetune, evaluate, and deploy 20+ LLMs on your own data
Uses the latest state-of-the-art techniques:
✅ flash attention ✅ fp4/8/16/32 ✅ LoRA, QLoRA, Adapter (v1, v2) ✅ FSDP ✅ 1-1000+ GPUs/TPUs
Finetune, pretrain and d...
from GitHub - Lightning-AI/litgpt: Pretrain, finetune, deploy 20+ LLMs on your own data. Uses state-of-the-art techniques: flash attention, FSDP, 4-bit, LoRA, and more, by Lightning-AI
Nicolay Gerold added