GitHub - okuvshynov/slowllama: Finetune llama2-70b and codellama on MacBook Air without quantization
Mistral-finetune
mistral-finetune is a lightweight codebase that enables memory-efficient and performant fine-tuning of Mistral's models. It is based on LoRA, a training paradigm in which most weights are frozen and only an additional 1-2% of weights, in the form of low-rank matrix perturbations, are trained.
For maximum efficiency it is recommended to use a...
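To make the LoRA idea above concrete, here is a minimal sketch in plain Python. It is not taken from mistral-finetune; the function names, the toy matrices, and the 4096-wide layer with rank 16 are all illustrative assumptions. It shows a frozen weight W combined with a trainable low-rank update B·A, and how few extra parameters that update adds for one matrix.

```python
# Minimal LoRA sketch in pure Python (no framework).
# All names and sizes are illustrative assumptions, not mistral-finetune's API.

def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def lora_forward(w, a, b, x, alpha, rank):
    """Compute W x + (alpha / rank) * B (A x); W is frozen, only A and B are trained."""
    base = matvec(w, x)
    update = matvec(b, matvec(a, x))
    scale = alpha / rank
    return [y + scale * u for y, u in zip(base, update)]

def lora_param_fraction(d_in, d_out, rank):
    """Ratio of added LoRA parameters to the frozen weight's parameter count."""
    return rank * (d_in + d_out) / (d_in * d_out)

# Frozen 2x3 weight with a rank-1 adapter: A is rank x d_in, B is d_out x rank.
W = [[1.0, 0.0, 2.0], [0.0, 1.0, 0.0]]
A = [[1.0, 1.0, 1.0]]
B = [[0.0], [0.0]]  # B starts at zero, so the adapter is initially a no-op
x = [1.0, 2.0, 3.0]

print(lora_forward(W, A, B, x, alpha=2.0, rank=1))  # [7.0, 2.0], same as W x
print(lora_param_fraction(4096, 4096, rank=16))     # 0.0078125
```

The exact share of trainable weights depends on the chosen rank and on which layers receive adapters, so the fraction printed here is only indicative of the order of magnitude.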
GitHub - mistralai/mistral-finetune
ExLlamaV2
ExLlamaV2 is an inference library for running local LLMs on modern consumer GPUs.
Overview of differences compared to V1
- Faster, better kernels
- Cleaner and more versatile codebase
- Support for a new quant format (see below)
turboderp • GitHub - turboderp/exllamav2: A fast inference library for running LLMs locally on modern consumer-class GPUs
Fine-Tuning for LLM Research by AI Hero
This repo contains the code that will be run inside the container. Alternatively, the code can be run natively. The container is built and pushed to the repo using GitHub Actions (see below). You can launch the fine-tuning job using the examples in the https://github.com/ai-hero/llm-research-examples...