GitHub - turboderp/exllamav2: A fast inference library for running LLMs locally on modern consumer-class GPUs

by turboderp

Updated 10 months ago