GitHub - michaelfeil/infinity: Infinity is a high-throughput...

GitHub - michaelfeil/infinity: Infinity is a high-throughput, low-latency REST API for serving vector embeddings, supporting a wide range of sentence-transformer models and frameworks.

michaelfeil github.com

RelatedHighlights

⚡ LitGPT

Pretrain, finetune, evaluate, and deploy 20+ LLMs on your own data

Uses the latest state-of-the-art techniques:

✅ flash attention ✅ fp4/8/16/32 ✅ LoRA, QLoRA, Adapter (v1, v2) ✅ FSDP ✅ 1-1000+ GPUs/TPUs

Lightning AI • Models • Quick start • Inference • Finetune • Pretrain • Deploy • Features • Training recipes (YAML)

Finetune, pretrain and d... See more

Lightning-AI • GitHub - Lightning-AI/litgpt: Pretrain, finetune, deploy 20+ LLMs on your own data. Uses state-of-the-art techniques: flash attention, FSDP, 4-bit, LoRA, and more.

Embeddings: What they are and why they matter

simonwillison.net

Vector databases are widely used in NLP tasks such as sentiment analysis, text classification, and semantic search. By representing text as vector embeddings, it becomes easier to compare and analyze textual data.

Lightning-AI • GitHub - Lightning-AI/litgpt: Pretrain, finetune, deploy 20+ LLMs on your own data. Uses state-of-the-art techniques: flash attention, FSDP, 4-bit, LoRA, and more.

Embeddings: What they are and why they matter

Ben Auffarth • Generative AI with LangChain: Build large language model (LLM) apps with Python, ChatGPT, and other LLMs