LLMs
performance: finetuning improves your LLM's performance on a given use case (e.g., coding, extracting text, etc.). Mainly, the LLM will specialize in a given task (a specialist will always beat a generalist in its domain)
control: you can refine how your model should behave on specific inputs and outputs, resulting in a more robust product
modularization:...
Motivation for finetuning
Memory Considerations
Since co-occurrence matrices are square, their size grows quadratically with the number of entities being embedded. For 50k entities and a 32-bit data format, a dense matrix will already be at 10GB. 100k entities puts it at 40GB.
If you are trying to embed even more entities than that or have limited RAM available, you may need to use a...
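The 10GB and 40GB figures follow from simple arithmetic: a dense matrix over n entities has n × n cells, and a 32-bit value takes 4 bytes. A minimal sketch of that calculation, assuming GB means 10^9 bytes (the function name is made up for illustration):

```python
def dense_matrix_gb(n_entities: int, bytes_per_value: int = 4) -> float:
    """Size in GB (10^9 bytes) of a dense n x n matrix."""
    return n_entities * n_entities * bytes_per_value / 1e9


# Doubling the entity count quadruples the memory footprint.
for n in (50_000, 100_000, 200_000):
    print(f"{n:>7,} entities -> {dense_matrix_gb(n):.0f} GB")
```

Doubling the number of entities multiplies the footprint by four, which is why the dense approach stops being practical quickly.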
What I've Learned Building Interactive Embedding Visualizations
Top considerations when choosing foundation models
Accuracy
Cost
Latency
Privacy
Top challenges when deploying production AI
Serving cost
Evaluation
Infra reliability
Model quality
How do models represent style, and how can we more precisely extract and steer it?
A commonly requested feature in almost any LLM-based writing application is "I want the AI to respond in my style of writing," or "I want the AI to adhere to this style guide." Aside from costly and complicated multi-stage finetuning processes like Anthropic's RL with...
The OpenAI Assistants API offers more than a simple prompt-sharing interface; it provides a sophisticated framework for AI interactions. It allows for persistent conversation sessions with automatic context management (Threads), structured interactions (Messages and Runs), integration with various tools for enhanced capabilities, customization...
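To make the Thread / Message / Run vocabulary concrete, here is a toy in-memory model of the idea. It is not the OpenAI SDK and makes no network calls: every class and function name below is hypothetical, and the "keep the last few messages" rule is an assumed stand-in for whatever context management the real service performs.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    role: str      # "user" or "assistant"
    content: str


@dataclass
class Thread:
    """Toy stand-in for a persistent conversation session."""
    messages: list[Message] = field(default_factory=list)

    def add(self, role: str, content: str) -> None:
        self.messages.append(Message(role, content))

    def context(self, max_messages: int = 4) -> list[Message]:
        # Assumed policy: only the most recent messages are passed on.
        return self.messages[-max_messages:]


def run(thread: Thread, model) -> Message:
    """One 'run': feed the managed context to the model and store its reply."""
    reply = model(thread.context())
    thread.add("assistant", reply)
    return thread.messages[-1]


def echo_model(context: list[Message]) -> str:
    # Placeholder model: reports how many messages it was shown.
    return f"saw {len(context)} message(s)"


thread = Thread()
thread.add("user", "Hello")
print(run(thread, echo_model).content)
```

The point of the sketch is the separation of concerns: the thread owns the history, and a run is a single step that reads the managed context and appends the result.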
⚡ LitGPT
Pretrain, finetune, evaluate, and deploy 20+ LLMs on your own data
Uses the latest state-of-the-art techniques:
✅ flash attention ✅ fp4/8/16/32 ✅ LoRA, QLoRA, Adapter (v1, v2) ✅ FSDP ✅ 1-1000+ GPUs/TPUs
Lightning AI • Models • Quick start • Inference • Finetune • Pretrain • Deploy • Features • Training recipes (YAML)
Finetune, pretrain and...
Lightning-AI • GitHub - Lightning-AI/litgpt: Pretrain, finetune, deploy 20+ LLMs on your own data. Uses state-of-the-art techniques: flash attention, FSDP, 4-bit, LoRA, and more.
How can we make interacting with conversational models feel more natural?
Every conversational interface to a language model adopts the same pattern:
A chat history sidebar, with each conversation lasting just a few turns
New sessions always begin in a brand-new thread
Every user query must always elicit exactly one response
None of these assumptions...
Overview
Loki is our open-source solution designed to automate the process of verifying factuality. It provides a comprehensive pipeline for dissecting long texts into individual claims, assessing their worthiness for verification, generating queries for evidence search, crawling for evidence, and ultimately verifying the claims. This tool is...
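The five stages described above (claim decomposition, check-worthiness filtering, query generation, evidence retrieval, verification) can be sketched as a chain of pluggable functions. This is only an illustration of the pipeline shape, not Loki's actual API: the function name, its parameters, and the toy stand-ins below are all hypothetical.

```python
from typing import Callable


def verify_text(
    text: str,
    decompose: Callable[[str], list[str]],
    is_checkworthy: Callable[[str], bool],
    make_queries: Callable[[str], list[str]],
    retrieve: Callable[[str], list[str]],
    verify: Callable[[str, list[str]], bool],
) -> dict[str, bool]:
    """Chain the five stages; each stage is injected by the caller."""
    verdicts = {}
    for claim in decompose(text):
        if not is_checkworthy(claim):
            continue  # skip opinions and other unverifiable statements
        evidence = [doc for q in make_queries(claim) for doc in retrieve(q)]
        verdicts[claim] = verify(claim, evidence)
    return verdicts


# Toy stand-ins so the sketch runs end to end without any model or web access.
KNOWN_FACTS = ["Paris is the capital of France"]

result = verify_text(
    "Paris is the capital of France. I like tea.",
    decompose=lambda t: [s.strip() for s in t.split(".") if s.strip()],
    is_checkworthy=lambda c: not c.startswith("I "),
    make_queries=lambda c: [c],
    retrieve=lambda q: [f for f in KNOWN_FACTS if f == q],
    verify=lambda c, ev: c in ev,
)
print(result)
```

In a real system each lambda would be replaced by a model call or a search backend; keeping the stages as separate callables makes it possible to swap or test them independently.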