LLMs
Setting up the necessary machine learning infrastructure to run these big models is another challenge. We need a dedicated model server for running model inference (using frameworks like Triton or vLLM), powerful GPUs to run everything robustly, and configurability in our servers to make sure they're high throughput and low latency. Tuning the...
Developing Rapidly with Generative AI
LLMTuner
LLMTuner: Fine-Tune Llama, Whisper, and other LLMs with best practices like LoRA, QLoRA, through a sleek, scikit-learn-inspired interface.
promptslab • GitHub - promptslab/LLMtuner: Tune LLM in few lines of code
Humans are bad at coming up with search queries. Humans are good at incrementally narrowing down options with a series of filters, and pointing where they want to go next. This seems obvious, but we keep building interfaces for finding information that look more like Google Search and less like a map.
All information tools have to give users some...
thesephist.com • Navigate, don't search
Overview
MaxText is a high-performance, highly scalable, open-source LLM written in pure Python/Jax and targeting Google Cloud TPUs and GPUs for training and inference. MaxText achieves high MFUs and scales from single host to very large clusters while staying simple and "optimization-free" thanks to the power of Jax and the XLA compiler.
MaxText...
google • GitHub - google/maxtext: A simple, performant and scalable Jax LLM!
API wrappers, general-purpose AI tools and third-party AI tools for big platforms.
API wrappers have a weak moat.
General AI tools try to be the jack-of-all-trades.
Big platforms will eat up small apps by adding similar AI features themselves.
Shortwave — rajhesh.panchanadhan@gmail.com [Gmail alternative]
How do models represent style, and how can we more precisely extract and steer it?
A commonly requested feature in almost any LLM-based writing application is “I want the AI to respond in my style of writing,” or “I want the AI to adhere to this style guide.” Aside from costly and complicated multi-stage finetuning processes like Anthropic’s RL with...
Shortwave — rajhesh.panchanadhan@gmail.com [Gmail alternative]
Unlike consumers, enterprises want control over how their data is used and shared with companies, including the providers of AI software. Enterprises have spent a lot of effort in consolidating data from different sources and bringing them in-house (this article Partner integrations + System of Intelligence: Today’s deepest Moat by fellow Medium...
AI Startup Trends: Insights from Y Combinator’s Latest Batch
methods of fine-tuning an open-source LLM exist ↓
- Continued pre-training: utilize domain-specific data to apply the same pre-training process (next token prediction) on the pre-trained (base) model
- Instruction fine-tuning: the pre-trained (base) model is fine-tuned on a Q&A dataset to learn to answer questions
- Single-task fine-tuning: the...
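The difference between the first two methods in the list above shows up mostly in how training examples are built, which can be sketched in a few lines of Python. This is a toy illustration, not any library's actual pipeline: `toy_tokenize`, the example sentences, and the helper names are hypothetical. Masking the question tokens so that only the answer contributes to the loss is a common practice assumed here; the excerpt does not state it. The value `-100` is the default `ignore_index` of PyTorch's `CrossEntropyLoss`.

```python
# Toy illustration of how training targets differ between continued
# pre-training and instruction fine-tuning. No real model or tokenizer is used.

IGNORE_INDEX = -100  # default ignore_index of PyTorch's CrossEntropyLoss


def toy_tokenize(text, vocab):
    """Hypothetical whitespace tokenizer: maps each word to an integer id."""
    return [vocab.setdefault(word, len(vocab)) for word in text.split()]


def continued_pretraining_example(text, vocab):
    """Raw domain text: every token is a next-token prediction target.

    The one-position shift between inputs and targets is left to the
    training loop, as many causal-LM trainers do.
    """
    ids = toy_tokenize(text, vocab)
    return {"input_ids": ids, "labels": list(ids)}


def instruction_example(question, answer, vocab):
    """Q&A pair: same objective, but (by assumption) only answer tokens are scored."""
    q_ids = toy_tokenize(question, vocab)
    a_ids = toy_tokenize(answer, vocab)
    return {
        "input_ids": q_ids + a_ids,
        # Question positions are masked out so they do not contribute to the loss.
        "labels": [IGNORE_INDEX] * len(q_ids) + a_ids,
    }


vocab = {}
pretrain = continued_pretraining_example("the patient was given 5 mg", vocab)
instruct = instruction_example("What dose was given ?", "5 mg", vocab)

print(pretrain)
print(instruct)
```

In both cases the objective is still next-token prediction; what changes is the data (raw domain text versus question-answer pairs) and which positions are scored.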