LLMs
The quality of the dataset is 95% of everything. The remaining 5% is simply not ruining it with bad parameters.
After 500+ LoRAs made, here is the secret
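As a concrete illustration of the "5%", here is a minimal LoRA setup sketch using Hugging Face PEFT; the model name and adapter hyperparameters are placeholders, not settings recommended by the quoted post.

```python
# Minimal LoRA sketch with Hugging Face PEFT. The model id and hyperparameter
# values are placeholders: the point is that these knobs only need to be sane,
# while the training dataset does most of the work.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")  # placeholder base model

lora_config = LoraConfig(
    r=16,                      # adapter rank: a common default, rarely the bottleneck
    lora_alpha=32,             # scaling factor, often set to ~2x the rank
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # which projection layers get adapters
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only a tiny fraction of the base weights train
```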
- Right now, GPTs are the easiest way of sharing structured prompts, which are programs written in plain English (or another language) that can get the AI to do useful things. I discussed creating structured prompts last week, and all the same techniques apply, but the GPT system makes structured prompts more powerful and much easier to create.
Ethan Mollick • Almost an Agent: What GPTs can do
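To make "a program written in plain English" concrete, here is a sketch of a structured prompt sent as a system message via the OpenAI Python client; the task, steps, and model name are illustrative assumptions, not taken from Mollick's post.

```python
# A sketch of a "structured prompt": a plain-English program handed to the model
# as a system message. The role, steps, and constraints below are illustrative.
from openai import OpenAI

structured_prompt = """
You are a meeting summarizer.
1. Ask the user to paste the meeting transcript.
2. Extract decisions, owners, and deadlines into a table.
3. End with a list of open questions.
Constraints: never invent attendees; if a deadline is missing, write "TBD".
"""

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
reply = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[
        {"role": "system", "content": structured_prompt},
        {"role": "user", "content": "Here is the transcript: ..."},
    ],
)
print(reply.choices[0].message.content)
```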
Announcing Together Inference Engine – the fastest inference available
November 13, 2023 • By Together
The Together Inference Engine is multiple times faster than any other inference service, with 117 tokens per second on Llama-2-70B-Chat and 171 tokens per second on Llama-2-13B-Chat.
Today we are announcing Together Inference Engine, the world's…
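For context on what using such an engine looks like, a sketch of a chat request against Together's OpenAI-compatible endpoint; the base URL and model id are assumptions drawn from Together's public docs, not from the announcement itself.

```python
# A sketch of calling the Together Inference Engine through an
# OpenAI-compatible API. Base URL and model id are assumptions.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["TOGETHER_API_KEY"],
    base_url="https://api.together.xyz/v1",  # assumed OpenAI-compatible endpoint
)

resp = client.chat.completions.create(
    model="meta-llama/Llama-2-70b-chat-hf",  # assumed id for Llama-2-70B-Chat
    messages=[{"role": "user", "content": "Explain speculative decoding in two sentences."}],
    max_tokens=128,
)
print(resp.choices[0].message.content)
```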
Matei Zaharia, Omar Khattab, Lingjiao Chen, et al. • The Shift From Models to Compound AI Systems
DeepSpeed-FastGen: High-throughput Text Generation for LLMs via MII and DeepSpeed-Inference
Table of Contents
- Introduction
- Key LLM Serving Techniques
- Dynamic SplitFuse: A Novel Prompt and Generation Composition Strategy
- Performance Evaluation
- DeepSpeed-FastGen: Implementation and Usage
- Try out DeepSpeed-FastGen
- Acknowledgements
1. Introduction
Large…
microsoft • DeepSpeed-FastGen
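DeepSpeed-FastGen is exposed through DeepSpeed-MII; below is a minimal sketch of the non-persistent pipeline, assuming `deepspeed-mii` is installed on a GPU machine and that the model id shown is among the supported ones.

```python
# A sketch of the non-persistent DeepSpeed-MII pipeline that fronts
# DeepSpeed-FastGen (pip install deepspeed-mii). The model id and generation
# settings are placeholders; check the DeepSpeed-FastGen docs for the
# supported model list and persistent-deployment options.
import mii

pipe = mii.pipeline("meta-llama/Llama-2-13b-chat-hf")  # assumed supported model id
responses = pipe(
    ["DeepSpeed-FastGen serves prompts by"],
    max_new_tokens=128,
)
print(responses)
```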
Principles for growable tools
There are three critical pieces to building a tool that can grow around its users over time.
- Design around play. Sometimes I call this design around experimentation. Using the tool for day-to-day work should involve playing and experimenting with what's possible with the tool. Whether that's writing small programs to…
Beyond customization: build tools that grow with us | thesephist.com
API wrappers, general-purpose AI tools and third-party AI tools for big platforms.
API wrappers have a weak moat.
General AI tools try to be the jack-of-all-trades.
Big platforms will eat up small apps by adding similar AI features themselves.
Shortwave — rajhesh.panchanadhan@gmail.com [Gmail alternative]
- Self-play is the idea that an agent can improve its gameplay by playing against slightly different versions of itself because it’ll progressively encounter more challenging situations. In the space of LLMs, it is almost certain that the largest portion of self-play will look like AI Feedback rather than competitive processes.
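A toy sketch of that AI-feedback loop: the model generates candidates, a judge model (possibly the same weights) ranks them, and the preferred/rejected pairs become preference data for a later fine-tuning round. All helper names here are hypothetical placeholders, not a real library API.

```python
# Toy sketch of self-play via AI feedback: generate candidates, score them with
# a judge, and keep (chosen, rejected) pairs for preference optimization (e.g. DPO).
# `generate` and `judge` are hypothetical callables, not a real API.
from typing import Callable

def self_play_round(
    prompts: list[str],
    generate: Callable[[str], list[str]],  # hypothetical: returns N candidate answers
    judge: Callable[[str, str], float],    # hypothetical: scores an answer for a prompt
) -> list[dict]:
    """Collect preference pairs from one round of AI feedback."""
    preference_pairs = []
    for prompt in prompts:
        candidates = generate(prompt)
        ranked = sorted(candidates, key=lambda answer: judge(prompt, answer), reverse=True)
        preference_pairs.append(
            {"prompt": prompt, "chosen": ranked[0], "rejected": ranked[-1]}
        )
    return preference_pairs

# Each round the generator is updated on these pairs, so later rounds face a
# slightly stronger version of itself rather than a fixed opponent.
```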