Introducing PlayHT 2.0 Turbo ⚡️ - The Fastest Generative AI Text-to-Speech API
Announcing Together Inference Engine – the fastest inference available
November 13, 2023・By Together
The Together Inference Engine is multiple times faster than any other inference service, with 117 tokens per second on Llama-2-70B-Chat and 171 tokens per second on Llama-2-13B-Chat
Today we are announcing Together Inference Engine, the world’s fast...
Announcing Together Inference Engine – the fastest inference available
Nicolay Gerold added
DeepSpeed-FastGen: High-throughput Text Generation for LLMs via MII and DeepSpeed-Inference
Table of Contents
- Introduction
- Key LLM Serving Techniques
- Dynamic SplitFuse: A Novel Prompt and Generation Composition Strategy
- Performance Evaluation
- DeepSpeed-FastGen: Implementation and Usage
- Try out DeepSpeed-FastGen
- Acknowledgements
1. Introduction
Large langu...
microsoft • DeepSpeed-FastGen
Nicolay Gerold added
GPT-4 Turbo performs better than our previous models on tasks that require the careful following of instructions, such as generating specific formats (e.g., “always respond in XML”). It also supports our new JSON mode, which ensures the model will respond with valid JSON. The new API parameter response_format enables the model to constrain its outp...
New models and developer products announced at DevDay
Nicolay Gerold added
Developers can now generate human-quality speech from text via the text-to-speech API. Our new TTS model offers six preset voices to choose from and two model variants, tts-1 and tts-1-hd. tts-1 is optimized for real-time use cases and tts-1-hd is optimized for quality. Pricing starts at $0.015 per 1,000 input characters. Check out our TTS guide to ...
New models and developer products announced at DevDay
Nicolay Gerold added
Even though the underlying model is no different than the usual GPT-4o, the addition of voice has a lot of implications. A voice-powered tutor works very differently than one that communicates via typing, for example. It can also speak many other languages providing new approaches to cross-cultural communication. And I have no doubt people will hav...
Ethan Mollick • On speaking to AI
MargaretC added
Voice will take it to a new level and might make use much more widespread
4. Introducing Stable LM 3B: Bringing Sustainable, High-Performance Language Models to Smart Devices
Stability AI introduced Stable LM 3B, a high-performing language model designed for smart devices. With 3 billion parameters, it outperforms state-of-the-art 3B models and reduces operating costs and power consumption. The model enables a broader ran...
This AI newsletter is all you need #68
Nicolay Gerold added
One major challenge is the inference time of the models. While models like ChatGPT have improved in speed, they still take time to process information. When dealing with a large number of agents, there can be significant latency in real-time interactions. Optimizations and fine-tuning will be necessary to make the models faster and more efficient. ...
Sandhya Hegde • Autonomous AI agents could change the world, but what do they actually do well?
Darren LI added
Actually, this is disputed. API call costs are crucial at this stage, because even a simple one-sentence action burns roughly 40k to 75k tokens.
Imagine being able to have a natural-language conversation about anything with a computer. This is now possible and available to many people for the first time with ChatGPT. In this episode we take a look at the consequences and some interesting insights from OpenAI’s CEO Sam Altman.
ColdFusion • It’s Time to Pay Attention to A.I. (ChatGPT and Beyond)
Jason Shen added