Introducing PlayHT 2.0 Turbo ⚡️ - The Fastest Generative AI ...

Introducing PlayHT 2.0 Turbo ⚡️ - The Fastest Generative AI Text-to-Speech API

news.play.ht

RelatedHighlights

Announcing Together Inference Engine – the fastest inference available

November 13, 2023・By Together

The Together Inference Engine is multiple times faster than any other inference service, with 117 tokens per second on Llama-2-70B-Chat and 171 tokens per second on Llama-2-13B-Chat

‍

Today we are announcing Together Inference Engine, the world’s fast... See more

Announcing Together Inference Engine – the fastest inference available

Nicolay Gerold added

DeepSpeed-FastGen: High-throughput Text Generation for LLMs via MII and DeepSpeed-Inference

Table of Contents

Introduction

Key LLM Serving Techniques

Dynamic SplitFuse: A Novel Prompt and Generation Composition Strategy

Performance Evaluation

DeepSpeed-FastGen: Implementation and Usage

Try out DeepSpeed-FastGen

Acknowledgements

1. Introduction

Large langu... See more

microsoft • DeepSpeed-FastGen

Nicolay Gerold added

GPT-4 Turbo performs better than our previous models on tasks that require the careful following of instructions, such as generating specific formats (e.g., “always respond in XML”). It also supports our new JSON mode, which ensures the model will respond with valid JSON. The new API parameter response_format enables the model to constrain its outp... See more

New models and developer products announced at DevDay

Nicolay Gerold added

Developers can now generate human-quality speech from text via the text-to-speech API. Our new TTS model offers six preset voices to choose from and two model variants, tts-1 and tts-1-hd . tts is optimized for real-time use cases and tts-1-hd is optimized for quality. Pricing starts at $0.015 per input 1,000 characters. Check out our TTS guide to ... See more

New models and developer products announced at DevDay

Nicolay Gerold added

Even though the underlying model is no different than the usual GPT-4o, the addition of voice has a lot of implications. A voice-powered tutor works very differently than one that communicates via typing, for example. It can also speak many other languages providing new approaches to cross-cultural communication. And I have no doubt people will hav... See more

Ethan Mollick • On speaking to AI

MargaretC added

Voice will take it to a new level and might make use much more widespread

4. Introducing Stable LM 3B: Bringing Sustainable, High-Performance Language Models to Smart Devices

Stability AI introduced Stable LM 3B, a high-performing language model designed for smart devices. With 3 billion parameters, it outperforms state-of-the-art 3B models and reduces operating costs and power consumption. The model enables a broader ran... See more

This AI newsletter is all you need #68

Nicolay Gerold added

One major challenge is the inference time of the models. While models like ChatGPT have improved in speed, they still take time to process information. When dealing with a large number of agents, there can be significant latency in real-time interactions. Optimizations and fine-tuning will be necessary to make the models faster and more efficient. ... See more

Sandhya Hegde • Autonomous AI agents could change the world, but what do they actually do well?

Darren LI added

Actually, this is disputed, costs of API calls are crucial at this stage, due to simple, 1-sentence action roughly requires 40~75k tokens burn.

Imagine being able to have a language conversation about anything with a computer. This is now possible and available to many people for the first time with ChatGPT. In this episode we take a look at the consequences and some interesting insights from Open AI’s CEO Sam Altman.

ColdFusion • It’s Time to Pay Attention to A.I. (ChatGPT and Beyond)

Jason Shen added