Workers AI: serverless GPU-powered inference on Cloudflare’s global network
Our container platform is in production. It has GPUs. Here’s an early look
Thomas Lefebvre・blog.cloudflare.com
November 13, 2023・By Together
The Together Inference Engine is multiple times faster than any other inference service, with 117 tokens per second on Llama-2-70B-Chat and 171 tokens per second on Llama-2-13B-Chat
Today we are announcing Together Inference Engine, the world’s fast...
Announcing Together Inference Engine – the fastest inference available
Nicolay Gerold added
The human-centric platform for production ML & AI
Access data easily, scale compute cost-efficiently, and ship to production confidently with fully managed infrastructure, running securely in your cloud.
Infrastructure for ML, AI, and Data Science | Outerbounds
Nicolay Gerold added
Setting up the necessary machine learning infrastructure to run these big models is another challenge. We need a dedicated model server for running model inference (using frameworks like Triton or vLLM), powerful GPUs to run everything robustly, and configurable servers to ensure high throughput and low latency. Tuning the in...
Developing Rapidly with Generative AI
Nicolay Gerold added
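To make the serving layer mentioned in that snippet concrete, here is a minimal sketch using vLLM's offline API. The model name, prompts, and sampling settings are illustrative assumptions, not from the article, and a real deployment would put this behind an HTTP server rather than a script.

```python
# Minimal sketch of the model-server piece using vLLM's offline API.
# Assumptions: vLLM is installed and a GPU is available; the model
# name below is an illustrative choice, not one from the article.
from vllm import LLM, SamplingParams

llm = LLM(model="meta-llama/Llama-2-13b-chat-hf")

# Throughput largely comes from batching many prompts per generate() call.
params = SamplingParams(temperature=0.7, max_tokens=256)
prompts = [
    "Explain continuous batching in one sentence.",
    "Why do LLM servers need high GPU memory bandwidth?",
]
outputs = llm.generate(prompts, params)
for out in outputs:
    print(out.outputs[0].text.strip())
```

In production this offline loop would typically be replaced by a long-running server (for example vLLM's OpenAI-compatible HTTP server), so throughput and latency can be tuned independently of application code.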
The Most Affordable Cloud for AI/ML Inference at Scale
Deploy AI/ML production models without headaches on the lowest priced GPUs (starting from $0.02/hr) in the market. Get 10X-100X more inferences per dollar compared to managed services and hyperscalers.
Salad - GPU Cloud | 10k+ GPUs for Generative AI
Nicolay Gerold added
Replit AI is now free for all users. Over the past year, we’ve witnessed the transformative power of building software collaboratively with the power of AI. We believe AI will be part of every software developer’s toolkit and we’re excited to provide Replit AI for free to our 25+ million developer community.
To accompany AI for all, we’re releasin...
Replit’s new AI Model now available on Hugging Face
Nicolay Gerold added
Deploying a Generative AI model requires more than a VM with a GPU. It normally includes the following (a minimal sketch follows this card):
- Container Service: Most often Kubernetes to run LLM serving solutions like Hugging Face Text Generation Inference or vLLM.
- Compute Resources: GPUs for running models, CPUs for management services
- Networking and DNS: Routing traffic to the appropriate service
Understanding the Cost of Generative AI Models in Production
Nicolay Gerold added
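As a concrete sketch of the first two items in that list, here is how the official Kubernetes Python client can declare a GPU-backed serving Deployment. The names, namespace, image, and GPU count are illustrative assumptions, not details from the article.

```python
# Sketch: declaring the "Container Service" + "Compute Resources" pieces
# with the official Kubernetes Python client. Names, namespace, image,
# and GPU count are illustrative assumptions.
from kubernetes import client, config

config.load_kube_config()  # or load_incluster_config() inside a cluster

container = client.V1Container(
    name="llm-server",
    image="vllm/vllm-openai:latest",     # assumed serving image
    ports=[client.V1ContainerPort(container_port=8000)],
    resources=client.V1ResourceRequirements(
        limits={"nvidia.com/gpu": "1"},  # request one GPU per replica
    ),
)
deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="llm-server"),
    spec=client.V1DeploymentSpec(
        replicas=1,
        selector=client.V1LabelSelector(match_labels={"app": "llm-server"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "llm-server"}),
            spec=client.V1PodSpec(containers=[container]),
        ),
    ),
)
client.AppsV1Api().create_namespaced_deployment(
    namespace="default", body=deployment
)
```

The third item, Networking and DNS, would be covered by an additional Service and Ingress object routing external traffic to port 8000 of these pods.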
Sonya Huang • Generative AI’s Act Two
Darren LI added