Large Language Models (LLMs)

sari and

Language models can only write poetry

Allison Parrish posts.decontextualize.com

Isabelle Levent

Talking about large language models

arxiv.org

sari

Large language models, explained with a minimum of math and jargon

Sean Trott understandingai.org

Darren LI and

“A more practical answer is that it’s a file. This right here is a large language model, called Vicuna 7B. It’s a 4.2 gigabyte file on my computer. If you open the file, it’s just numbers. These things are giant binary blobs of numbers…”
Simon Willison, attempting to explain LLM

Johann Van Tonder

An Interview With Daniel Gross and Nat Friedman about the Democratization of AI

Daniel Gross stratechery.com

sari

Meta AI released LLaMA ... and they included a paper which described exactly what it was trained on. It was 5TB of data.

2/3 of it was from Common Crawl. It had content from GitHub, Wikipedia, ArXiv, StackExchange and something called “Books”.

What’s Books? 4.5% of the training data was books. Part of this was Project Gutenberg, which is public dom

Johann Van Tonder

There’s a lot of hype around AI, and in particular, Large Language Models (LLMs). To be blunt, a lot of that hype is just some demo bullshit that would fall over the instant anyone tried to use it for a real task that their job depends on. The reality is far less glamorous: it’s hard to build a real product backed by an LLM.

Phillip Carter • All the Hard Stuff Nobody Talks About When Building Products With LLMs

sari

Using LLM products today feels a lot like using early cars in the 1800s: clearly magical, clearly going to change the world, and really hard to drive.

The first cars didn’t have steering wheels (they hadn’t been invented yet), so you’d steer them with a big lever called a tiller. The problem with tillers is that they are imprecise, which made drivin...

sari

"A key challenge of (LLMs) is that they do not come with a manual! They come with a “Twitter influencer manual” instead, where lots of people online loudly boast about the things they can do with a very low accuracy rate, which is really frustrating..."

Simon Willison, attempting to explain LLM

Johann Van Tonder

ChatGPT Is a Blurry JPEG of the Web

Ted Chiang newyorker.com

sari