Large Language Models (LLMs)
by sari · updated 4mo ago
sari added 4mo ago
"the best use case of LLMs is bullshit"
sari added 4mo ago
Weird GPT token for Reddit user davidjl123, “a keen member of the /r/counting subreddit. He’s posted incremented numbers there well over 163,000 times. Presumably that subreddit ended up in the training data used to create the tokenizer used by GPT-2, and since that particular username showed up hundreds of thousands of times it ended up getting it…”
Johann Van Tonder added 5mo ago
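How a username posted that many times ends up with its own token can be sketched with a toy byte-pair-encoding (BPE) trainer. This is an illustrative sketch, not the actual GPT-2 tokenizer or training data: the corpus and merge count are made up, but the mechanism is the same — the most frequent adjacent pair of symbols keeps getting merged, so a string repeated often enough fuses into a single token.

```python
from collections import Counter

def train_bpe(corpus, num_merges):
    """Toy BPE trainer: repeatedly merge the most frequent
    adjacent symbol pair across the whole corpus."""
    # Each word starts as a tuple of single characters.
    words = Counter(tuple(w) for w in corpus.split())
    merges = []
    for _ in range(num_merges):
        # Count every adjacent symbol pair, weighted by word frequency.
        pairs = Counter()
        for word, freq in words.items():
            for pair in zip(word, word[1:]):
                pairs[pair] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Rewrite every word, fusing occurrences of the best pair.
        new_words = Counter()
        for word, freq in words.items():
            out, i = [], 0
            while i < len(word):
                if i + 1 < len(word) and (word[i], word[i + 1]) == best:
                    out.append(word[i] + word[i + 1])
                    i += 2
                else:
                    out.append(word[i])
                    i += 1
            new_words[tuple(out)] += freq
        words = new_words
    return merges, words

# A username repeated thousands of times next to ordinary text
# (hypothetical corpus, echoing the /r/counting story above).
corpus = "davidjl " * 1000 + "counting thread post " * 5
merges, vocab = train_bpe(corpus, 6)
print(("davidjl",) in vocab)  # → True: the whole username is one token
```

Because every pair inside the repeated username outnumbers every pair in the rest of the corpus, all six merges go to it, and after six merges the seven-character string is a single vocabulary symbol — the same dynamic that gave davidjl123 a dedicated token.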
Meta AI released LLaMA ... and they included a paper which described exactly what it was trained on. It was 5TB of data.
2/3 of it was from Common Crawl. It had content from GitHub, Wikipedia, ArXiv, StackExchange and something called “Books”.
What’s Books? 4.5% of the training data was books. Part of this was Project Gutenberg, which is public domain…
Johann Van Tonder added 5mo ago
"A key challenge of (LLMs) is that they do not come with a manual! They come with a “Twitter influencer manual” instead, where lots of people online loudly boast about the things they can do with a very low accuracy rate, which is really frustrating..."
Simon Willison, attempting to explain LLMs
Johann Van Tonder added 5mo ago
“A more practical answer is that it’s a file. This right here is a large language model, called Vicuna 7B. It’s a 4.2 gigabyte file on my computer. If you open the file, it’s just numbers. These things are giant binary blobs of numbers…”
Simon Willison, attempting to explain LLMs
Johann Van Tonder added 5mo ago
One way to think about (LLM) is that about 3 years ago, aliens landed on Earth. They handed over a USB stick and then disappeared. Since then we’ve been poking the thing they gave us with a stick, trying to figure out what it does and how it works.
Johann Van Tonder added 5mo ago
“The fact that these things model language is probably one of the biggest discoveries in history. That you (LLM) can learn language by just predicting the next word … — that’s just shocking to me.”
- Mikhail Belkin, computer scientist at the University of California, San Diego
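Belkin's point is that next-word prediction alone is the training signal. The crudest possible version of that idea, a bigram counter, makes the setup concrete (a hypothetical toy, many orders of magnitude simpler than an LLM, but trained on exactly the same objective: given a word, predict the next one):

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """For every word, count which word follows it and how often."""
    words = text.split()
    counts = defaultdict(Counter)
    for cur, nxt in zip(words, words[1:]):
        counts[cur][nxt] += 1
    return counts

def predict_next(counts, word):
    """'Generate' by returning the most frequent continuation."""
    return counts[word].most_common(1)[0][0]

# Tiny made-up corpus: "the" is followed by "cat" twice, "mat" once.
counts = train_bigrams("the cat sat on the mat and the cat ran away")
print(predict_next(counts, "the"))  # → cat
```

An LLM replaces the lookup table with billions of learned parameters and conditions on far more than one preceding word, but the shocking discovery Belkin describes is that scaling up this same predict-the-next-word objective is enough to model language.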