
Understanding LLMs from Scratch Using Middle School Math

LLMs are extremely good at a type of knowledge called tacit knowledge (knowledge about something as it pertains to something else) but are extremely poor at pretty much any other type as far as I can see. It just so happens that tacit knowledge is also the type of knowledge that drives natural language, so it makes them look super smart.
Column: These Apple researchers just showed that AI bots can't think, and possibly never will — Apple’s AI researchers gave these AI systems a simple arithmetic problem that schoolkids can solve. The bots flunked.
a couple off the top of my head:
- LLM in the loop with preference optimization
- synthetic data generation
- cross modality "distillation" / dictionary remapping
- constrained decoding (see the sketch below)
r/MachineLearning • Reddit
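To make the last item concrete, here is a minimal sketch of constrained decoding, assuming a toy vocabulary and greedy selection (the function name `constrained_step`, the vocabulary, and the logit values are illustrative, not from the thread): logits for tokens outside an allowed set are masked to negative infinity before the softmax, so decoding can only pick permitted tokens.

```python
# Minimal constrained-decoding sketch. All names and values here are
# hypothetical toy examples, not taken from the quoted thread.
import math

def softmax(logits):
    # Numerically stable softmax over a list of raw scores.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def constrained_step(logits, allowed_ids):
    # Mask every token id not in the allowed set to -inf,
    # so it gets zero probability after the softmax.
    masked = [x if i in allowed_ids else float("-inf")
              for i, x in enumerate(logits)]
    probs = softmax(masked)
    # Greedy pick among the surviving tokens.
    return max(range(len(probs)), key=probs.__getitem__)

if __name__ == "__main__":
    vocab = ["yes", "no", "maybe", "banana"]
    logits = [1.2, 0.4, 2.0, 3.5]   # toy model scores for one step
    allowed = {0, 1}                # constraint: answer must be yes/no
    choice = constrained_step(logits, allowed)
    print(vocab[choice])            # prints "yes", even though "banana" scored highest
```

Real systems apply the same per-step logit mask against richer constraints, such as a grammar or JSON schema; the mechanism is identical.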
Because if you strip LLMs to their essence, they are just a much better way of using statistics to aggregate human intelligence and connect everything we’ve all done together to get more use out of it.
Sari Azout • Becoming unLLMable
LLMs absorb superhuman quantities of information at training time.
Timothy B. Lee • Why large language models struggle with long contexts
One way to think about LLMs is that about 3 years ago, aliens landed on Earth. They handed over a USB stick and then disappeared. Since then we’ve been poking the thing they gave us with a stick, trying to figure out what it does and how it works.