Attention Is All You Need
Self-Attention at a High Level
Don’t be fooled by me throwing around the word “self-attention” like it’s a concept everyone should be familiar with. I had personally never come across the concept until reading the Attention is All You Need paper. Let us distill how it works.
Say the following sentence is an input sentence we want to translate:
” The ...
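The mechanism the snippet is building toward is scaled dot-product self-attention: every token is projected into a query, a key, and a value, and each output is an attention-weighted mix of the values. This toy NumPy sketch (dimensions and weight names are my own illustration, not from the paper) shows the core computation:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    # Project the same input into queries, keys, and values.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    # Scaled dot-product scores: how strongly each token attends to every other.
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ V                  # weighted mix of value vectors

# Toy example: 3 tokens, 4-dimensional embeddings, random weights.
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 4))
Wq, Wk, Wv = (rng.normal(size=(4, 4)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # one context-mixed vector per token
```

The scaling by the square root of the key dimension keeps the dot products from growing large and saturating the softmax, which is exactly the trick the paper names "scaled dot-product attention".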
Jay Alammar • The Illustrated Transformer
The first feature is strong meta-attention (attention of attention).
Chade-Meng Tan • Search Inside Yourself: Increase Productivity, Creativity and Happiness [ePub edition]
“The group said, Hey, we have these convolutional networks. They’ve been phenomenal at doing image classification. Um, what if we replace your feature-construction mechanism, which is still a bit of a kludge, by just a convolutional neural network?”
Brian Christian • The Alignment Problem
attention is a way of prioritizing and tuning sensory data
The Battle for Attention
DeepSeekV2 is a big deal, not only because of its significant improvements to both key components of the Transformer: the attention layer and the FFN layer.
It has also completely disrupted the Chinese LLM market, forcing competitors to drop their prices to 1% of the original.
⬇️ https://t.co/eDNeRHAzTp