Excerpt from a video I just posted on YouTube (about the multilayer perceptron layers in a transformer, and how LLMs may store facts). https://t.co/HDNWzX6AxX
Excellent explanation of RoPE embeddings, from scratch, with all the math needed: https://t.co/Y1LlEAJrtj
And with beautiful 3Blue1Brown-style animation: https://t.co/9x8JdRWAww
Original RoPE paper: https://t.co/3Qv0gN82H9