🕳️ Attention Sinks in LLMs for endless fluency
- There is often a notable gap between state-of-the-art research and what practitioners can reasonably use. However, I'm glad to say that attention sinks can be added to any pretrained LLM at almost no additional effort.
I have released the attention_sinks Python module, which acts as a drop-in replacement for the transformers API. This Python module…
from 🕳️ Attention Sinks in LLMs for endless fluency by Tom Aarsen
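The core idea behind attention sinks is a KV-cache eviction policy: keep the first few "sink" tokens plus a sliding window of the most recent tokens, discarding everything in between. A minimal sketch of that policy in plain Python (the function name `evict` and the parameters `sink_size` and `window_size` are illustrative, not the attention_sinks module's actual API):

```python
def evict(cache, sink_size=4, window_size=1020):
    """Apply the attention-sink eviction policy to a KV cache.

    Keeps the first `sink_size` entries (the attention sinks) plus the
    `window_size` most recent entries; everything in between is dropped.
    `cache` stands in for a per-layer list of cached key/value entries.
    """
    if len(cache) <= sink_size + window_size:
        return cache  # still within budget, nothing to evict
    return cache[:sink_size] + cache[-window_size:]


# Example with token positions as stand-ins for KV entries:
# positions 0-1 are kept as sinks, 7-9 as the recent window.
print(evict(list(range(10)), sink_size=2, window_size=3))
```

This keeps memory usage constant during generation while preserving the initial tokens that the model's attention heavily relies on, which is what allows fluency to persist over very long sequences.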