What can LLMs never do?
Ironically, they can’t figure out where to focus their attention. Standard attention is static: it processes all parts of the sequence simultaneously, rather than using multiple heuristics to attend more selectively, reset the context dynamically, or try counterfactuals.
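To make the "processes everything simultaneously" point concrete, here is a minimal NumPy sketch of standard scaled dot-product attention (my own illustration, not from the post): the softmax assigns every position a nonzero weight, so no part of the sequence is ever skipped, re-read, or conditionally revisited.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Scaled dot-product attention: every query scores every key in one
    # fixed pass -- there is no mechanism to skip positions, reset the
    # context, or branch into a counterfactual reading.
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = softmax(scores, axis=-1)  # every position gets nonzero weight
    return weights @ V, weights

rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((4, 8)) for _ in range(3))
out, weights = attention(Q, K, V)
print(np.allclose(weights.sum(axis=-1), 1.0))  # each query's weights sum to 1
print(bool((weights > 0).all()))               # and all positions are attended to
```

Because softmax is strictly positive, every token always contributes something to every output, which is one way of seeing why the mechanism cannot choose to ignore or revisit parts of its input.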
This is because attention as it measures isn’t really a mul…