The fact that most individual neurons are uninterpretable presents a serious roadblock to a mechanistic understanding of language models. We demonstrate a method for decomposing groups of neurons into interpretable features with the potential to move past that roadblock.

Sarah Wang What Builders Talk About When They Talk About AI | Andreessen Horowitz

Brian Christian The Alignment Problem

Sofia Quaglia How the brains of social animals synchronise and expand one another

swyx (X post) - https://x.com/swyx/status/1793064538650325472

Noah Smith Generative AI: autocomplete for everything

future.com How Recommendation Algorithms Actually Work | Future - https://future.com/forget-open-source-algorithms-focus-on-experiments-instead

dailynous.com Philosophers On GPT-3 (updated with replies by GPT-3)