AI Safety

AI Safety Atlas

which_humans_09222023.pdf

LLM outputs are compared with “human” performance, but which “humans”? Current LLMs most closely resemble people from Western, educated, industrialised, rich and democratic (WEIRD) societies, and resemble other populations less.

Adversarial Poetry as a Universal Single-Turn Jailbreak Mechanism in Large Language Models

arxiv.org

Empire of AI by Karen Hao Book Summary

summrize.com

AI Lab Watch

ailabwatch.org

Scored risk assessment of AI companies

AINews

news.smol.ai

AI news summarised daily

Build an illustrated “map” of the AI safety territory - philosophical, technical and economic considerations