AI Safety Atlas

"Aligned to whom?" remains a fundamental question with no consensus answer. Should AI systems align to the immediate operator (Christiano, 2018), the system designer (Gil, 2023), a specific group of humans, humanity as a whole (Miller, 2022), objective ethical principles, or the operator's hypothetical informed preferences? There are no agreed upon... See more