
The Alignment Problem

The more principled path forward, Samuel reasoned, was for the computer itself to somehow generate strategic considerations on its own.
Brian Christian • The Alignment Problem
With machine imitators, too, we would do well to keep the theory of the second best in mind.
Nvidia mounted three cameras on a car, with one pointed forward and the others pointed roughly thirty degrees left and right of center. This generated hours and hours of footage of what it would look like if a car were pointed slightly in the wrong direction.
An “actor-critic” architecture, in which the “actor” half of the system would learn to take good actions, and the “critic” half would learn to predict future rewards.
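The division of labor described above can be sketched in a few lines. This is a minimal tabular actor-critic on a hypothetical one-state task (action 1 pays reward 1, action 0 pays nothing); the variable names and the toy task are illustrative, not from the book.

```python
import math
import random

rng = random.Random(0)
prefs = [0.0, 0.0]   # actor: action preferences, turned into a softmax policy
V = 0.0              # critic: running estimate of expected reward
ALPHA, BETA = 0.1, 0.1

def policy():
    """Softmax over the actor's preferences."""
    exps = [math.exp(p) for p in prefs]
    z = sum(exps)
    return [e / z for e in exps]

for _ in range(2000):
    pi = policy()
    a = 0 if rng.random() < pi[0] else 1
    r = 1.0 if a == 1 else 0.0   # toy task: only action 1 is rewarded
    delta = r - V                # critic's prediction error
    V += BETA * delta            # critic learns to predict reward
    # Actor nudges its policy toward actions the critic scored well.
    for b in range(2):
        grad = (1.0 if b == a else 0.0) - pi[b]
        prefs[b] += ALPHA * delta * grad
```

After training, the actor's policy concentrates on the rewarded action while the critic's estimate `V` approaches the reward it yields.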
Promisingly, he showed that Q-learning would always “converge”: as long as the system had the opportunity to try every action, from every state, as many times as necessary, it would eventually develop the perfect value function.
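The convergence condition in the highlight can be illustrated with a tabular Q-learning sketch. The environment here is a hypothetical 4-state chain (states 0–3, actions left/right, reward 1 for reaching state 3), chosen only to make the convergence visible; updates sample every state-action pair repeatedly, as the guarantee requires.

```python
import random

N_STATES, N_ACTIONS, GOAL = 4, 2, 3
GAMMA, ALPHA = 0.9, 0.5

def step(s, a):
    """Deterministic chain: action 0 moves left, action 1 moves right."""
    s2 = max(0, s - 1) if a == 0 else min(GOAL, s + 1)
    return s2, (1.0 if s2 == GOAL else 0.0)

Q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
rng = random.Random(0)

# Sample every (state, action) pair again and again -- the condition the
# convergence proof requires -- and apply the Q-learning update rule.
for _ in range(500):
    s = rng.randrange(GOAL)          # any non-terminal state
    a = rng.randrange(N_ACTIONS)
    s2, r = step(s, a)
    target = r if s2 == GOAL else r + GAMMA * max(Q[s2])
    Q[s][a] += ALPHA * (target - Q[s][a])
```

The values settle to the true optimum: moving right from states 2, 1, 0 converges to 1.0, 0.9, and 0.81 respectively (each step back from the goal discounts by GAMMA).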
A deeper challenge is that both traditional reward-based reinforcement learning and imitation-learning techniques require humans to act as sources of ultimate authority.
you also had to subtract progress away from the goal.
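The point about subtracting progress is the heart of potential-based reward shaping. A short sketch, with a hypothetical goal at position 10, shows why: if a bonus for moving toward the goal is matched by an equal subtraction for moving away, the bonuses telescope, and an agent cannot farm reward by oscillating.

```python
GAMMA = 1.0

def potential(state):
    # Higher potential the closer the state is to a goal at position 10.
    return -abs(10 - state)

def shaping_bonus(s, s2):
    # F(s, s') = gamma * phi(s') - phi(s): reward progress, subtract regress.
    return GAMMA * potential(s2) - potential(s)

# Walking forward two steps and then back two steps nets exactly zero bonus.
path = [0, 1, 2, 1, 0]
total = sum(shaping_bonus(a, b) for a, b in zip(path, path[1:]))
```

Each forward step earns +1 and each backward step costs −1, so the round trip sums to zero; drop the subtraction and looping would pay indefinitely.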
What if they aren’t trying to do anything whatsoever, and their actions reflect random behavior, nothing more?
There are a great many things in life that are very difficult to perform, but comparatively easy to evaluate.