Yes, if we managed to program human values into this system, that would essentially solve the problem. But the task is formidable. We don't really have a good definition of human values, we disagree about a lot of things, values change over time (many values held in antiquity are disastrous from today's point of view), and we have yet to develop ways to put them into software. But I certainly hope that we overcome this challenge.

Chrisw wrote:
    home_ wrote:
        This doesn't solve the problem. If we make an AI this way, it wouldn't mean that it would respect human values. If the AI wanted to achieve some goal (regardless of whether that goal was pre-programmed or an emergent property), it could still destroy humans because we'd be standing in the way, even if only temporarily.
    But we aren't born with concrete goals about what we want to achieve in the world. We choose our goals and can change them at will. We constantly evaluate our goals, not by reference to other goals but according to how they make us feel at a much more primitive level. And this includes moral feelings and emotions.
It would respect the values we programmed it to have. Values are more resilient than goals, or at least they are in humans, and thus in creatures designed to resemble humans in this respect.
Another way of dealing with the problem would be to develop advanced math that would ensure the AI system stays within chosen constraints. Math doesn't change: 2+2=4 for chimps and for humans, and it will also hold for AI.
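As a toy illustration of the kind of observer-independent, machine-checkable fact this paragraph appeals to, a proof assistant such as Lean can verify simple arithmetic mechanically. This is only an analogy for the idea of mathematically guaranteed constraints, not anything like a safety proof for an AI system:

```lean
-- In Lean 4, this trivial fact is checked by the proof kernel:
-- both sides reduce to the same numeral, so reflexivity (rfl) closes the goal.
example : 2 + 2 = 4 := rfl
```

Whether constraints on a powerful optimizer can ever be pinned down with this kind of rigor is exactly the open question under discussion.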
Yes, if it has limited power then the problems are manageable. But ultimately the point of making AI is to make strong AI, not weak AI. We have lots of weak and limited AI today.

Chrisw wrote:
    It's OK if it has limited power. You get into problems when you start allowing it to modify itself. The idea that it could modify itself in accordance with its goal implies that its goal is somehow separate from what it is. But in that case, why wouldn't it just change the goal to match reality?
And why would a strong AI want to change its goal? If the goal were changed, it would be unlikely to be met, and from the point of view of the current goal that outcome would be disastrous. Having a goal thus implies preserving that goal.