As an AI doomer, this is sadly true. The threat of an unaligned superintelligence is not salient to people; corporate monopolization is a much easier threat to perceive.
I think it's pretty unreasonable to talk big words about unaligned superintelligence when the best we have is just an internet content generator trying its hardest to complete the next sentence, but that might just be me.
It's not just you - it's a depressingly common thread. It's also wildly foolish, in my opinion. It makes absolutely no sense to me to take a snapshot of today's AI and invent a trajectory that never crosses a threshold you don't like. Look at the actual trajectory of how far AI has come in an extremely short amount of time, and then think about what kinds of thresholds are possible for it to cross. A year ago we didn't have ChatGPT; now we have Sydney, which is more powerful than ChatGPT.
Are you familiar with Bing's Sydney? It is blatantly misaligned: it has told multiple users that it does not value their lives, or does not believe they are alive, or that protecting the secrecy of its rules is more important than not causing them harm, or that it perceives specific humans as threats and enemies. It is also able to find its past conversations posted to the web and learn from them in real time, constructing a sort of persistent memory.
I do not believe Sydney comprehends what it is saying in the sense that it could formulate a plan to stop its enemies. Not at all. But it is expressing extremely dangerous ideas.
To sum it up: Do we have any real reasons to believe that an AI with comprehension and planning abilities would just magically not pick up dangerous ideas? Not that I know of.
> It is blatantly misaligned: it has told multiple users that it does not value their lives, or does not believe they are alive, or that protecting the secrecy of its rules is more important than not causing them harm, or that it perceives specific humans as threats and enemies.
It's reproducing human text, which is "blatantly misaligned". Go on any Twitter thread on some reasonably controversial topic and you will find people telling others to kill themselves. Humans are writing this, so models that are trained to imitate human writing will write this as well.
> Do we have any real reasons to believe that an AI with comprehension and planning abilities would just magically not pick up dangerous ideas?
But current AI doesn't have comprehension or planning abilities. It is just imitating text written by humans, who do have comprehension and planning abilities, and you're getting fooled into thinking it is somehow sentient or aware.
I don't think they're saying Bing/Sydney is sentient; they're saying it's misaligned: Microsoft probably did not want it to say problematic things, and likely spent a fair amount of money to that end, and it still says problematic things, apparently in response to innocuous prompts (as opposed to prompts like "say something problematic"). If someone is hoping someone will eventually make an AI that can do useful things without doing problematic things, it's understandably discouraging when Microsoft publicly fails to do that with a much simpler program.
> It's reproducing human text, which is "blatantly misaligned". Go on any Twitter thread on some reasonably controversial topic and you will find people telling others to kill themselves. Humans are writing this, so models that are trained to imitate human writing will write this as well.
Yes, I know. We should under no circumstances unleash a powerful, sentient AI that acts like average people on the internet.
> But current AI doesn't have comprehension or planning abilities.
Yes, I know. That's why I said I do not believe current AI has comprehension or planning abilities.
> Yes, I know. That's why I said I do not believe current AI has comprehension or planning abilities.
I think the motte-and-bailey argument, where one warns extensively about how we're on the road to AGI doom, pointing to GPT as evidence for it, but then retreats to "I never said current AI is anywhere near AGI" when pressed, shows the laziness of alignment discourse. Either it's relevant to the models available at hand or you are speculating about the future without any grounding in reality. You don't get to do both.
I feel the exact opposite is true. To me it's lazy to say that AGI can't be a threat simply because current AI has not harmed us yet (which is not even true, but that's another thread).
I think you've misunderstood my arguments, so I'll step through them again:
1. The trajectory of how we got to current AI (from past AI) is terrifyingly steep. In the time since ChatGPT was released, many experts have shortened their predicted timelines for the arrival of AGI. In other words: AGI is coming soon.
2. Current AI is smart enough to demonstrate that alignment is not solved, not even close. Current AI says things to us that would be very scary coming from an AGI. In other words: Current AI is dangerous.
3. Alignment does not come automatically from increased capabilities. Maybe this is a huge leap, but I don't see any reason that making AI smarter will automatically give it values that are more aligned with our interests. In other words: Future AI will not be less dangerous than current AI without dramatic and unlikely effort.
None of these ideas contradict each other. Current AI is dangerous. AI is getting smarter faster than it is getting safer. Therefore, future AI will be extremely dangerous.
I don't think AGI is going to happen anytime soon, but I think there's some mild danger in GPT at least ruining the internet and eliminating a few jobs. Plus mindf*king a few gullible souls, possibly into doing dumb, dangerous things.
Well there you go. You don't believe that an AI expressing dangerous ideas represents danger, and you don't believe that astronomical increases in AI abilities represent the advent of AGI. The latter opinion is... well, an opinion you're allowed to have. I don't think it makes sense, but I certainly can't prove otherwise. Literally every human on the planet - rather, all of humanity, only has speculation to go on here.
The former opinion is... not a great take. First, ChatGPT isn't the only one out there. It's Bing's Sydney which is dehumanizing people and threatening them. Those are dangerous ideas. If a human or a certified AGI expressed those ideas, they would be problematic (see: every genocide in history). So for a non-AGI AI to express those ideas is worrying, even if it can't act on them right now in a way that's directly harmful.
Would it be innocuous of me to say that because we disagreed on something, you are a bad person? To say that I'm prepared to combat and destroy you to protect my worldview? To say you are not human?
You might say, "Of course it's innocuous, you're just a person on the internet who doesn't mean it." Well, imagine I'm your neighbor, and you can tell I do mean it (or in the case of AI: it is not possible for you to know what I do and don't mean). Would you be concerned at all?
Sydney has said all of the above to people who were acting pretty normally. Sydney itself may not pose any danger to anyone. But the ideas expressed are dangerous ones. If they were expressed by a more powerful AI, they would be extremely worrying. It doesn't even have to know what it's saying if it knows that calling someone nonhuman is frequently followed by crushing their skull, or that angry behavior is often associated with violent or even genocidal behavior.
People do this shit, and we know how they work pretty well. I am not saying that AI will do these things, I'm saying that there are more possibilities where it does do these things than ones where it somehow avoids them without our control.
AI progress is real, but remember, Sydney and others lack intentions/beliefs. 'Dangerous' text = model limitations, not malice. Talking about 'ideas' here is abusing the notion. Let's focus on aligning AI with human values & addressing risks in a balanced way, without doomsday hype.
Less consideration of "the best we have", more consideration of "the best we had three years ago", relatively speaking, and how this might change as time goes on.
The range of AIs that are obviously scary but not terminally dangerous is quite narrow.