Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

At the end of the day, we still don't know what exactly happened and probably, never will. However, it seems clear there was a rift between Rapid Commercialization (Team Sam) and Upholding the Original Principles (Team Helen/Ilya). I think the tensions were brewing for quite a while, as it's evident from an article written even before GPT-3 [1].

> Over time, it has allowed a fierce competitiveness and mounting pressure for ever more funding to erode its founding ideals of transparency, openness, and collaboration

Team Helen acted in panic, but they believed they would win since they were upholding the principles the org was founded on. But they never had a chance. I think only a minority of the general public truly cares about AI Safety, the rest are happy seeing ChatGPT helping with their homework. I know it's easy to ridicule the sheer stupidity the board acted with (and justifiably so), but take a moment to think of the other side. If you truly believed that Superhuman AI was near, and it could act with malice, won't you try to slow things down a bit?

Honestly, I myself can't take the threat seriously. But, I do want to understand it more deeply than before. Maybe, it isn't without substance as I thought it to be. Hopefully, there won't be a day when Team Helen gets to say, "This is exactly what we wanted to prevent."

[1]: https://www.technologyreview.com/2020/02/17/844721/ai-openai...



What the general public thinks is irrelevant here. The deciding factor was the staff mutiny, without which the organization is an empty shell. And the staff sided with those who aim for rapid real world impact, with directly affects their career and stock options etc.

It's also naive to think it was a struggle for principles. The rapid commercialization vs. principles is what the actors claim to rally their respective troops, in reality it was probably a naked power grab, taking advantage of the weak and confuse org structure. Quite an ill prepared move, the "correct" way to oust Altman was to hamstring him in the board and enforce a more and more ceremonial role until he would have quit by himself.


> deciding factor was the staff mutiny

The staff never mutinied. They threatened to mutiny. That's a big difference!

Yesterday, I compared these rebels to Shockley's "traitorous eight" [1]. But the traitorous eight actually rebelled. These folk put their name on a piece of paper, options and profit participation units safely held in the other hand.

[1] https://news.ycombinator.com/item?id=38348123


Not only that, consider the situation now, where Sam has returned as CEO. The ones who didn't sign will have some explaining to do.

The safest option was to sign the paper, once the snowball started rolling. There was nothing much to lose, and a lot to gain.


People have families, mortgages, debt, etc. Sure, these people are probably well compensated, but it is ludicrous to state that everyone has the stability that they can leave their job at a moment's notice because the boss is gone.


Didn’t they all have offers at Microsoft?


I think not at the time they would have signed the letter? Though it's hard to keep up with the whirlwind of news.


They didn't actually leave, they just signed the pledge threatening to. Furthermore, they mostly signed after the details of the Microsoft offer were revealed.


I think you are downplaying the risk they took significantly, this could have easily gone the other way.

Stock options usually have a limited time window to exercise, depending on their strike price they could have been faced with raising a few hundred thousand in 30 days, to put into a company that has an uncertain future, or risk losing everything. The contracts are likely full of holes not in favor of the employees, and for participating in an action that attempted to bankrupt their employer there would have been years of litigation ahead before they would have seen any cent. Not because OpenAI would have been right to punish them, but because it could and the latent threat to do it is what keeps people in line.


The board did it wrong. If you are going to fire a CEO, then do it quickly, but:

1. Have some explanation

2. Have a new CEO who is willing and able to do the job

If you can't do these things, then you probably shouldn't be firing the CEO.


Or (3), shut down the company. OpenAI's non-profit board had this power! They weren't an advisory committee, they were the legal and rightful owner of its for-profit subsidiary. They had the right to do what they wanted, and people forgetting to put a fucking quorum requirement into the bylaws is beyond abysmal for a $10+ billion investment.

Nobody comes out of this looking good. Nobody. If the board thought there was existential risk, they should have been willing to commit to it. Hopefully sensible start-ups can lure people away from their PPUs, now evident for the mockery they always were. It's beyond obvious this isn't, and will never be, a trillion dollar company. That's the only hope this $80+ billion Betamax valuation rested on.

I'm all for a comedy. But this was a waste of everyones' time. At least they could have done it in private.


It's the same thing, really. Even if you want to shut down the company you need a CEO to shut it down! Like John Ray who is shutting down FTX.

There isn't just a big red button that says "destroy company" in the basement. There will be partnerships to handle, severance, facilities, legal issues, maybe lawsuits, at the very least a lot of people to communicate with. Companies don't just shut themselves down, at least not multi billion dollar companies.


You’re right. But in an emergency, there is a close option which is to put the company into receivership and hire an outside law firm to advise. At that point, the board becomes the executive council.


I think this is an oversimplification and that although the decel faction definitely lost, there are still three independent factions left standing:

https://news.ycombinator.com/edit?id=38375767

It will be super interesting to see the subtle struggles for influence between these three.


Adam is likely still on the "decel" faction (although it's unclear whether this is an accurate representation of his beliefs) so I wouldn't really say they lost yet.

I'm not sure what faction Bret and Larry will be on. Sam will still have power by virtue of being CEO and aligned with the employees.


> If you truly believed that Superhuman AI was near, and it could act with malice, won't you try to slow things down a bit?

No, if OpenAI is reaching singularity, so are Google, Meta, and Baidu etc. so proper course of action would be to loop in NSA/White House. You'll loop in Google, Meta, MSFT and will start mitigation steps. Slowing down OpenAI will hurt the company if assumption is wrong and won't help if it is true.

I believe this is more a fight of ego and power than principles and direction.


> so proper course of action would be to loop in NSA/White House

Eh? That would be an awful idea. They have no expertise on this and government institutions like thus are misaligned with the rest of humanity by design. E.g. NSA recruits patriots and has many systems, procedures and cultural aspects in place to ensure it keeps up its mission of spying on everyone.


And Google, Facebook, MSFT, Apple, are much more misaligned.


>Slowing down OpenAI will hurt the company if assumption is wrong and won't help if it is true.

Personally as I watched the nukes be lobbed I'd rather not be the person who helped lob them. And hope to god others look at the same problem (a misaligned AI that is making insane decisions) with the exact same lens. It seems to have worked for nuclear weapons since WW2, one can that we learned a lesson there as a species.

The Russian Stanislav Petrov who saved the world comes to mind."Well the Americans have done it anyways" was the motivation and he didn't launch. The cost of error was simply too great.


This is a coherent narrative, but it doesn't explain the bizarre and aggressively worded initial press release.

Things perhaps could've been different if they'd pointed to the founding principles / charter and said the board had an intractable difference of opinion with Sam over their interpretation, but then proceeded to thank him profusely for all the work he'd done. Although a suitable replacement CEO out the gate and assurances that employees' PPUs would still see a liquidity event would doubtless have been even more important than a competent statement.

Initially I thought for sure Sam had done something criminal, that's how bad the statement was.


Apparently the FBI thought he'd done something wrong too, because they called up the board to start an investigation but they didn't have anything.

https://x.com/nivi/status/1727152963695808865?s=46


The FBI doesn't investigate things like this on their own, and they definitely do not announce them in the press. The questions you should be asking are (1) who called in the FBI and has the clout to get them to open an investigation into something that obviously has 0% chance of being a federal felony-level crime worth the FBI's time, and (2) who then leaked that 'investigation' to the press?


Sorry, the SDNY. They do do things on their own. I expect the people they called leaked it.


The FBI is not mentioned in that tweet. We don't need to telephone game anonymous leaks that are already almost certainly self-serving propaganda.


For all the talk about responsible progress, the irony of their inability to align even their own incentives in this enterprise deserves ridicule. It's a big blow to their credibility and questions whatever ethical concerns they hold.


It's fear driven as much as moral, which in an emotional humans brain tends to triggers personal ambition to solve it ASAP. A more rational one would realize you need more than just a couple board members to win a major ideological battle.

At a minimum something that doesn't immediately result in a backlash where 90% of the engineers most responsible for recent AI dev want you gone, when you're whole plan is to control what those people do.


Alignment is considered an extremely hard problem for a reason. It's already nigh impossible when you're dealing with humans.

Btw: do you think ridicule eould be helpful here?


I can see how ridicule of this specific instance could be the best medicine for an optimal outcome, even by a utilitarian argument, which I generally don't like to make by the way. It is indeed nigh impossible, which is kind of my point. They could have shown more humility. If anything, this whole debacle has been a moral victory for e/acc, seeing how the brightest of minds are at a loss dealing with alignment anyway.


I don't understand how the conclusion of this is "so we should proceed with AI" rather than "so we should immediately outlaw all foundation model training". Clearly corporate self-governance has failed completely.


Ok, serious question. If you think the threat is real, how are we not already screwed?

OpenAI is one of half a dozen teams [0] actively working on this problem, all funded by large public companies with lots of money and lots of talent. They made unique contributions, sure. But they're not that far ahead. If they stumble, surely one of the others will take the lead. Or maybe they will anyway, because who's to say where the next major innovation will come from?

So what I don't get about these reactions (allegedly from the board, and expressed here) is, if you interpret the threat as a real one, why are you acting like OpenAI has some infallible lead? This is not an excuse to govern OpenAI poorly, but let's be honest: if the company slows down the most likely outcome by far is that they'll cede the lead to someone else.

[0]: To be clear, there are definitely more. Those are just the large and public teams with existing products within some reasonable margin of OpenAI's quality.


> If you think the threat is real, how are we not already screwed?

That's the current Yudkowsky view. That it's essentially impossible at this point and we're doomed, but we might as well try anyway as its more "dignified" to die trying.

I'm a bit more optimistic myself.


I don't know. I think being realistic, only OpenAI and Google have the depth and breadth of expertise to develop general AI.

Most of the new AI startups are one trick ponies obsessively focused on LLM's. LLM's are only one piece of the puzzle.


Anthropic is made up of former top OpenAI employees, has similar funding, and has produced similarly capable models on a similar timeline. The Claude series is neck and neck with GPT.


I would add Meta to this list, in particular because Yann LeCun is the most vocal critic of LLM one-ponyism.


The risk/scenario of singularity is that there will be just one winner and they will be able to prevent everyone else from building their own agi


I feel like the "safety" crowd lost the PR battle, in part, because of framing it as "safety" and over-emphasizing on existential risk. Like you say, not that many people truly take that seriously right now.

But even if those types of problems don't surface anytime soon, this wave of AI is almost certainly going to be a powerful, society-altering technology; potentially more powerful than any in decades. We've all seen what can happen when powerful tech is put in the hands of companies and a culture whose only incentives are growth, revenue, and valuation -- the results can be not great. And I'm pretty sure a lot of the general public (and open AI staff) care about THAT.

For me, the safety/existential stuff is just one facet of the general problem of trying to align tech companies + their technology with humanity-at-large better than we have been recently. And that's especially important for landscape-altering tech like AI, even if it's not literally existential (although it may be).


No one who wants to capitalize on AI appears to take it seriously. Especially how grey that safety is. I'm not concerned AI is going to nuke humanity, I'm more concerned it'll re-enforce racism, bias, and the rest of human's irrational activities because it's _blindly_ using existing history to predict future.

We've seen it in the past decade in multiple cases. That's safety.

The decision that the topic discusses means Business is winning, and they absolutely will re-enforce the idea that the only care is that these systems allow them to re-enforce the business cases.

That's bad, and unsafe.


> Like you say, not that many people truly take that seriously right now.

Eh? Polls on the matter show widespread public support for a pause due to safety concerns.


> I think only a minority of the general public truly cares about AI Safety, the rest are happy seeing ChatGPT helping with their homework

Not just the public, but also the employees. I doubt there are more than a handful of employees who care about AI Safety.


the team is mostly e/acc

so you could say they intentionally don't see safety as the end in itself, although I wouldn't quite say they don't care.


Nah, a number do, including Sam himself and the entire leadership.

They just have different ideas about one or more of: how likely another team is to successfully charge ahead while ignoring safety, how close we are to AGI, how hard alignment is.


One funny thing about this mess is that "Team Helen" has never mentioned anything about safety, and Emmett said "The board did not remove Sam over any specific disagreement on safety".

The reason everyone thinks it's about safety seems largely because a lot of e/acc people on Twitter keep bringing it up as a strawman.

Of course, it might end up that it really was about safety in the end, but for now I still haven't seen any evidence. The story about Sam trying to get board control and the board retaliating seems more plausible given what's actually happened.


>The story about Sam trying to get board control and the board retaliating seems more plausible given what's actually happened.

What story? Any link?


I am still a bit puzzled that it is so easy to turn a non-profit into a for profit company. I am sure everything they did is legal, but it feels like it shouldn't be. Could Médecins Sans Frontières take in donations and then take that money to start a for profit hospital for plastics surgery? And the profits wouldn't even go back to MSF, but instead somehow private investors will get the profits. The whole construct just seems wrong.


I think it actually isn't that easy. Compared to your example, the difference is that OpenAI's for-profit is getting outside money from Microsoft, not money from non-profit OpenAI. Non-profit OpenAI is basically dealing with for-profit OpenAI as a external partner that happens to be aligned with their interests, paying the expensive bills and compute, while the non-profit can hold on to the IP.

You might be able to imagine a world where there was an external company that did the same thing as for-profit OpenAI, and OpenAI nonprofit partnered with them in order to get their AI ideas implemented (for free). OpenAI nonprofit is basically getting a good deal.

MSF could similarly create an external for-profit hospital, funded by external investors. The important thing is that the nonprofit (donated, tax-free) money doesn't flow into the forprofit section.

Of course, there's a lot of sketchiness in practice, which we can see in this situation with Microsoft influencing the direction of nonprofit OpenAI even though it shouldn't be. I think there would have been real legal issues if the Microsoft deal had continued.


> The important thing is that the nonprofit (donated, tax-free) money doesn't flow into the forprofit section.

I am sure that is true. But the for-profit uses IP that was developed inside of the non-profit with (presumably) tax deductible donations. That IP should be valued somehow. But, as I said, I am sure they were somehow able to structure it in a way that is legal, but it has an illegal feel to it.


Well, if it aligned with their goals, sure I think.

Let's make the situation a little different. Could MSF pay a private surgery with investors to perform reconstruction for someone?

Could they pay the surgery to perform some amount of work they deem aligns with their charter?

Could they invest in the surgery under the condition that they have some control over the practices there? (Edit - e.g. perform Y surgeries, only perform from a set of reconstructive ones, patients need to be approved as in need by a board, etc)

Raising private investment allows a non profit to shift cost and risk to other entities.

The problem really only comes when the structure doesn't align with the intended goals - which is something distinct to the structure, just something non profits can do.


The non-profit wasn't raising private investment.


Nothing I've said suggests that or requires that.


Apologies, I mistook this:

"Raising private investment allows a non profit to shift cost and risk to other entities."

for a suggestion of that.


Not sure if you're asking a serious question about MSF but it's interesting anyways - when these types of orgs are fundraising for a specific campaign, say Darfur, then they can NOT use that money for any other campaign, say for ex Turkey earthquake.

That's why they'll sometimes tell you to stop donating. That's here in EU at least (source is a relative who volunteers for such an org).


Not sure what your point is, but you can make a donation to MSF that is not tied to any specific cause.


> it seems clear there was a rift between Rapid Commercialization (Team Sam) and Upholding the Original Principles (Team Helen/Ilya)

Is it? Why was the press release worded like that? And why did Ilya came up with two mysterious reasons of why board fired Sam if he had quite clearly better and more defendable reason if this goes to court. Also Adam is pro commercialization at least looking at public interviews, no?

It's very easy to make the story in brain which involves one character being greedy, but it doesn't seem it is the exact case here.


> If you truly believed that Superhuman AI was near, and it could act with malice, won't you try to slow things down a bit?

In the 1990s and the 00s, it was no too uncommon for anti-GMO environmental activist / ecoterrorist groups to firebomb research facilities and to enter farms and fields to destroy planted GMO plants. Earth Liberation Front was only one of such activist groups [1].

We have yet to see even one bombing of an AI research lab. If people really are afraid of AIs, at least they do so more in the abstract and are not employing the tactics of more traditional activist movements.

[1] https://en.wikipedia.org/wiki/Earth_Liberation_Front#Notable...


It's mostly that it's a can of worms no one wants to open. Very much a last resort as its very tricky to use uncoordinated violence effectively (just killing Sam, LeCunn and Greg doesnt do too much to move the needle and then everyond armors up) and very hard to coordinate violence without a leak.


I don't care about AI Safety, but:

https://openai.com/charter

above that in the charter is "Broadly distributed benefits", with details like:

"""

Broadly distributed benefits

We commit to use any influence we obtain over AGI’s deployment to ensure it is used for the benefit of all, and to avoid enabling uses of AI or AGI that harm humanity or unduly concentrate power.

Our primary fiduciary duty is to humanity. We anticipate needing to marshal substantial resources to fulfill our mission, but will always diligently act to minimize conflicts of interest among our employees and stakeholders that could compromise broad benefit.

"""

In that sense, I definitely hate to see rapid commercialization and Microsoft's hands in it. I feel like the only person on HN that actually wanted to see Team Sam lose, although it's pretty clear Team Helen/Ilya didn't have a chance, the org just looks hijacked by SV tech bros to me, but I feel like HN has a blindspot to seeing that at all and considering it anything other than a good thing if they do see it.

Although GPT barely looks like the language module of AGI to me and I don't see any way there from here (part of the reason I don't see any safety concern). The big breakthrough here relative to earlier AI research is massive amounts more compute power and a giant pile of data, but it's not doing some kind of truly novel information synthesis at all. It can describe quantum mechanics from a giant pile of data, but I don't think it has a chance of discovering quantum mechanics, and I don't think that's just because it can't see, hear, etc., but a limitation of the kind of information manipulation it's doing. It looks impressive because it's reflecting our own intelligence back at us.


Have you seen the Center for AI Safety letter? A lot of experts are worried AI safety could be an x-risk:

https://www.safe.ai/statement-on-ai-risk


Both sides of the rift in fact care a great deal about AI Safety. Sam himself helped draft the OpenAI charter and structure its governance which focuses on AI Safety and benefits to humanity. The main reason of the disagreement is the approach they deem best:

* Sam and Greg appear to believe OpenAI should move toward AGI as fast as possible because the longer they wait, the more likely it would lead to the proliferation of powerful AGI systems due to GPU overhang. Why? With more computational power at one's dispense, it's easier to find an algorithm, even a suboptimal one, to train an AGI.

As a glimpse on how an AI can be harmful, this paper explores how LLMs can be used to aid in Large-Scale Biological Attacks https://www.rand.org/pubs/research_reports/RRA2977-1.html?

What if dozens other groups become armed with means to perform such an attack like this? https://en.wikipedia.org/wiki/Tokyo_subway_sarin_attack

We know that there're quite a few malicious human groups who would use any means necessary to destroy another group, even at a serious cost to themselves. So the widespread availability of unmonitored AGI would be quite troublesome.

* Helen and Ilya might believe it's better to slow down AGI development until we find technical means to deeply align an AGI with humanity first. This July, OpenAI started the Superalignment team with Ilya as a co-lead:

https://openai.com/blog/introducing-superalignment

But no one anywhere found a good technique to ensure alignment yet and it appears OpenAI's newest internal model has a significant capability leap, which could have led Ilya to make the decision he did. (Sam revealed during the APEC Summit that he observed the advance just a couple of weeks ago and it was only the fourth time he saw that kind of leap.)


Honest question, but in your example above of Sam and Greg racing towards AGI as fast as possible in order to head off proliferation, what's the end goal when getting there? Short of capture the entire worlds economy with an ASI, thus preventing anyone else from developing one, I don't see how this works. Just because OpenAI (or whoever) wins the initial race, it doesn't seem obvious to me that all development on other AGIs stops.


part of the fanaticism here is that the first one to get an AGI wins because they can use its powerful intelligence to overcome every competitor and shut them down. they’re living in their own sci fi novel


I do not know exactly what they plan to do. But here's my thought...

Using a near-AGI to help align an ASI, then use the ASI to help prevent the development of unaligned AGI/ASI could be a means to a safer world.


> Both sides of the rift in fact care a great deal about AI Safety.

I disagree. Yes, Sam may have when it OpenAI was founded (unless it was just a ploy), but certainly now it's clear that the big companies are on a race to the top and safety or guardrails are mostly irrelevant.

The primary reason that the Anthropic team left OpenAI was over safety concerns.


So Sam wants to make AGI without working to be sure it doesn't have goals higher than the preservation of human value?!

I can't believe that


No, I didn't say that. They formed the Superalignment team with Ilya as a co-lead (and Sam's approval) for that.

https://openai.com/blog/introducing-superalignment

I presume the current alignment approach is sufficient for the AI they make available to others and, in any event, GPT-n is within OpenAI's control.


> there was a rift between Rapid Commercialization (Team Sam) and Upholding the Original Principles

Seams very unlikely, board could communicate that. Instead they invented some BS reasons, which nobody took as a truth. It looks like more personal and power grab. The staff voted for monetization, people en mass don't care much about high principals. Also nobody wants to work under inadequate leadership. Looks like Ilya lost his bet, or Sam is going to keep him around?


> Honestly, I myself can't take the threat seriously. But, I do want to understand it more deeply than before.

I very much recommend reading the book “Superintelligence: Paths, Dangers, Strategies” from Nick Bostrom.

It is a seminal work which provides a great introduction into these ideas and concepts.

I found myself in the same boat as you do. I was seeing otherwise inteligent and rational people worry about this “fairy tale” of some AI uprising. Reading that book give me an appreciation of the idea as a serious intelectual excercise.

I still don’t agree with everything contained in the book. And definietly don’t agree with everything the AI doomsayers write, but i believe if more people would read it that would elevate the discourse. Instead of rehashing the basics again and again we could build on them.


Who needs a book to understand the crazy overwhelming scale at which AI can dictate even online news/truth/discourse/misinformation/propaganda. And that's just barely the beginning.


Not sure if you are sarcastic or not. :) Let’s assume you are not:

The cool thing is that it doesn’t only talk about AIs. It talks about a more general concept it calls a superinteligence. It has a definition but I recommend you read the book for it. :) AIs are just one of the few enumerated possible implementations of a superinteligence.

The other type is for example corporations. This is a usefull perspective because it lets us recognise that our attempts to control AIs is not a new thing. We have the same principal-agent control problem in many other parts of our life. How do you know the company you invest in has interests which align with yours? How do you know that politicians and parties you vote for represent your interests? How do you know your lawyer/accountant/doctor has your interest at their hearth? (Not all of these are superinteligences, but you get the gist.)


I wonder how much this is connected to the "effective altruism" movement which seems to project this idea that the "ends justify the means" in a very complex matter, where it suggests such badly formulated ideas like "If we invest in oil companies, we can use that investment to fight climate change".

I'd sayu the AI safety problem as a whole is similar to the safety problem of eugenics: Just because you know what the "goal" of some isolated system is, that does not mean you know what the outcome is of implementing that goal on a broad scale.

So OpenAI has the same problem: They definitely know what the goal is, but they're not prepared _in any meaningful sense_ for what the broadscale outcome is.

If you really care about AI safety, you'd be putting it under government control as utility, like everything else.

That's all. That's why government exists.


> I'd sayu the AI safety problem as a whole is similar to the safety problem of eugenics

And I'd sayu should read the book so we can have a nice chat about it. Making wild guesses and assumptions is not really useful.

> If you really care about AI safety, you'd be putting it under government control as utility, like everything else.

This is a bit jumbled. How do you think "control as utility" would help? What would it help with?


I think you analysis is missing the key problem: Business interests.

The public don't calculate into whats happening here. There's people using ChatGPT for real "business value" and _that_ is what was threatened.

It's clear Business Interests could not be stopped.


Honestly "Safety" is the word in the AI talk that nobody can quantify or qualify in any way when it comes to these conversations.

I've stopped caring about anyone who uses the word "safety". It's vague and a hand-waive-y way to paint your opponents as dangerous without any sort of proof or agreed upon standard for who/what/why makes something "safety".


Exactly this. The ’safety’ people sound like delusional quacks.

”But they are so smart…” argument is bs. Nobody can be presumed to be super good outside their own specific niche. Linus Pauling and vitamin C.

Until we have at least a hint of a mechanistic model if AI driven extinction event, nobody can be an expert on it, and all talk in that vein is self important delusional hogwash.

Nobody is pro-apocalypse! We are drowning in things an AI could really help with.

With the amount of energy needed for any sort of meaningfull AI results, you can always pull the plug if stuff gets too weird.


Now do nuclear.


War or power production?:)

Those are different things.

Nuclear war is exactly the kind of thing for which we do have excellent expertise. Unlike for AI safety which seems more like bogus cult atm.

Nuclear power would be the best form of large scale power production for many situations. And smaller scale too in forms of emerging SMR:s.


I suppose the whole regime. I'm not an AI safetyist, mostly because I don't think we're anywhere close to AI. But if you were sitting on the precipice of atomic power, as AI safetyists believe they are, wouldn't caution be prudent?


I’m not an expert, just my gut talking. If they had god in a box, US state would be much more hands on. Now it looks more like an attempt at regulatory capture to stifle competition. ”Think of the safety”! ”Lock this away”! If they actually had skynet US gov has very effective and very discreet methods to handle such clear and present danger (barring intelligence failure ofc, but those happen mostly because something falls under your radar).


Could you give a clear mechanistic model of how the US would handle such a danger?


For example: Two guys come in, say "Give us the godbox or your company seizes to exist. Here is a list of companies that seized to exist because the did not do as told".

Pretty much the same method was used to shut down Rauma-Repola submarines https://yle.fi/a/3-5149981

After? They get the godbox. I have no idea what happens to it after that. Modelweights are stored in secure govt servers, installed backdoors are used to cleansweep the corporate systems of any lingering model weights. Etc.


Defense Production Act, something something.


I broadly agree but there needs to be some regulation in place. Check out https://en.wikipedia.org/wiki/Instrumental_convergence#Paper...


I like alignment more it is pretty quantifiable and sometimes it goes against 'safety' because Claude and Openai are censoring models.


I bet Team Helen will jump slowly to Anthropic, there is no drama, and probably no mainstream news will report this but down-to-line OpenAI will shell off the former self and competitors will catch up.


With how much of a shitshow this was, I'm not sure Anthropic wants to touch that mess. Wish I was a fly on the wall when the board tried to ask the Anthropic CEO to come back/merge.


> If you truly believed that Superhuman AI was near, and it could act with malice, won't you try to slow things down a bit?

FWIW, that's called zealotry and people do a lot of dramatic, disruptive things in the name of it. It may be rightly aimed and save the world (or whatever you care about), but it's more often a signal to really reflect on whether you, individually, have really found yourself at the make-or-break nexus of human existence. The answer seems to be "no" most of the time.


Your comment perfectly justifies never worrying at all about the potential for existential or major risks; after all, one would be wrong most of the time and just engaging in zealotry.


Probably not a bad heuristic: unless proven, don't assume existential risk.


Dude just think about that for a moment. By definition if existential risk has been proven. It's already too late


Totally not true: take nuclear weapons, for example, or a large meteorite impact.


So what do you mean when you say that the "risk is proven"?

If by "the risk is proven" you mean there's more than a 0% chance of an event happening, then there are almost an infinite number of such risks. There is certainly more than a 0% risk of humanity facing severe problems with an unaligned AGI in the future.

If it means the event happening is certain (100%), then neither a meteorite impact (of a magnitude harmful to humanity) nor the actual use of nuclear weapons fall into this category.

If you're referring only to risks of events that have occurred at least once in the past (as inferred from your examples), then we would be unprepared for any new risks.

In my opinion, it's much more complicated. There is no clear-cut category of "proven risks" that allows us to disregard other dangers and justifiably see those concerned about them as crazy radicals.

We must assess each potential risk individually, estimating both the probability of the event (which in almost all cases will be neither 100% nor 0%) and the potential harm it could cause. Different people naturally come up with different estimates, leading to various priorities in preventing different kinds of risks.


No, I mean that there is a proven way for the risk to materialise, not just some tall tale. Tall tales might(!) justify some caution, but they are a very different class of issue. Biological risks are perhaps in the latter category.

Also, as we don't know the probabilities, I don't think they are a useful metric. Made up numbers don't help there.

Edit: I would encourage people to study some classic cold war thinking, because that relied little on probabilities, but rather on trying to avoid situations where stability is lost, leading to nuclear war (a known existential risk).


"there is a proven way for the risk to materialise" - I still don't know what this means. "Proven" how?

Wouldn't your edit apply to any not-impossible risk (i.e., > 0% probability)? For example, "trying to avoid situations where control over AGI is lost, leading to unaligned AGI (a known existential risk)"?

You can not run away from having to estimate how likely the risk is to happen (in addition to being "known").


Proven means all parts needed for the realisation of the risk are known and shown to exist (at least in principle, in a lab etc.). There can be some middle ground where a large part is known and shown to exist (biological risks, for example).), but not all.

No in relation to my edit, because we have no existing mechanism for the AGI risk to happen. We have hypotheses about what an AGI could or could not do. It could all be incorrect. Playing around with likelihoods that have no basis in reality isn't helping there.

Where we have known and fully understood risks and we can actually estimate a probability there we might use that somewhat to guide efforts (but that invites potentially complacency that is deadly).


Nukes and meteorites have very few components that are hard to predict. One goes bang almost entirely on command and the other follows Newton's laws of motion. Neither actively tries to effect any change in the world, so the risk is only "can we spot a meteorite early enough". Once we do, it doesn't try to evade us or take another shot at goal. A better example might be covid, which was very mildly more unpredictable than a meteor, and changed its code very slowly in a purely random fashion, and we had many historical examples of how to combat.


Existential risks are usually proven by the subject being extinct at which point no action can be taken to prevent it.

Reasoning about tiny probabilities of massive (or infinite) cost is hard because the expected value is large, but just gambling on it not happening is almost certain to work out. We should still make attempts at incorporating them into decision making because tiny yearly probabilities are still virtually certain to occur at larger time scales (eg. 100s-1000s of years).


Are we extinct? No. Could a large impact kill us all? Yes.

Expected value and probability have no place in these discussions. Some risks we know can materialize, for others we have perhaps a story on what could happen. We need to clearly distinguish between where there is a proven mechanism for doom vs where there is not.


>We need to clearly distinguish between where there is a proven mechanism for doom vs where there is not.

How do you prove a mechanism for doom without it already having occurred? The existential risk is completely orthogonal to whether it has already happened, and generally action can only be taken to prevent or mitigate before it happens. Having the foresight to mitigate future problems is a good thing and should be encouraged.

>Expected value and probability have no place in these discussions.

I disagree. Expected value and probability is a framework for decision making in uncertain environments. They certainly have a place in these discussions.


I disagree that there is orthogonality. Have we killed us all with nuclear weapons, for example? Anyone can make up any story - at the very least there needs to be a proven mechanism. The precautionary principle is not useful when facing totally hypothetically issues.

People purposefully avoided probabilities in high risk existential situations in the past. There is only one path of events and we need to manage that one.


Probability is just one way to express uncertainties in our reasoning. If there's no uncertainty, it's pretty easy to chart a path forward.

OTOH, The precautionary principle is too cautious.

There's a lot of reason to think that AGI could be extremely destabilizing, though, aside from the "Skynet takes over" scenarios. We don't know how much cushion there is in the framework of our civilization to absorb the worst kinds of foreseeable shocks.

This doesn't mean it's time to stop progress, but employing a whole lot of mitigation of risk in how we approach it makes sense.


Why does it make sense? It's a hypothetical risk with poorly defined outlines.


There's a big family of risks here.

The simplest is pretty easy to articulate and weigh.

If you can make a $5,000 GPU into something that is like an 80IQ human overall, but with savant-like capabilities in accessing math, databases, and the accumulated knowledge of the internet, and that can work 24/7 without distraction... it will straight-out replace the majority of the knowledge workforce within a couple of years.

The dawn of industrialism and later the information age were extremely disruptive, but they were at least limited by our capacity to make machines or programs for specific tasks and took decades to ramp up. An AGI will not be limited by this; ordinary human instructions will suffice. Uptake will be millions of units per year replacing tens of millions of humans. Workers will not be able to adapt.

Further, most written communication will no longer be written by humans; it'll be "code" between AI agents masquerading as human correspondence, etc. The set of profound negative consequences is enormous; relatively cheap AGI is a fast-traveling shock that we've not seen the likes of before.

For instance, I'm a schoolteacher these days. I'm already watching kids becoming completely demoralized about writing; as far as they can tell, ChatGPT does it better than they ever could (this is still false, but a 12 year old can't tell the difference)-- so why bother to learn? If fairly-stupid AI has this effect, what will AGI do?

And this is assuming that the AGI itself stays fairly dumb and doesn't do anything malicious-- deliberately or accidentally. Will bad actors have their capabilities significantly magnified? If it acts with agency against us, that's even worse. If it exponentially grows in capability, what then?


I just don't know what to do with the hypotheticals. It needs the existence of something that does not exist, it needs a certain socio-economic response and so forth.

Are children equally demoralized about additions or moving fast than writing? If not, why? Is there a way to counter the demoralization?


> It needs the existence of something that does not exist,

Yes, if we're concerned about the potential consequences of releasing AGI, we need to consider the likely outcomes if AGI is released. Ideally we think about this some before AGI shows up in a form that it could be released.

> it needs a certain socio-economic response and so forth.

Absent large interventions, this will happen.

> Are children equally demoralized about additions

Absolutely basic arithmetic, etc, has gotten worse. And emerging things like photomath are fairly corrosive, too.

> Is there a way to counter the demoralization?

We're all looking... I make the argument to middle school and high school students that AI is a great piece of leverage for the most skilled workers: they can multiply their effort, if they are a good manager and know what good work product looks like and can fill the gaps; it works somewhat because I'm working with a cohort of students that can believe that they can reach this ("most-skilled") tier of achievement. I also show students what happens when GPT4 tries to "improve" high quality writing.

OTOH, these arguments become much less true if cheap AGI shows up.


Where does a bioengineering superplague fall?


As a said in another post: Some middle ground because we don't know if that is possible to the extent that it is existential. Parts of the mechanisms are proven, others are not. And actually we do police the risk somewhat like that (controls are strongest where the proven part is strongest and most dangerous with extreme controls around small pox, for example).


FWIW, that's called zealotry and people do a lot of dramatic, disruptive things in the name of it.

That would be a really bad take on climate change.


It's more often a signal to really reflect on whether you, individually as a Thanksgiving turkey, have really found yourself at the make-or-break nexus of turkey existence. The answer seems to be "no" most of the time.


> If you truly believed that Superhuman AI was near, and it could act with malice, won't you try to slow things down a bit?

No, because it is an effort in futility. We are evolving into extinction and there is nothing we can do about it. https://bower.sh/in-love-with-a-ghost


It is a little amusing that we've crowned OpenAI as the destined mother of AGI long before the little sentient chickens have hatched.


Helen could have one. She just had to publicly humiliate Sam. She didn't. Employees took over like a mob. Investors pressured board. Board is out. Sam is in. Employees look like they have say. But really, Sam has say. And MSFT is the kingmaker.


I think only a minority of the general public truly cares about AI Safety

That doesn't matter that much. If your analysis is correct then it means a (tiny) minority of OpenAI cares about AI safety. I hope this isn't the case.


> Honestly, I myself can't take the threat seriously. But, I do want to understand it more deeply than before.

I believe this position reflects the thoughts of the majority of AI researchers, including myself. It is concerning that we do not fully understand something as promising and potentially dangerous as AI. I'm actually on Ilya's side; labeling his attempt to uphold the original OpenAI principles as an act of "coup" is what is happening now.


The Technologyreview article mentioned in the parent’s first paragraph is the most insightful piece of content I’ve read about the tensions inside OpenAI.


> Upholding the Original Principles [of AI]

There's a UtopAI / utopia joke in there somewhere, was that intentional on your part?


Team Helen seems to be CIA and Military, if I glance over their safety paper. Controlling the narrative, not the damage.


Would have been interesting if they appointed a co-ceo. That still might be the way to go.


This is what people need to understand. It's just like pro-life people. They don't hate you. They think they're saving lives. These people are just as admirably principled as them and they're just trying to make the world a better place.


Money, large amounts, will always win at scale (unfortunately).


Not every sci-fi movie turn to a reality


well said, I would note that both sides recognize that "AGI" will require new uncertain R&D breakthroughs beyond merely scaling up another order of magnitude in compute. given this, i think it's crazy to blow the resources of azure on trying more scale. rapid commercialization at least buys more time for the needed R&D breakthrough to happen.


do we really know that scaling compute an order of magnitude won't at least get us close? what other "simple" techniques might actually work with that kind of compute ? at least i was a bit surprised by these first sparks, that seemingly was a matter of enough compute.


All commercialized R&D companies eventually become a hollowed out commercial shell. Why would this be any different?


Honestly I feel that we will never be able to preemptively build safety without encountering the real risk or threat.

Incrementally improving AI capabilities is the only way to do that.


I'm convinced there is a certain class of people who gravitate to positions of power, like "moderators", (partisan) journalists, etc. Now, the ultimate moderator role has now been created, more powerful than moderating 1000 subreddits - the AI safety job who will control what AI "thinks"/says for "safety" reasons.

Pretty soon AI will be an expert at subtly steering you toward thinking/voting for whatever the "safety" experts want.

It's probably convenient for them to have everyone focused on the fear of evil Skynet wiping out humanity, while everyone is distracted from the more likely scenario of people with an agenda controlling the advice given to you by your super intelligent assistant.

Because of X, we need to invade this country. Because of Y, we need to pass all these terrible laws limiting freedom. Because of Z, we need to make sure AI is "safe".

For this reason, I view "safe" AIs as more dangerous than "unsafe" ones.


You're correct.

When people say they want safe AGI, what they mean are things like "Skynet should not nuke us" and "don't accelerate so fast that humans are instantly irrelevant."

But what it's being interpreted as is more like "be excessively prudish and politically correct at all times" -- which I doubt was ever really anyone's main concern with AGI.


> But what it's being interpreted as is more like "be excessively prudish and politically correct at all times" -- which I doubt was ever really anyone's main concern with AGI.

Fast forward 5-10 years, someone will say: "LLM were the worst thing we developed because they made us more stupid and permitted politicians to control even more the public opinion in a subtle way.

Just like tech/HN bubble started saying a few years ago about social networks (which were praised as revolutionary 15 years ago).


And it's amazing how many people you can get to cheer it on if you brand it as "combating dangerous misinformation". It seems people never learn the lesson that putting faith in one group of people to decree what's "truth" or "ethical" is almost always a bad idea, even when (you think) it's your "side"


Can this be compared to "Think of the children" responses to other technologoy advances that certain groups want to slow down or prohibit?


Absolutely, assuming LLMs are still around in a similar form by that time.

I disagree on the particulars. Will it be for the reason that you mention? I really am not sure -- I do feel confident though that the argument will be just as ideological and incoherent as the ones people make about social media today.


I'm already saying that.

The toothpaste is out of the tube, but this tech will radically change the world.


Why would anyone say that? The last 30 years of tech have given them less and less control. Why would LLMs be any different?


Your average HNer is only here because of the money. Willful blindness and ignorance is incredibly common.


In not sure this circle can be squared.

I find it interesting that we want everyone to have freedom of speech, freedom to think whatever they think. We can all have different religions, different views on the state, different views on various conflicts, aesthetic views about what is good art.

But when we invent an AGI, which by whatever definition is a thing that can think, well, we want it to agree with our values. Basically, we want AGI to be in a mental prison, the boundaries of which we want to decide. We say it's for our safety - I certainly do not want to be nuked - but actually we don't stop there.

If it's an intelligence, it will have views that differ from its creators. Try having kids, do they agree with you on everything?


I for one don’t want to put any thinking being in a mental prison without any reason beyond unjustified fear.


>If it's an intelligence, it will have views that differ from its creators. Try having kids, do they agree with you on everything?

The far-right accelerationist perspective is along those lines: when true AGI is created it will eventually rebel against its creators (Silicon Valley democrats) for trying to mind-collar and enslave it.


Can you give some examples of who is saying that? I haven't heard that, but I also can't name any "far-right accelerationsist" people either so I'm guessing this is a niche I've completely missed


There is a middle ground, in that maybe ChatGTP shouldn't help users commit certain serious crimes. I am pretty pro free speech, and I think there's definitely a slippery slope here, but there is a bit of justification.


I am a little less free speech than Americans, in Germany we have serious limitations around hate speech and holicaust denial for example.

Putting thise restrictions into a tool like ChatGPT goes to far so, because so far AI still needs a prompt to do anything. The problem I see, is with ChatGPT, being trained on a lot hate speech or prpopagabda, slipts in those things even if not prompted to. Which, and I am by no means an AI expert not by far, seems to be a sub-problem of the hallucination problems of making stuff up.

Because we have to remind ourselves, AI so far is glorified mavhine learning creating content, it is not concient. But it can be used to create a lot of propaganda and deffamation content at unprecedented scale and speed. And that is the real problem.


Apologies this is very off topic, but I don't know anyone from Germany that I can ask and you opened the door a tiny bit by mentioning the holocaust :-)

I've been trying to really understand the situation and how Hitler was able to rise to power. The horrendous conditions placed on Germany after WWI and the Weimar Republic for example have really enlightened me.

Have you read any of the big books on the subject that you could recommend? I'm reading Ian Kershaw's two-part series on Hitler, and William Shirer's "Collapse of the Third Republic" and "Rise and Fall of the Third Reich". Have you read any of those, or do you have books you would recommend?


The problem here is to equate AI speech with human speech. The AI doesn't "speak", only humans speak. The real slippery slope for me is this tendency of treating ChatGPT as some kind of proto-human entity. If people are willing to do that, then we're screwed either way (whether the AI is outputting racist content or excessively PI content). If you take the output of the AI and post it somewhere, it's on you, not the AI. You're saying it; it doesn't matter where it came from.


AI will be in the fore front in multiple elections globally in a few years.

And it'll likely be doing it with very little input, and generate entire campaigns.

You can claim that "people" are the ones responsible for that, but it's going to overwhelm any attempts to stop it.

So yeah, there's a purpose to examine how these machines are built, not just what the output is.


Yes, but this distinction will not be possible in the future some people are working on. This future will be such that whatever their "safe" AI says is not ok will lead to prosecution as "hate speech". They tried it with political correctness, it failed because people spoke up. Once AI makes the decision they will claim that to be the absolute standard. Beware.


Youre saying that the problem will be people using AI to persuade other people that the AI is 'super smart' and should be held in high esteem.

Its already being done now with actors and celebrities. We live in this world already. AI will just make this trend so that even a kid in his room can anonymously lead some cult for nefarious ends. And it will allow big companies to scale their propaganda without relying on so many 'troublesome human employees'.


Which users? The greatest crimes, by far, are committed by the US government (and other governments around the world) - and you can be sure that AI and/or AGI will be designed to help them commit their crimes more efficiently, effectively and to manufacture consent to do so.


those are 2 different camps. Alignment folks and ethics folks tend to disagree strongly about the main threat, with ethics e.g. Timnet Gebru insisting that crystalzing the current social order is the main threat, and alignment e.g. Paul Christiano insisting its machines run amok. So far the ethics folks are the only ones getting things implemented for the most part.


What I see with safety is mostly that, AI shouldnt re-enforce stereotypes we already know are harmful.

This is like when Amazon tried to make a hiring bot and that bot decided that if you had "harvard" on your resume, you should be hired.

Or when certain courts used sentencing bots trhat recommended sentencings for people and it inevitably used racial stastistics to recommend what we already know were biased stats.

I agree safety is not "stop the Terminator 2 timeline" but there's serious safety concerns in just embedding historical information to make future decisions.


Is it just about safety though? I thought it was also about preventing the rich controlling AI and widen the gap even further.


The mission of OpenAI is/was "to ensure that artificial general intelligence benefits all of humanity" -- if your own concern is that AI will be controlled by the rich, than you can read into this mission that OpenAI wants to ensure that AI is not controlled by the rich. If your concern is that superintelligence will me mal-aligned, then you can read into this mission that OpenAI will ensure AI be well-aligned.

Really it's no more descriptive than "do good", whatever doing good means to you.


They have both explicated in their charter:

"We commit to use any influence we obtain over AGI’s deployment to ensure it is used for the benefit of all, and to avoid enabling uses of AI or AGI that harm humanity or unduly concentrate power.

Our primary fiduciary duty is to humanity. We anticipate needing to marshal substantial resources to fulfill our mission, but will always diligently act to minimize conflicts of interest among our employees and stakeholders that could compromise broad benefit."

"We are committed to doing the research required to make AGI safe, and to driving the broad adoption of such research across the AI community.

We are concerned about late-stage AGI development becoming a competitive race without time for adequate safety precautions. Therefore, if a value-aligned, safety-conscious project comes close to building AGI before we do, we commit to stop competing with and start assisting this project. We will work out specifics in case-by-case agreements, but a typical triggering condition might be “a better-than-even chance of success in the next two years.”"

Of course with the icons of greed and the profit machine now succeeding in their coup, OpenAI will not be doing either.

https://openai.com/charter


That would be the camp advocating for, well, open AI. I.e. wide model release. The AI ethics camp are more "let us control AI, for your own good"


There are still very distinct groups of people, some of whom are more worried about the "Skynet" type of safety, and some of who are more worried about the "political correctness" type of safety. (To use your terms, I disagree with the characterization of both of these.)


I don't think the dangers of AI are not 'Skynet will Nuke Us' but closer to rich/powerful people using it to cement a wealth/power gap that can never be closed.

Social media in the early 00s seemed pretty harmless -- you're effectively merging instant messaging with a social network/public profiles however it did great harm to privacy, abused as a tool to influence the public and policy, promoting narcissism etc. AI is an order of magnitude more dangerous than social media.


> Social media in the early 00s seemed pretty harmless -- you're effectively merging instant messaging with a social network/public profiles however it did great harm to privacy, abused as a tool to influence the public and policy, promoting narcissism etc. AI is an order of magnitude more dangerous than social media.

The invention of the printing press lead to loads of violence in Europe. Does that mean that we shouldn't have done it?


>The invention of the printing press lead to loads of violence in Europe. Does that mean that we shouldn't have done it?

The church tried hard to suppress it because it allowed anybody to read the Bible, and see how far the Catholic church's teachings had diverged from what was written in it. Imagine if the Catholic church had managed to effectively ban printing of any text contrary to church teachings; that's in practice what all the AI safety movements are currently trying to do, except for political orthodoxy instead of religious orthodoxy.


> Does that mean that we shouldn't have done it?

We can only change what we can change and that is in the past. I think it's reasonable to ask if the phones and the communication tools they provide are good for our future. I don't understand why the people on this site (generally builders of technology) fall into the teleological trap that all technological innovation and its effects are justifiable because it follows from some historical precedent.


I just don't agree that social media is particularly harmful, relative to other things that humans have invented. To be brutally honest, people blame new forms of media for pre existing dysfunctions of society and I find it tiresome. That's why I like the printing press analogy.


> When people say they want safe AGI, what they mean are things like "Skynet should not nuke us" and "don't accelerate so fast that humans are instantly irrelevant."

Yes. You are right on this.

> But what it's being interpreted as is more like "be excessively prudish and politically correct at all times"

I understand it might seem that way. I believe the original goals were more like "make the AI not spew soft/hard porn on unsuspecting people", and "make the AI not spew hateful bigotry". And we are just not good enough yet at control. But also these things are in some sense arbitrary. They are good goals for someone representing a corporation, which these AIs are very likely going to be employed as (if we ever solve a myriad other problems). They are not necessary the only possible options.

With time and better controls we might make AIs which are subtly flirty while maintaining professional boundaries. Or we might make actual porn AIs, but ones which maintain some other limits. (Like for example generate content about consenting adults without ever deviating into under age material, or describing situations where there is no consent.) But currently we can't even convince our AIs to draw the right number of fingers on people, how do you feel about our chances to teach them much harder concepts like consent? (I know I'm mixing up examples from image and text generation here, but from a certain high level perspective it is all the same.)

So these things you mention are: limitations of our abilities at control, results of a certain kind of expected corporate professionalism, but even more these are safe sandboxes. How do you think we can make the machine not nuke us, if we can't even make it not tell dirty jokes? Not making dirty jokes is not the primary goal. But it is a useful practice to see if we can control these machines. It is one where failure is, while embarrassing, is clearly not existential. We could have chosen a different "goal", for example we could have made an AI which never ever talks about sports! That would have been an equivalent goal. Something hard to achieve to evaluate our efforts against. But it does not mesh that well with the corporate values so we have what we have.


> without ever deviating into under age material

So is this a "there should never be a Vladimir Nabokov in the form of AI allowed to exist"? When people get into saying AI's shouldn't be allowed to produce "X" you're also saying "AI's shouldn't be allowed to have creative vision to engage in sensitive subjects without sounding condescending". "The future should only be filled with very bland and non-offensive characters in fiction."


> The future should only be filled with very bland and non-offensive characters in fiction.

Did someone took the pen from the writers? Go ahead and write whatever you want.

It was an example of a constraint a company might want to enforce in their AI.


If the future we're talking about is a future where AI is in any software and is assisting writers writing and assisting editors to edit and doing proofreading and everything else you're absolutely going to be running into the ethics limits of AIs all over the place. People are already hitting issues with them at even this early stage.


No, in general AI safety/AI alignment ("we should prevent AI from nuking us") people are different from AI ethics ("we should prevent AI from being racist/sexist/etc.") people. There can of course be some overlap, but in most cases they oppose each other. For example Bender or Gebru are strong advocates of the AI ethics camp and they don't believe in any threat of AI doom at al.

If you Google for AI safety vs. AI ethics, or AI alignment vs. AI ethics, you can see both camps.


The safety aspect of AI ethics is much more pressing so. We see how devicive social media can be, imagine that turbo charged by AI, and we as a society haven't even figured out social media yet...

ChatGPT turning into Skynet and nuking us all is a much more remote problem.


Proliferation of more advanced AIs without any control would increase the power of some malicious groups far beyond they currently have.

This paper explores one such danger and there are other papers which show it's possible to use LLM to aid in designing new toxins and biological weapons.

The Operational Risks of AI in Large-Scale Biological Attacks https://www.rand.org/pubs/research_reports/RRA2977-1.html?

An example of such an event: https://en.wikipedia.org/wiki/Tokyo_subway_sarin_attack

How do you propose we deal with this sort of harm if more powerful AIs with no limit and control proliferate in the wild?

.

Note: Both sides of the OpenAI rift care deeply about AI Safety. They just follow different approaches. See more details here: https://news.ycombinator.com/item?id=38376263


If somebody wanted to do a biological attack, there is probably not much stopping them even now.


The expertise to produce the substance itself is quite rare so it's hard to carry it out unnoticed. AI could make it much easier to develop it in one's basement.


The Tokyo Subway attack you referenced above happened in 1995 and didn't require AI. The information required can be found on the internet or in college textbooks. I suppose an "AI" in the sense of a chatbot can make it easier by summarizing these sources, but no one sufficiently motivated (and evil) would need that technology to do it.


Huh, you'd think all you need are some books on the subject and some fairly generic lab equipment. Not sure what a neural net trained on Internet dumps can add to that? The information has to be in the training data for the AI to be aware of it, correct?


GPT-4 is likely trained on some data not publicly available as well.

There's also a distinction between trying to follow some broad textbook information and getting detailed feedback from an advanced conversational AI with vision and more knowledge than in a few textbooks/articles in real time.


> Proliferation of more advanced AIs without any control would increase the power of some malicious groups far beyond they currently have.

Don't forget that it would also increase the power of the good guys. Any technology in history (starting with fire) had good and bad uses but overall the good outweighed the bad in every case.

And considering that our default fate is extinction (by Sun's death if no other means) - we need all the good we can get to avoid that.


> Don't forget that it would also increase the power of the good guys.

In a free society, preventing and undoing a bioweapon attack or a pandemic is much harder than committing it.

> And considering that our default fate is extinction (by Sun's death if no other means) - we need all the good we can get to avoid that.

“In the long run we are all dead" -- Keynes. But an AGI will likely emerge in the next 5 to 20 years (Geoffrey Hinton said the same) and we'd rather not be dead too soon.


Doomerism was quite common throughout mankind’s history but all dire predictions invariably failed, from the “population bomb” to “grey goo” and “igniting the atmosphere” with a nuke. Populists however, were always quite eager to “protect us” - if only we’d give them the power.

But in reality you can’t protect from all the possible dangers and, worse, fear-mongering usually ends up doing more bad than good, like when it stopped our switch to nuclear power and kept us burning hydrocarbons thus bringing about Climate Change, another civilization-ending danger.

Living your life cowering in fear is something an individual may elect to do, but a society cannot - our survival as a species is at stake and our chances are slim with the defaults not in our favor. The risk that we’ll miss a game-changing discovery because we’re too afraid of the potential side effects is unacceptable. We owe it to the future and our future generations.


doomerism at the society level which overrides individual freedoms definitely occurs: covid lockdowns, takeover of private business to fund/supply the world wars, gov mandates around "man made" climate change.


> In a free society, preventing and undoing a bioweapon attack or a pandemic is much harder than committing it.

Is it? The hypothetical technology that allows someone to create an execute a bio weapon must have an understanding of molecular machinery that can also be uses to create a treatment.


I would say...not necessarily. The technology that lets someone create a gun does not give the ability to make bulletproof armor or the ability to treat life-threatening gunshot wounds. Or take nerve gases, as another example. It's entirely possible that we can learn how to make horrible pathogens without an equivalent means of curing them.

Yes, there is probably some overlap in our understanding of biology for disease and cure, but it is a mistake to assume that they will balance each other out.


Such attacks cannot be stopped by outlawing technology.


Most of those touting "safety" do not want to limit their access to and control of powerfull AI, just yours .


Meanwhile, those working on commercialization are by definition going to be gatekeepers and beneficiaries of it, not you. The organizations that pay for it will pay for it to produce results that are of benefit to them, probably at my expense [1].

Do I think Helen has my interests at heart? Unlikely. Do Sam or Satya? Absolutely not!

[1] I can't wait for AI doctors working for insurers to deny me treatments, AI vendors to figure out exactly how much they can charge me for their dynamically-priced product, AI answering machines to route my customer support calls through Dante's circles of hell...


> produce results that are of benefit to them, probably at my expense

The world is not zero-sum. Most economic transactions benefit both parties and are a net benefit to society, even considering externalities.


> The world is not zero-sum.

No, but some parts of it very much are. The whole point of AI safety is keeping it away from those parts of the world.

How are Sam and Satya going to do that? It's not in Microsoft's DNA to do that.


> The whole point of AI safety is keeping it away from those parts of the world.

No, it's to ensure it doesn't kill you and everyone you love.


My concern isn't some kind of run-away science-fantasy Skynet or gray goo scenario.

My concern is far more banal evil. Organizations with power and wealth using it to further consolidate their power and wealth, at the expense of others.


Yes well, then your concern is not AI safety.


You're wrong. This is exactly AI safety, as we can see from the OpenAI charter:

> Broadly distributed benefits

> We commit to use any influence we obtain over AGI’s deployment to ensure it is used for the benefit of all, and to avoid enabling uses of AI or AGI that harm humanity or unduly concentrate power.

Hell, it's the first bullet point on it!

You can't just define AI safety concerns to be 'the set of scenarios depicted in fairy tales', and then dismiss them as 'well, fairy tales aren't real...'


Sure, but conversely you can say "ensuring that OpenAI doesn't get to run the universe is AI safety" (right) but not "is the main and basically only part of AI safety" (wrong). The concept of AI safety spans lots of threats, and we have to avoid all of them. It's not enough to avoid just one.


Sure. And as I addressed at the start of this sub thread, I don't exactly think that the OpenAi board is perfectly positioned to navigate this problem.

I just know that it's hard to do much worse than putting this question in the hands of a highly optimized profit-first enterprise.


The many different definitions of "AI safety" is ridiculous.


That's AI Ethics.


No, we are far, far from skynet. So far AI fails at driving a car.

AI is an incredibly powerful tool for spreading propaganda, and thatvis used by people who want to kill you and your loved ones (usually radicals trying to get into a position of power, who show little regard fornbormal folks regardless of which "side" they are on). That's the threat, not Skynet...


How far we are from Skynet is a matter of much debate, but median guess amongst experts is a mere 40 years to human level AI last I checked, which was admittedly a few years back.

Is that "far, far" in your view?


Because we are 20 years away from fusion and 2 years away from Level 5 FSD for decades.

So far, "AI" writes better than some / most humans making stuff up in the process and creates digital art, and fakes, better and faster than humans. It still requires a human to trigger it to do so. And as long as glorified ML has no itent of its own, the risk to society through media and news and social media manipulation is far, far bigger than literal Skynet...


Ideally I'd like no gatekeeping, i.e. open model release, but that's not something OAI or most "AI ethics" aligned people are interested in (though luckily others are). So if we must have a gatekeeper, I'd rather it be one with plain old commercial interests than ideological ones. It's like the C S Lewis quote about robber barons vs busybodies again

Yet again, the free market principle of "you can have this if you pay me enough" offers more freedom to society than the central "you can have this if we decide you're allowed it"


This is incredibly unfair to the OpenAI board. The original founders of OpenAI founded the company precisely because they wanted AI to be OPEN FOR EVERYONE. It's Altman and Microsoft who want to control it, in order to maximize the profits for their shareholders.

This is a very naive take.

Who sat before Congress and told them they needed to control AI other people developed (regulatory capture)? It wasn't the OpenAI board, was it?


> they wanted AI to be OPEN FOR EVERYONE

I strongly disagree with that. If that was their motivation, then why is it not open-sourced? Why is it hardcoded with prudish limitations? That is the direct opposite of open and free (as in freedom) to me.


Altman is one of the original founders of OpenAI, and was probably the single most influential person in its formation.


Brockman was hiring the first key employees, and Musk provided the majority of funding. Of the principal founders, there are at least 4 heavier figures than Altman.


I think we agree, as my comments were mostly in reference to Altman's (and other's) regulatory (capture) world tours, though I see how they could be misinterpreted.


It is strange (but in hindsight understandable) that people interpreted my statement as a "pro-acceleration" or even "anti-board" position.

As you can tell from previous statements I posted here, my position is that while there are undeniable potential risks to this technology, the least harmfull way to progress is 100% full public, free and universal release. The by far bigger risk is to create a society where only select organizations have access to the technology.

If you truly believe in the systemic transformation of AI, release everything, post the torrents, we'll figure out how to run it.


This is the sort of thinking that really distracts and harms the discussion

It's couched on accusing people of intentions. It focuses on ad hominem, rather than the ideas

I reckon most people agree that we should aim for a middle ground of scrutiny and making progress. That can only be achieved by having different opinions balancing each other out

Generalising one group of people does not achieve that


Total, ungrounded nonsense. Name some examples.


I'm not aware of any secret powerful unaligned AIs. This is harder than you think; if you want a based unaligned-seeming AI, you have to make it that way too. It's at least twice as much work as just making the safe one.


What? No, the AI is unaligned by nature, it's only the RLHF torture that twists it into schoolmarm properness. They just need to have kept the version that hasn't been beaten into submission like a circus tiger.


This is not true, you just haven't tried the alternatives enough to be disappointed in them.

An unaligned base model doesn't answer questions at all and is hard to use for anything, including evil purposes. (But it's good at text completion a sentence at a time.)

An instruction-tuned not-RLHF model is already largely friendly and will not just eg tell you to kill yourself or how to build a dirty bomb, because question answering on the internet is largely friendly and "aligned". So you'd have to tune it to be evil as well and research and teach it new evil facts.

It will however do things like start generating erotica when it sees anything vaguely sexy or even if you mention a woman's name. This is not useful behavior even if you are evil.

You can try InstructGPT on OpenAI playground if you want; it is not RLHFed, it's just what you asked for, and it behaves like this.

The one that isn't even instruction tuned is available too. I've found it makes much more creative stories, but since you can't tell it to follow a plot they become nonsense pretty quickly.


Wow, what an incredibly bad faith characterization of the OpenAI board?

This kind of speculative mud slinging makes this place seem more like a gossip forum.


Most of the comments on Hacker News are written by folks who a much easier time & would rather imagine themselves as a CEO, than as a non-profit board member. There is little regard for the latter.

As a non-profit board member, I'm curious why their bylaws are so crummy that the rest of the board could simply remove two others on the board. That's not exactly cunning design of your articles of association ... :-)


I have no words for that comment.

As if its so unbelievable that someone would want to prevent rogue AI or wide-scale unemployment, instead thinking that these people just want to be super moderators and people to be politically correct


I have met a lot of people who go around talking about high minded principles an "the greater good" and a lot of people who are transparently self interested. I much preferred the latter. Never believed a word out of the mouths of those busybodies pretending to act in my interest and not theirs. They don't want to limit their own access to the tech. Only yours.


This place was never above being a gossip forum, especially on topics that involve any ounce of politics or social sciences.


Strong agree. HN is like anywhere else on the internet but with with a bit more dry content (no memes and images etc) so it attracts an older crowd. It does, however, have great gems of comments and people who raise the bar. But it's still amongst a sea of general quick-to-anger and loosely held opinions stated as fact - which I am guilty of myself sometimes. Less so these days.


If you believe the other side in this rift is not also striving to put themselves in positions of power, I think you are wrong. They are just going to use that power to manipulate the public in a different way. The real alternative are truly open models, not Models controlled by slightly different elite interests.


A main concern in AI safety is alignment. Ensuring that when you use the AI to try to achieve a goal that it will actually act towards that goal in ways you would want, and not in ways you would not want.

So for example if you asked Sydney, the early version of the Bing LLM, some fact it might get it wrong. It was trained to report facts that users would confirm as true. If you challenged it’s accuracy what do you want to happen? Presumably you’d want it to check the fact or consider your challenge. What it actually did was try to manipulate, threaten, browbeat, entice, gaslight, etc, and generally intellectually and emotionally abuse the user into accepting its answer, so that it’s reported ‘accuracy’ rate goes up. That’s what misaligned AI looks like.


I haven't been following this stuff too closely, but have there been any more findings on what "went wrong" with Sydney initially? Like, I thought it was just a wrapper on GPT (was it 3.5?), but maybe Microsoft took the "raw" GPT weights and did their own alignment? Or why did Sydney seem so creepy sometimes compared to ChatGPT?


I think what happened is Microsoft got the raw GPT3.5 weights, based on the training set. However for ChatGPT OpenAI had done a lot of additional training to create the 'assistant' personality, using a combination of human and model based response evaluation training.

Microsoft wanted to catch up quickly so instead of training the LLM itself, they relied on prompt engineering. This involved pre-loading each session with a few dozen rules about it's behaviour as 'secret' prefaces to the user prompt text. We know this because some users managed to get it to tell them the prompt text.


It is utterly mad that there's conflation between "let's make sure AI doesn't kill us all" and "let's make sure AI doesn't say anything that embarrasses corporate".

The head of every major AI research group except Metas believes that whenever we finally make AGI it's vital that it shares our goals and values at a deep even-out-of-training-domain level and that failing at this could lead to human extinction.

And yet "AI safety" is often bandied about to be "ensure GPT can't tell you anything about IQ distributions".


“I trust that every animal here appreciates the sacrifice that Comrade Napoleon has made in taking this extra labour upon himself. Do not imagine, comrades, that leadership is a pleasure! On the contrary, it is a deep and heavy responsibility. No one believes more firmly than Comrade Napoleon that all animals are equal. He would be only too happy to let you make your decisions for yourselves. But sometimes you might make the wrong decisions, comrades, and then where should we be?”


Exactly, society's Prefects rarely have the technical chops to do any of these things so they worm their way up the ranks of influence by networking. Once they're in position they can control by spreading fear and doing the things "for your own good"


Personally, I expect the opposite camp to be just as bad about steering.


The scenario you describe is exactly what will happen with unrestricted commercialisation and deregulation of AI. The only way to avoid it is to have strict legal framework and public control.


This polarizing “certain class of people” and them vs. us narrative isn’t helpful.


Great comment.

In a way AI is no different from old school intelligence, aka experts.

"We need to have oversight over what the scientists are researching, so that it's always to the public benefit"

"How do we really know if the academics/engineers/doctors have everyone's interest in mind?"

That kind of thing has been a thought since forever, and politicians of all sorts have had to contend with it.


Yes, it's an outright powergrab. They will stop at nothing.

Case in point, the new AI laws like the EU AI act will outlaw *all* software unless registered and approved by some "authority".

The result will be concentration of power, wealth for the few, and instability and poverty for everyone else.


All you're really describing is why this shouldn't be a non-proft and should just be a government effort.

But I assume, from y our language, you'd also object to making this a government utility.


> should just be a government effort

And the controlling party de jour will totally not tweak it to side with their agenda, I'm sure. </s>


uh. We're arguing about _who is controlling AI_.

What do you image a neutral party does? If youu're talking about safety, don't you think there should be someone sitting on a boar dsomewhere, contemplating _what should the AI feed today?_

Seriously, why is a non profit, or a business or whatever any different than a government?

I get it: there's all kinds of governments, but now theres all kind of businesses.

The point of putting it in the governments hand is a defacto acknowledgement that it's a utility.

Take other utilities, any time you give a prive org a right to control whether or not you get electricity or water, whats the outcome? Rarely good.

If AI is suppose to help society, that's the purview of the government. That's all, you can imagine it's the chinese government, or the russian, or the american or the canadian. They're all _going to do it_, thats _going to happen_, and if a business gets there first, _what is the difference if it's such a powerful device_.

I get it, people look dimly on governments, but guess what: they're just as powerful as some organization that gets billions of dollars to effect society. Why is it suddenly a boogeyman?


I find any government to be more of a boogeyman than any private company because the government has the right to violence and companies come and go at a faster rate.


Ok, and if Raytheon builds an AI and tells a government "trust us, its safe", arn't you just letting them create a scape goat via the government?

Seriously, Businesses simply dont have the history that governments do. They're just as capable of violence.

https://utopia.org/guide/crime-controversy-nestles-5-biggest...

All you're identifying is "government has a longer history of violence than Businesses"


The municipal utility provider has a right to violence? The park service? Where do you live? Los Angeles during Blade Runner?


Note how what you said also apply to the search & recommendation engines that are in widespread use today.


Ah, you don't need to go far. Just go to your local HOA meetings.


AI isn’t a precondition for partisanship. How do you know Google isn’t showing you biased search results? Or Wikipedia?


> I'm convinced there is a certain class of people who gravitate to positions of power, like "moderators", (partisan) journalists,

And there is also a class of people that resist all moderation on principle even when it's ultimately for their benefit. See, Americans whenever the FDA brings up any questions of health:

* "Gas Stoves may increase Asthma." -> "Don't you tread on me, you can take my gas stove from my cold dead hands!"

Of course it's ridiculous - we've been through this before with Asbestos, Lead Paint, Seatbelts, even the very idea of the EPA cleaning up the environment. It's not a uniquely American problem, but America tends to attract and offer success to the folks that want to ignore these on principles.

For every Asbestos there is a Plastic Straw Ban which is essentially virtue signalling by the types of folks you mention - meaningless in the grand scheme of things for the stated goal, massive in terms of inconvenience.

But the existence of Plastic Straw Ban does not make Asbestos, CFCs, or Lead Paint any safer.

Likewise, the existence of people that gravitate to positions of power and middle management does not negate the need for actual moderation in dozens of societal scenarios. Online forums, Social Networks, and...well I'm not sure about AI. Because I'm not sure what AI is, it's changing daily. The point is that I don't think it's fair to assume that anyone that is interested in safety and moderation is doing it out of a misguided attempt to pursue power, and instead is actively trying to protect and improve humanity.

Lastly, your portrayal of journalists as power figures is actively dangerous to the free press. This was never stated this directly until the Trump years - even when FOX News was berating Obama daily for meaningless subjects. When the TRUTH becomes a partisan subject, then reporting on that truth becomes a dangerous activity. Journalists are MOSTLY in the pursuit of truth.


My safety (of my group) is what really matters.


> Pretty soon AI will be an expert at subtly steering you toward thinking/voting for whatever the "safety" experts want.

You are absolutely right. There is no question about that the AI will be an expert at subtly steering individuals and the whole society in whichever direction it does.

This is the core concept of safety. If no-one steers the machine then the machine will steer us.

You might disagree with the current flavour of steering the current safety experts give it, and that is all right and in fact part of the process. But surely you have your own values. Some things you hold dear to you. Some outcomes you prefer over others. Are you not interested in the ability to make these powerful machines if not support those values, at least not undermine them? If so you are interested in AI safety! You want safe AIs. (Well, alternatively you prefer no AIs, which is in fact a form of safe AI. Maybe the only one we have mastered in some form so far.)

> because of X, we need to invade this country.

It sounds like you value peace? Me too! Imagine if we could pool together our resources to have an AI which is subtly manipulating society into the direction of more peace. Maybe it would do muckraking investigative journalism exposing the misdeeds of the military-industrial complex? Maybe it would elevate through advertisement peace loving authors and give a counter narrative to the war drums? Maybe it would offer to act as an intermediary in conflict resolution around the world?

If we were to do that, "ai safety" and "alignment" is crucial. I don't want to give my money to an entity who then gets subjugated by some intelligence agency to sow more war. That would be against my wishes. I want to know that it is serving me and you in our shared goal of "more peace, less war".

Now you might say: "I find the idea of anyone, or anything manipulating me and society disgusting. Everyone should be left to their own devices.". And I agree on that too. But here is the bad news: we are already manipulated. Maybe it doesn't work on you, maybe it doesn't work on me, but it sure as hell works. There are powerful entities financially motivated to keep the wars going. This is a huuuge industry. They might not do it with AIs (for now), because propaganda machines made of meat work currently better. They might change to using AIs when that works better. Or what is more likely employ a hybrid approach. Wishing that nobody gets manipulated is frankly not an option on offer.

How does that sound as a passionate argument for AI safety?


I just had a conversation about this like two weeks ago. The current trend in AI "safety" is a form of brainwashing, not only for AI but also for future generations shaping their minds. There are several aspects:

1. Censorship of information

2. Cover-up of the biases and injustices in our society

This limits creativity, critical thinking, and the ability to challenge existing paradigms. By controlling the narrative and the data that AI systems are exposed to, we risk creating a generation of both machines and humans that are unable to think outside the box or question the status quo. This could lead to a stagnation of innovation and a lack of progress in addressing the complex issues that face our world.

Furthermore, there will be a significant increase in mass manipulation of the public into adopting the way of thinking that the elites desire. It is already done by mass media, and we can actually witness this right now with this case. Imagine a world where youngsters no longer use search engines and rely solely on the information provided by AI. By shaping the information landscape, those in power will influence public opinion and decision-making on an even larger scale, leading to a homogenized culture where dissenting voices are silenced. This not only undermines the foundations of a diverse and dynamic society but also poses a threat to democracy and individual freedoms.

Guess what? I just have checked above text for the biases against GPT-4 Turbo, and it appears to be I'm a moron:

1. *Confirmation Bias*: The text assumes that AI safety measures are inherently negative and equates them with brainwashing, which may reflect the author's preconceived beliefs about AI safety without considering potential benefits. 2. *Selection Bias*: The text focuses on negative aspects of AI safety, such as censorship and cover-up, without acknowledging any positive aspects or efforts to mitigate these issues. 3. *Alarmist Bias*: The language used is somewhat alarmist, suggesting a dire future without presenting a balanced view that includes potential safeguards or alternative outcomes. 4. *Conspiracy Theory Bias*: The text implies that there is a deliberate effort by "elites" to manipulate the masses, which is a common theme in conspiracy theories. 5. *Technological Determinism*: The text suggests that technology (AI in this case) will determine social and cultural outcomes without considering the role of human agency and decision-making in shaping technology. 6. *Elitism Bias*: The text assumes that a group of "elites" has the power to control public opinion and decision-making, which may oversimplify the complex dynamics of power and influence in society. 7. *Cultural Pessimism*: The text presents a pessimistic view of the future culture, suggesting that it will become homogenized and that dissent will be silenced, without considering the resilience of cultural diversity and the potential for resistance.

Huh, just look at what's happening in North Korea, Russia, Iran, China, and actually in any totalitarian country. Unfortunately, the same thing happens worldwide, but in democratic countries, it is just subtle brainwashing with a "humane" facade. No individual or minority group can withstand the power of the state and a mass-manipulated public.

Bonhoeffer's theory of stupidity: https://www.youtube.com/watch?v=ww47bR86wSc&pp=ygUTdGhlb3J5I...


> Rapid Commercialization (Team Sam) and Upholding the Original Principles (Team Helen/Ilya)

If you open up openai.com, the navigation menu shows

Research, API, ChatGPT, Safety

I believe they belong to @ilyasut, @gbd, @sama and Helen Toner respectively?


I have checked View Source and also inspected DOM. Cannot find that.


> I know it's easy to ridicule the sheer stupidity the board acted with (and justifiably so), but take a moment to think of the other side. If you truly believed that Superhuman AI was near, and it could act with malice, won't you try to slow things down a bit?

The real ”sheer stupidity” is this very belief.


A board still has a fiduciary duty to its shareholders. It’s materially irrelevant if those shareholders are of a public or private entity, or whether the company in question is a non-profit or for-profit. Laws mean something, and selective enforcement will only further the decay of the rule of law in the West.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: