torturing a model with human stupidity probably doesn't align with their position on model welfare; wondering if they tried bullying it into hacking its way out of the slop gulag
So if all the AI code is being reviewed by humans (not sure this is true, but let's assume it is), then why are there 5000+ bugs? Are you blaming the Anthropic developers rather than the AI?
Anthropic needs to show that its models continually get better. If a model showed minimal to no improvement, it would do significant damage to their valuation. We have no way of validating any of this; there are no independent researchers who can back any of the assertions made by Anthropic.
I don’t doubt they have found interesting security holes, the question is how they actually found them.
This System Card is just a sales whitepaper, and it confirms what that “leak” from a week or so ago implied.
I've been increasingly "freaking out" for about 3-4 years now, and it seems the pessimistic scenario is materializing. It looks like it will be over for software engineers in the not-so-distant future. In January 2025 I said I expected software engineers to be replaced in 2 years (pessimistic) to 5 years (optimistic). Right now I'm guessing 1 to 3 years.
> I've been increasingly "freaking out" for about 3-4 years now, and it seems the pessimistic scenario is materializing. It looks like it will be over for software engineers in the not-so-distant future. In January 2025 I said I expected software engineers to be replaced in 2 years (pessimistic) to 5 years (optimistic). Right now I'm guessing 1 to 3 years.
Tell me how this will replace Jira, planning, and convincing PMs about viability. Programming is only part of the job devs do.
AI psychosis is truly next level in these threads.
Have you never filed JIRA tickets, planned, or debated viability with an AI? Which of those do you find an AI absolutely cannot do better than the average developer?
I assure you it will soon become very clear that mass job losses are one of the least concerning side effects of developing the magic "everything that can plausibly be done within the constraints of physics is now possible" machine.
We're opening a can of worms whose horrors I don't think most people have the imagination to grasp.
While I'm definitely concerned that AI is a massive driver of the centralization of power, at least in theory, being able to do far more in the space of "things physics admits to be possible" is massively wealth-enhancing. That is literally how we got from the pre-industrial world to today.
Controversially, I'd argue that there is likely an optimal and stable level of technological advancement which we would be wise not to cross. That said, we are human, so we will; I'd just rather it happened in a couple hundred years rather than a decade or two.
For example, it's hard to imagine an AI that gives us the capability to cure cancer but doesn't give us the capability to create targeted superviruses.
What sources would you even be looking for? I think you're asking the wrong question. I'm not arguing a scientific theory that can be backed by data and experimentation; I can only give you my reasoning for why I believe what I believe.
Firstly, I'd propose that all technological advances are a product of time and intelligence, and that given unlimited time and intelligence, the discovery and application of new technologies is fundamentally only limited by resources and physics.
There are many technologies which might plausibly exist, but which we have not yet discovered because we only have so much intelligence and have only had so much time.
With more intelligence we should assume the discovery of new technologies will be much quicker – perhaps exponentially so, given the current rate of technological discovery and the exponential progression of AI.
There are lots of technologies we have today which would seem like magic to people in the past. Future technologies likely exist which would make us feel this way were they available today.
While it's hard to predict specifically which technologies could exist soon in a world with ASI, if we assume it's within the bounds of available resources and physics, we should assume it's at least plausible.
Examples:
- Mind control – with enough knowledge about how the brain works, you could likely devise sensory or electromagnetic input that would manipulate the functioning of the brain to either strongly influence or effectively dictate its output.
- Mind simulation – again, with enough knowledge of the brain, you could take a snapshot of someone's mind with an advanced electromagnetic device and simulate it, torturing the copies in parallel to reveal any secret, or just because you feel like it.
- Advanced torture – with enough knowledge of human biology, death becomes optional. New methods of torture that would previously have killed the victim become plausible. States like North Korea could force humans to work for hundreds of years in incomprehensible agony for opposing the state.
- Advanced biological weapons – with enough knowledge of virology, sophisticated tailor-made viruses replace nerve agents as Russia's weapon of choice for killing those accused of treason. These viruses remain dormant for months, infecting the host and people genetically similar to them (parents, children, grandchildren). After months, the virus rapidly kills its hosts in horrific ways.
I could go on; you just need to use your imagination. I'm not arguing that any of the above are likely to be discovered, just that it would be very naive to think AI will stop at a cure for cancer. If it gives us a cure for cancer, it will give us lots of things we might wish it didn't.
You are supposing it's possible to know that much about things that may simply not be knowable to us, even with these tools. Life is extremely complex, more so than engineering-minded people typically assume. Let's be humble here and acknowledge it.
Why couldn't it be unknowable? I'm not saying that it is, but it could be. The human brain has its limits, and things could be too complex for us to understand well enough to modify them at will. We could understand a lot, but not enough to manipulate it with certainty. Biology is not physics.
Why not? The human mind has its limits. The complexity of physics is orders of magnitude smaller than that of biology, let alone any kind of social science. Physics is the exception, not the rule. The rest of the sciences are far messier.
Almost anything outside physics is not predictable. Anything that involves human behavior is not understood at all, especially if it involves many humans (economics, sociology...). You could describe it, sure, but that is not the same as understanding it and modifying it at will.
I would acknowledge that. I don't think these things are remotely possible any time soon with current rates of progress.
However, I think people tend to fail to appreciate what exponential trends compound to, so the question in my mind is more whether or not you believe AI will unlock an exponential increase in the rate of progress and understanding. Extremely complex is still finite complexity at the end of the day.
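A toy sketch of that point, with entirely made-up numbers (a hypothetical 25% yearly compounding of the rate of discovery versus a flat rate), just to show how quickly compounding outruns linear intuition:

    # Toy comparison: linear vs. compounding "rate of discovery".
    # The 25% growth figure is invented for illustration; nothing
    # here models real R&D output.
    linear_total, compound_total, compound_rate = 0.0, 0.0, 1.0
    for year in range(1, 31):
        linear_total += 1.0             # flat: one unit of discovery per year
        compound_total += compound_rate
        compound_rate *= 1.25           # hypothetical 25% yearly compounding
        if year % 10 == 0:
            print(f"year {year}: linear={linear_total:.0f}, "
                  f"compounding={compound_total:.0f}")
    # year 10: linear=10, compounding=33
    # year 20: linear=20, compounding=343
    # year 30: linear=30, compounding=3227

Thirty units of flat progress versus thousands of units of compounded progress is the whole disagreement in a nutshell.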
Maybe AI won't significantly increase the rate of progress across all scientific fields. I'm fairly confident it will significantly increase it in at least some, though, and it seems likely to me that biological processes will be much easier for us to model and predict with AI. I'm much less sure about progress in domains like physics and robotics.
On the slightly optimistic side, much more intelligence will be spent in countering these criminal uses than in enabling them. For each of the terrible inventions you mentioned, there are other inventions to counter them.
Freak out about what? I read the announcement and thought "that's a dumb name, they sure are full of themselves" – then I went back to using Claude as a glorified commit message writer. For all its supposed leaps, AI hasn't affected my life much in the real world, except to make HN stories more predictable.
I can think of several possible messy outcomes that would be able to directly affect me, not all mutually exclusive:
- Job loss by me being replaced by an AI or by somebody using an AI. Or by an AI using an AI.
- Resulting societal instability once blue collar jobs get fully automated at scale, and there is no plan in place to replace this loss of peoples' livelihoods.
- People turning to AI models instead of friends for emotional support, loss of human connection.
- Erosion of democracy by making authoritarianism and control very scalable: broad, detailed population surveillance and automated investigation using LLMs, which were previously bounded by manpower.
- Autonomous weapons, "Slaughterbots" as in the short film from 2017.
- Biorisk, through dangerous biological capabilities that enable a small team of less-skilled terrorists to use a jailbroken LLM to create something dangerous.
- Other powers in the world deciding that this technology is too powerful in the hands of the US, or too dangerous to be built at all and has to be stopped by all means.
- Loss of/voluntary ceding of control over something much smarter than us. "If Anyone Builds It, Everyone Dies".
The only thing preventing this today is cost, not capability. As costs come down over the next 5 years, the idea that the internet was once dominated by people will seem quaint.
Until recently I would have described myself as an AI skeptic. HN has been a great source for cope on the AI subject over the years. You can find nitpicks, caveats, all sorts of reasons to believe things aren’t as significant as they seem. For me Opus 4.5 was the inflection point where I started to think “maybe this isn’t a bubble.” The figures in this report, if accurate, are terrifying.