So does a human engaged in rationalization or confabulation just appear to reason? We might be closer to these machines than you think, and I don’t mean that in a positive way.
Not OP, but as an LLM skeptic, I'd absolutely say that humans are natively very poor reasoners.
With effort, support, and resources, we can learn to reason well from first principles - call it reaching "intellectual maturity."
Catch an emotionally-immature human in a mistake or conflicting set of beliefs, and you'll be able to see them do exactly what you describe above: rationalize, deflect, and twist the data to support a more emotionally-comfortable narrative.
That usually holds even for intellectually-mature individuals who have not yet matured emotionally, even though they may reason quite well when the stakes are low.
Humans that have matured both emotionally and intellectually, however, are often able to keep themselves stable and reason well even in difficult circumstances.
The ways LLMs consistently fail spectacularly on out-of-distribution problems (like these esolangs) do seem to suggest they don't really mature intellectually, not the way humans can.
Maybe the Wiggum loop strategy shows otherwise? I'm not sure I know.
To me, it smells more like brute-forcing through to a result without fully understanding the problem, though.
I don’t have much confidence in the premise. Where was the human control? I think most Python programmers when tasked with “now do it in brainfuck” would fail. There is not much meaningful overlap in how to express intent and solutions to problems. The ridiculous syntax is the joke.
But more importantly, I don’t have to solve any problems with languages that are elaborate practical jokes, so I’m not worried about the implications for an LLM’s ability to be useful.
The point here is to test for "genuine reasoning" or something approaching it. If a model is truly reasoning, it should be competent even in a new language you just made up (provided the language itself is competently designed).
We do do genuine reasoning. It would take a lot of practice for us to learn it but we also use less "electricity" to do it.
The thing about LLMs is there doesn't seem to be a way to teach them genuine reasoning. You can spend a month teaching an LLM brainfuck and it would likely still fail at a novel problem, whereas a human who studied brainfuck for a month would probably be quite competent at a novel problem.
No. I’m just an NPC in someone else’s simulation. Wandering the world aimlessly incapable of expressing ideas outside of my training corpus of language. Pathetic.
I would in fact expect any human that's as good at writing code as various state-of-the-art LLMs (if you take the breathless proclamations of their hype bros at face value) to be able to solve the rather simple problems in the benchmark given the relevant esolang spec and some time to figure it out.
It's not as if the models here were asked to write a kernel in Brainfuck; the medium tier of problems here contains such apparently insurmountable tasks as "calculate the nth prime".
> I don’t have to solve any problems with languages that are elaborate practical jokes
This is just being needlessly dismissive. Esolangs are (and have been) an area of active CS research for decades. I know I'm a bit of an esolang nerd, and while some are jokes, most focus on specific paradigms (e.g. Piet is visual, bf is a Turing tarpit, etc.).
> I think most Python programmers when tasked with “now do it in brainfuck” would fail.
This is untrue. Given internet-level awareness and infinite time, virtually all developers should be able to go from Python to brainfuck (trivially, I might add.) Did you even look at the test sets? It's all pretty basic stuff (palindromes, array traversal, etc.—we aren't using pandas here). I mean, sure, it would take forever and be mega annoying, but manipulating a head and some tape is hardly difficult.
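For anyone who hasn't looked at it, the whole head-and-tape machine fits in a few dozen lines. Here's a rough Python sketch of a brainfuck interpreter, not a reference implementation: the function name `run_bf`, the 30,000-cell wrapping tape, the 8-bit wrapping cells, and EOF reading as zero are all assumptions I picked from common conventions, since dialects differ on these.

```python
def run_bf(program: str, input_data: str = "", tape_len: int = 30000) -> str:
    """Interpret a brainfuck program and return everything it printed.

    Assumed conventions (dialects differ): cells wrap at 256, the head
    wraps around the tape, and reading past the end of input stores 0.
    """
    # Pre-match brackets so each loop jump is a single dictionary lookup.
    jumps = {}
    stack = []
    for pos, ch in enumerate(program):
        if ch == "[":
            stack.append(pos)
        elif ch == "]":
            if not stack:
                raise ValueError("unmatched ']'")
            start = stack.pop()
            jumps[start] = pos
            jumps[pos] = start
    if stack:
        raise ValueError("unmatched '['")

    tape = [0] * tape_len   # the tape
    head = 0                # the head position on the tape
    pc = 0                  # index of the current instruction
    inp = iter(input_data)
    out = []

    while pc < len(program):
        ch = program[pc]
        if ch == ">":
            head = (head + 1) % tape_len
        elif ch == "<":
            head = (head - 1) % tape_len
        elif ch == "+":
            tape[head] = (tape[head] + 1) % 256
        elif ch == "-":
            tape[head] = (tape[head] - 1) % 256
        elif ch == ".":
            out.append(chr(tape[head]))
        elif ch == ",":
            tape[head] = ord(next(inp, "\0")) % 256
        elif ch == "[" and tape[head] == 0:
            pc = jumps[pc]  # skip the loop body
        elif ch == "]" and tape[head] != 0:
            pc = jumps[pc]  # go back to the matching '['
        # Any other character is treated as a comment and ignored.
        pc += 1
    return "".join(out)


if __name__ == "__main__":
    # 8 * 8 = 64, plus 1 = 65, which is ASCII 'A'.
    print(run_bf("++++++++[>++++++++<-]>+."))
```

The interpreter is the easy half, of course. Writing the nth-prime program for it is where the tedium lives, but nothing above requires more than bookkeeping.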
Bentley Software is proof that you can ship products with massive, embarrassing defects and never lose a customer. I can’t explain enterprise software procurement, but I can guarantee you product quality is not part of that equation.
Who said we'd assume China will defend Canadian interests? The current strategy is focused on growing much closer to the EU while becoming a trade bridge for Atlantic/Pacific relations. Canada has a lot of clout on the international stage so we've been able to match-make trade linkages while expanding our market.
Canada isn't a first rate power - if the US or China decided to unilaterally target us it'd be deeply damaging. The hope is that working in concert with other middle powers we can form a cushion to soften a blow - not fully turn it.
The company I worked for has a contract with them. My best guess as to why is that shareholders use their power over management to force publicly traded companies to funnel money into the pockets of their mafia friends. I can’t explain what actual business value the platform provides.
I do know that they’re on pretty much all large organizations’ shortlists when they need any type of data intelligence, all of them not even remotely related to the type of intelligence the government has/needs.
And that they’re outrageously expensive as well but somehow still land a lot of these deals.
At some point there will be market consequences for that kind of behavior. So where market dynamics are not dominated by bullshit (politics, friendships forged on Little St James, state intervention, cartel behavior, etc.) if my company provides the same service as another, but I replaced all of the low quality software as a service products my competitor uses with low quality vibe coded products, my overhead cost will be lower and that will give me an advantage.
Oh my goodness, this is deeply untrue. China is facing a massive population implosion. A lot of their global strategy can be understood through that lens: they are racing to accumulate power, standing, and wealth before the implosion starts to kick in.
The West is destroying third world countries to take their human resources. Blowing up the middle east created more uber eats drivers than defunding all the schools in Detroit ever did.
I think you haven't been following the news. US demographics are by far the healthiest of all halfway developed countries, with the exception of Israel. China has the third-lowest TFR in the world after South Korea and Taiwan, and is facing extinction within two generations, at which point society and the economy will simply cease because all available labor will be consumed by caring for hundreds of millions of extremely elderly people.
By 2100, the US will look about the same as Western Europe looks today: ageing, but still OK. China will be no more.
This is a completely unfounded conspiracy theory, but I think it’s a fun one. I think Elon Musk is running these companies the same way that he is a top ranked Diablo player. He just plays one on TV. The decision makers in the military industrial complex pushed black programs into a group of private companies so they could scale and cut red tape while shedding contractors with really serious performance problems. So now a faction of “the insiders” controls space launches and social media, and has a backup AI company. There are less successful programs like Tesla for getting cattle like me to drive an electric car that can be remotely driven into a median or disabled if someone in Bethesda decides that they don’t like you. Also there is a not so successful attempt to revolutionize tunnel logistics for defense. So what I’m saying is that this is military tech, they just pretend these are private companies run by a Tony Stark showman. I can’t support this with evidence, but it makes for a good story.
Conspiracy theories aren't very productive. But the one thing that continues to bother me is how there is no great explanation for why TSLA is still worth much. It's a shrinking car company that is failing to execute at FSD and says it's going to make humanoid robots instead of cars.
There is no good reason TSLA should be valued any more than 10% of its current valuation, and even that would be rich. There is a fine argument it should be worth 3-4% of what it currently is.
It is almost like there's a connection between PayPal, Elon Musk's fortunes, and crypto.
I still wonder who Satoshi really was. I wonder how Microstrategy remains solvent.
The vision for the future Elon gives us (exploring the stars, human augmentation, advanced AI likely leading to elimination of suffering) is a heaven-like vision in a western world where most people don’t believe in anything much, and many of our leaders and intellectuals are misanthropes who think having kids is selfish.
I don’t care what Tesla’s quarterly sales are, I’m supporting Elon’s vision.
That vision is a lie, and it's a distraction. It is taking advantage of the emptiness that they themselves created, and now they are making you angry to distract you while they rob you. I sincerely wish you well in life, don't pick the wrong heroes.
There are many such mysteries, right? How does Oracle make money when every product of theirs sucks and is worse than free alternatives? How is it that Google and Meta seem to have more revenue from “advertising” than everyone spends on advertising? Where are the product sales that can be traced to this massive amount of spending? I don’t think you could even articulate a plausible business plan around what Google claims to do, especially when they were hot in the early 2000s. How do large financial institutions, like JP Morgan, get fined for financial crimes yet still operate with total public trust? Just as strange as Bigfoot and aliens but in plain sight.
Again, I’m going to qualify this with the disclaimer that this is my own baseless conspiracy theory presented purely for its entertainment value. I suspect that the United States has many effectively state owned enterprises just like the PRC, but there are elaborate obfuscation techniques used to make that seem as if that were not the case. In part that is because a large criminal network is wearing the dead US government like a skin suit.
Whoever it is, or was, there are a handful of individuals still holding block controls on the ORIGINAL chain... that could topple ANY valuation. Those who sold around $0.32/USD would be happy to know that chasing the dragon would have made them as mad as the leads on TV shows.
I think the notion of a cryptocurrency treasury company is idiotic but Strategy (MicroStrategy) is an audited public company. If you want to know how they're solvent then you can literally just read their financial statements.