Hacker News | MadxX79's comments

Your developers were so preoccupied with whether or not they could, they didn't stop to think if they should (add 250kloc)

I've worked with a type of (anti?) developer in my career who seems only able to add code, never change or remove it. There's some calculation bug, and then a few lines down, some code that corrects the result instead of just fixing the original lines.

It's bizarre, and as horrible as you might imagine.

And it's been more than one or two people I've seen do this.


Now they have agents.

People need to understand that code is a liability. LLMs haven't changed that at all. Your LLM will get every bit as confused when you have a bug somewhere in the backend and you then work around it with another line of code in the frontend.


They can always generate a new backend prototype from scratch.

This sounds like some kind of learned risk aversion, like they don't want to assume the responsibility of altering what's already there.

I get what you're saying, but I remember watching teletubbies back in the day with my nephew, and all questions of the form:

Have ____ surpassed teletubbies?

Can always be answered in the affirmative.


I'm guessing they have a lot of shares in the AI companies they work(ed) for, and they would like to pump their value so they can buy an even nicer Caribbean island than they can already afford?

Kokotajlo gave up all his shares in OpenAI as part of his refusal to sign a nondisparagement agreement with OpenAI.

Kokotajlo in particular is notable for being the guy who quit OpenAI in 2024 in protest of their policy of requiring researchers to abide by a non-disparagement agreement to retain their equity. In the end OpenAI caved and changed their policy, but if he was lying all along to inflate the value of his shares, it would have been quite a 4d chess move of him to gamble the shares themselves on doing so.

Isn't it just that he left way before gpt-5, then? At that point a sufficiently naive person could have believed that scaling was going to lead to AGI, but that sort of optimism died after he was already an outsider.

Kokotajlo still believes we get AGI in the next few years. These are his most updated numbers at the moment: https://www.aifuturesmodel.com/

I love the total lack of humility on that site. "What if the METR study turns out not to capture anything relevant? We just add a constant gap to be conservative!". But I guess these guys aren't really scientists, so it's probably a lot to ask that they think critically about what they're doing and be honest about the limitations of their methods.

What if it turns out that the more you scale the more your LLM resembles a lobotomized human. It looks like it goes really well in the beginning, but you are just never going to get to Einstein. How does that affect everything?

What if it turned out that those AI companies have a whole bunch of humans solving the problems that currently sit just below the 50% reliability threshold they set, and then fine-tune on those solutions? That would make their models perform better on the benchmark, but it's just training for the test... would the constant gap be a good approximation then?


Not quite.

Kokotajlo quit because he didn't think OpenAI would be good stewards of AGI (non-disparagement wasn't in the picture yet). As part of his exit OpenAI asked him to sign a non-disparagement as a condition of keeping his equity. He refused and gave up his equity.

To the best of my knowledge he lost that equity permanently and no longer has any stake in OpenAI (even if this episode later led to an outcry against OpenAI causing them to remove the non-disparagement agreement from future exits).


I enjoyed playing mastermind with LLMs where they pick the code and I have to guess it.

It's not aware that it doesn't know what the code is (it isn't in the context because it's supposed to be secret), but it just keeps giving clues. Initially it works, because most clues are possible in the beginning, but very quickly it starts to give inconsistent clues and eventually has to give up.

At no point does it "realise" that it doesn't even know what the secret code is itself. It makes it very clear that the AI isn't playing mastermind with you: it's trying to predict what a mastermind player in its training set would say, and that doesn't include "wait a second, I'm an AI, I don't know the secret code because I didn't really pick one!", so it just merrily goes on predicting tokens, without any sort of awareness of what it's saying or what it is.

It works if you allow it to output the code first so it's in context, but probably just because there is enough data in the training set to match two 4-letter strings and know how many of their positions match (there aren't that many possibilities).
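To be clear about what the model keeps failing at: the scoring itself is trivial. A real player just computes exact and colour matches against a secret they actually hold. A quick Python sketch of that feedback rule (the colour letters and the example codes are just my made-up illustration):

```python
from collections import Counter

def mastermind_feedback(secret, guess):
    """Return (black, white) pegs for a Mastermind guess."""
    # Black pegs: right colour in the right position.
    black = sum(s == g for s, g in zip(secret, guess))
    # Total colour overlap, ignoring position, via multiset intersection.
    common = sum((Counter(secret) & Counter(guess)).values())
    # White pegs: right colour, wrong position.
    return black, common - black

print(mastermind_feedback("RGBY", "RYGB"))  # (1, 3)
```

The point is that this ten-line function presupposes a concrete secret; an LLM that never committed to one has nothing to score against, so its "clues" can only drift into contradiction.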


That is actually a genius and beautifully simple way to exhibit the difference between thought and the appearance of thought.

It really dispelled the illusion for me. It's not that easy to find examples like this, but the combinatorics of possible guesses is intractable enough that it can't have learned a good set of clues for every possible guess.

Can it store my PIN numbers and my map of ATM machines also?

was about to point that out, you beat me to it

I rushed so much that I didn't have time to do it right. It could have been the AST tree of my PIN number validation algorithm for ATM machines. :-P

I don't think your original post suffered for the lack of one more TLA acronym.

Right! I had to get up in the morning, at ten o'clock at night, half an hour before I went to bed... sorry, wrong sketch. I had to set up my PIN number to display on the LCD display of an ATM machine with the instructions printed in PDF format telling me how to add VAT tax all before midday GMT time.

And you try and tell the young people of today about RAS syndrome, they won't believe you!


What's your argument here? He's not allowed to discuss crony capitalism because you imagine that he thinks LLMs suddenly became reliable?


It’s a comment about who Gary Marcus is presenting himself as

My intention is for other people to come around to what I believe, which is that Gary Marcus is a hack and has no business being listened to with respect to technical evaluation of AI, because he's not technically competent enough to do so. His polemics waste everybody's time and resources, like we're wasting right now.

His entire schtick has been playing debunker-in-chief of claims about AI capabilities.

If you actually look at his polemics, they increasingly have nothing to do with his original argument, which is not only flawed but ignorant of the technical capabilities.


Then disassemble the argument the author is making and show people an alternative, reality-based take if you want to be taken seriously.


The guy founded two AI companies and exited one, is a researcher who is an expert on how actual human babies learn things, and has done research comparing how LLMs and humans learn and represent language. How much more expertise does he need to have an opinion?

What "technical competence" do you need to provide a technical evaluation, o person who is posting a comment to the "here's how I vibe-coded a fantasy football analytics service" forum? You think everyone here has a technically deep background?


Brooks's law, anno 2026:

"Adding manpower to a late software project makes it later -- unless that manpower is AI, then you're golden!"


I know you're being sarcastic, but this is what OpenAI has said:

https://openai.com/index/harness-engineering/

> This translates to an average throughput of 3.5 PRs per engineer per day, and surprisingly the throughput has increased as the team has grown to now seven engineers.

We will see if this continues to scale up!


That law (formulated in the 70s, I'll remind the reader) hasn't been true for at least a couple of decades now.


Why not? What changed? It seems like a human factors thing. New people have to get up to speed. Doers become trainers.


Several related reasons working at once. The nature of the work changed. The boundary between essential and accidental complexity shifted (and it's unclear whether this distinction still exists). Niche specializations within the field emerged. The way we structure and decompose projects changed dramatically (agile and so on).

One pathological example: if you're running a server-based product, quite often what stands between you and a new feature launch is literally a couple of thousand lines of Kubernetes YAML. Would adding someone who's proficient in Kubernetes slow you down? Of course not.

One may say, hey, this is just the server-side Kubernetes-based development being insane, and I’ll say, the whole modern business of software development is like this.


Hmm interesting, thanks! I was ready to argue but now I have to think, which is even better.


That's a lovely comment, thank you. If you're keen to think about it more, consider that the existing members of a late project actually don't have as much of an advantage over new joiners as is commonly thought.

Yes, they know how the feature they work on relates to other features, but actually implementing that feature very often mostly involves fighting the technology, wrangling the entire stack into the shape you need.

In Brooks’s times the stack was paper-thin, almost nonexistent. In modern times it’s not, and adding someone who knows the technology, but doesn’t have the domain knowledge related to your feature still helps you. It doesn’t slow you down.

One may argue that I'm again pointing to the difference between essential and accidental complexity, and that my argument is essentially "accidental complexity takes over", but accidental complexity actually does influence your feature too, by defining what's possible and what's not.

Some good thoughts (not mine) on the modern boundary between essential and accidental complexity: https://danluu.com/essential-complexity/


I sort of agree that the surface area and accidental complexity of modern stacks give more room to plug in extra developers than was true in the 70s and 80s. But I disagree strongly that this invalidates Brooks's law. Certainly there are cases where adding people helps, especially if they are stronger engineers than the ones already there. But I've also seen way too many projects devolve into resourcing conversations when the real problem was over-complicated, poorly reasoned requirements and boil-the-ocean solutions promising a perfect end state without a clear plan to get there iteratively.


Plus, the "since there are more resources, let's add features" effect.


Is that the cost per token or the actual cost of the user having a conversation, reasoning and all?


Cost per defined capability. Meaning you fix the task and then find how much it cost to achieve it including reasoning, tokens etc.
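Something like this back-of-the-envelope calculation (the prices and token counts below are invented purely for illustration, not any provider's actual rates):

```python
# Hypothetical per-token prices in dollars (assumed, not real rates).
INPUT_PRICE = 1.25 / 1_000_000
OUTPUT_PRICE = 10.00 / 1_000_000

def cost_per_task(input_tokens, visible_output_tokens, reasoning_tokens):
    """Dollar cost of completing one fixed task, reasoning included."""
    # Hidden reasoning tokens are typically billed as output tokens,
    # so a fair per-task comparison has to count them too.
    billed_output = visible_output_tokens + reasoning_tokens
    return input_tokens * INPUT_PRICE + billed_output * OUTPUT_PRICE

# A model can be cheap per token yet expensive per task if it
# burns thousands of reasoning tokens to get there.
print(cost_per_task(2_000, 500, 8_000))  # ~0.0875
```

The takeaway: holding the task fixed and measuring total dollars spent is the metric that survives the shift to reasoning models, whereas per-token price alone does not.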


source?



From your link:

"We excluded reasoning models from our analysis of per-token prices. Reasoning models tend to generate a much larger number of tokens than other models, making these models cost more in total to evaluate on a benchmark. This makes it misleading to compare reasoning models to other models on price per token, at a given performance level."

It's just price per token. Token usage is exploding.


Token usage is increasing, but the link I shared compares price per token, which is why they exclude reasoning models.

Reasoning models have also gotten cheaper.

Generally, the cost to achieve a given task, using whatever model, even a reasoning one, has dropped drastically.

The best example is arc agi. https://arcprize.org/leaderboard

This measures cost to achieve certain percentage of score. Fix a certain accuracy and see how much price reduces over time. It’s more than 100x.


"If you don't give me a discount on my salesforce subscription, I'll shoot myself in the face with this AI-enabled gun"?


You don't need AI to shoot yourself in the face; salesforce can do that just fine.


Absolutely, the belief in scientific circles is that the way forward to develop cures (or at least treatments that slow the progression) is to treat early. By the time you start showing clear symptoms, your brain is already mush. If you have a potential treatment that attacks the root cause, you would have to catch the disease in its very early, pre-clinical stages, but without good diagnostics there is no way to do that (short of giving the treatment to a wide swath of the population, like a vaccine... but that gets expensive very quickly, and side effects become a bigger worry).

