hyko's comments

The fatal problem with LLM-as-runtime-club isn’t performance. It’s ops (especially security).

When the god rectangle fails, there is literally nobody on earth who can even diagnose the problem, let alone fix it. Reasoning about the system is effectively impossible. And the vulnerability of the system is almost limitless, since it’s possible to coax LLMs into approximations of anything you like: from an admin dashboard to a sentient potato.

“zero UI consistency” is probably the least of your worries, but object permanence is kind of fundamental to how humans perceive the world. Being able to maintain that illusion is table stakes.

Despite all that, it’s a fun experiment.


> The fatal problem with LLM-as-runtime-club isn’t performance. It’s ops (especially security).

For me it is predictability. I am a big proponent of AI tools. But even the biggest proponents admit that LLMs are non-deterministic. When you ask a question, you are not entirely sure what kind of answers you will get.

This behavior is acceptable as a developer assistance tool, when a human is in the loop to review and the end goal is to write deterministic code.
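To make the non-determinism point concrete, here's a toy sketch (made-up token probabilities, not any real LLM API): with temperature-style sampling the same prompt can produce different answers on each run, whereas greedy decoding always picks the most likely token.

```python
import random

# Toy next-token distribution for one fixed prompt (invented numbers).
NEXT_TOKEN_PROBS = {"yes": 0.5, "no": 0.3, "maybe": 0.2}

def sample_token(rng: random.Random) -> str:
    """Sampling decoder: non-deterministic, like temperature > 0."""
    tokens = list(NEXT_TOKEN_PROBS)
    weights = list(NEXT_TOKEN_PROBS.values())
    return rng.choices(tokens, weights=weights, k=1)[0]

def greedy_token() -> str:
    """Greedy decoder: deterministic, always the most likely token."""
    return max(NEXT_TOKEN_PROBS, key=NEXT_TOKEN_PROBS.get)

if __name__ == "__main__":
    rng = random.Random()
    # Ask the same "question" 100 times: sampling yields several answers.
    print({sample_token(rng) for _ in range(100)})
    # Greedy decoding pins the answer down.
    print(greedy_token())
```

Real systems add further sources of variance (batching, floating-point reduction order), so even "temperature 0" isn't always perfectly reproducible.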


Non-deterministic behaviour doesn’t help when trying to reason about the system. But you could in theory eliminate the non-determinism for a given input, and yet still be stuck with something unpredictable, in the sense that you can’t predict what a new input will cause the system to do.

Whereas that sort of prediction is trivial with code (even if program execution is at times non-deterministic), because its mechanics are explainable. Techniques like testing only the boundary conditions hinge on this property, and fall apart completely if everything is probabilistic.
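As a generic illustration of that property (a hypothetical withdrawal rule, not tied to any particular system): because the code's mechanics are fixed, checking just the edges of an input range tells you about every value in between.

```python
def can_withdraw(balance: int, amount: int) -> bool:
    """Deterministic rule: allow withdrawal only if 0 < amount <= balance."""
    return 0 < amount <= balance

# Boundary-value tests: with explainable mechanics, the edge cases
# (0, 1, balance, balance + 1) cover the whole range by implication.
assert can_withdraw(100, 1) is True        # lower boundary, just inside
assert can_withdraw(100, 0) is False       # lower boundary, just outside
assert can_withdraw(100, 100) is True      # upper boundary, just inside
assert can_withdraw(100, 101) is False     # upper boundary, just outside
```

With a probabilistic component in the loop, passing at the boundaries would imply nothing about the values between them.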

Maybe explainable AI can help here, but to be honest I have no idea what the state of the art is for that.


At this extreme, I think we'd end up relying on backup snapshots. Faulty outcomes are not debugged. They, and the ecosystem that produced them, are just erased. The ecosystem is then returned to its previous state.

Kind of like saving a game before taking on a boss. If things go haywire, just reload. Or maybe like cooking? If something went catastrophically wrong, just throw it out and start from the beginning (with the same tools!).
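The save/reload idea can be sketched as a plain snapshot-and-rollback pattern (the class and state here are hypothetical, just to show the shape):

```python
import copy

class SnapshotStore:
    """Hold a deep-copied snapshot of state; restore it if things go haywire."""

    def __init__(self, state: dict):
        self.state = state
        self._snapshot = copy.deepcopy(state)

    def save(self) -> None:
        """Save the game before taking on the boss."""
        self._snapshot = copy.deepcopy(self.state)

    def rollback(self) -> None:
        """Faulty outcomes aren't debugged; the state is just erased."""
        self.state = copy.deepcopy(self._snapshot)

store = SnapshotStore({"hp": 100, "inventory": ["potion"]})
store.save()
store.state["hp"] = 0      # the boss fight went badly
store.rollback()           # reload: hp is back to 100
```

Deep copies keep the snapshot independent of later mutations; a real system would snapshot durable storage rather than in-memory dicts.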

And I think the only way to even halfway mitigate the vulnerability concern is to identify that this hypothetical system can only serve a single user. Exactly 1 intent. Totally partitioned/sharded/isolated.


Backup snapshots of what though? The defects aren’t being introduced through code changes, they are inherent in the model and its tooling. If you’re using general models, there’s very little you can do beyond prompt engineering (which won’t be able to fix all the bugs).

If you were using your own model you could maybe try to retrain/finetune the issues away given a new dataset and different techniques? But at that point you’re just transmuting a difficult problem into a damn near impossible one?

LLMs can be miraculous and inappropriate at the same time. They are not the terminal technology for all computation.


What if they are extremely narrow and targeted LLMs running locally on the endpoint system itself (llamafile or whatever)? Would that make this concern at least a little better?


Downvoted! What a dumb comment, right?


I use em dashes—chiefly to express parenthetical thoughts—all the time. Sadly, there’s no foolproof system for identifying machine-generated text. Happily, it means one less thing to worry about.


Is the video captured using the dev kit?


It’s a reasonable hypothesis, but you’d need to experiment to validate it. It’s easy to imagine how prolonged exposure to bed rest or microgravity may trigger heart muscle remodelling in a way that intermittent bouts of lying down would not.


Leader/follower is only inclusive language if you’ve never heard of anarchism.


The really great thing about electric cars is that people are actually adopting them, in part because they are better than the machines they are replacing; therefore their environmental benefits can actually be realised.

If you think you have a real solution apart from the fact that nobody wants to adopt it, then what you actually have is a fantasy. It’s far easier to imagine a paradise than to construct one.


You’re probably already aware, but there are medical formulations of CA that are in theory less likely to irritate tissues than general purpose ones.


Yes, but I haven't managed to find any where I live... Do you know of any off the top of your head?


I would caution against using “cui bono” as a heuristic; you will end up believing a lot of BS using that as a tool. It can only ever be a piece of evidence alongside others. I benefit immensely from sunlight, but that doesn’t mean I am implicated in the rotation of the Earth.

In this particular case, there are many natural experiments we can conceive of that demonstrate why fasting alone is highly unlikely to significantly impede “normal” aging in humans in the lab. That’s before we get to how effective it could actually be as a treatment.

And while we’re following the money, there are many powerful entities that would love a “free” aging cure to juice their demographics…so why don’t they use it?


> It can only ever be a piece of evidence alongside others.

A heuristic is not evidence, as you should know.


Yeah exactly, I’m saying it’s dangerous to use “who benefits?” as a heuristic, because it will be wrong far more often than it will be right.


> Yeah exactly,

I say that it is a heuristic, you conflate it with evidence, and then you say that’s exactly the point (that it’s not evidence)? You’re not saying anything that I haven’t said originally.

Your supposed counter-example is beyond facile. Who-benefits presupposes that the agent has the means to affect some outcome. But no human could have arranged the rotation of the Earth.


Buying an ejector seat for your living room is quite eccentric, but I wouldn’t really consider it overkill. It doesn’t even have arms.


It would be perfect for my office at the company I work at.


Is your “office” an F-16 by any chance?


No it is not. But sometimes I do wish to leave very quickly.


No it can’t.

