karel-3d's comments

But Tesla and SpaceX are different companies

To his credit, he also delivers, sometimes.

X kind of works. xAI kind of works. You can say it is all kind of broken, but it works. People predicted X would collapse just a few months ago!

Starlink is really popular now, and it didn't exist a few months ago.

He can still do things. People are betting on that.

Now, if you ask me, Tesla is still his biggest moneymaker, and a collapse of Tesla sales would be catastrophic for his empire.


> X kind of works.

It is less popular and makes less money than when he acquired it, and that is ignoring the fact that it is a cesspool of racism now.


It was very rough the first few years. It's fine now.

Gas Town people should get together with the Urbit people.

Together they would be unstoppable.


I opened this and saw some random ramblings about Bitcoin that looked suspiciously LLM-generated.

Not for me.


(these are not my stories; I am just reusing the original title)

It's a joke. I think.

When I previously worked on something that used headless browser agents, the ability to take a screenshot (or even a recording) was really great for debugging... so I am not sure about the "no paint" part. But hey, everything in life is a trade-off.


Really depends on what you want to do with the agents. Just yesterday I was looking for something like this for our web access MCP server[0]. The only thing it needs to do is visit a website, get the content (with JS support, since it's expected that most pages today use JS), and then convert that to e.g. Markdown.

I'm not too happy that Chrome is one of the most memory-hungry parts of all the MCP servers we have in use. The only thing that exceeds it in our whole stack is the ClickHouse shard that comes with Langfuse. Especially if you are looking to build a "deep research" feature that may access a few hundred webpages in a short timeframe, a lightweight alternative like Lightpanda can make quite the difference.

[0]: https://github.com/EratoLab/web-access-mcp
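
To give a rough idea of the shape of that fetch-and-convert step, here is a minimal sketch, assuming Playwright with headless Chromium and the markdownify package; the actual web-access-mcp server may do this differently:

    # Minimal sketch: fetch a JS-rendered page and convert it to Markdown.
    # Assumes the `playwright` and `markdownify` packages; the real
    # web-access-mcp server may work differently.
    from markdownify import markdownify as md
    from playwright.sync_api import sync_playwright

    def fetch_as_markdown(url: str) -> str:
        with sync_playwright() as p:
            browser = p.chromium.launch(headless=True)
            page = browser.new_page()
            page.goto(url, wait_until="networkidle")  # wait for JS-driven content
            html = page.content()  # the rendered DOM, not the raw response body
            browser.close()
        return md(html)

    print(fetch_as_markdown("https://example.com"))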


Well, these were "normal" crawlers that needed to work perfectly and deterministically (as much as possible), not probabilistically (AI); speed was not an issue. And I wanted to debug when something went wrong. So yeah, for me it was crucial to be able to record/screenshot.

So yeah, everything is a trade-off, and we needed a different trade-off. We actually decided not to use headless Chromium, because there are slight differences, so we ended up using full Chrome (not even Chromium, again: slight differences) with Xvfb. It was very, very memory hungry, but again that was not an issue.

(I used "agent" as in "browser agent", not "AI agent", I should be more precise I guess.)


Yeah, I feel the same. I think having a screenshot of part of a rendered page, or the full page, can be useful even for machines, considering how heavy that HTML can be to parse and how expensive it is for LLM context. Sometimes a (sub)screenshot is just a better kind of compression.


Yes, HTML is too heavy and too expensive for LLMs. We are working on a text-based format more suitable for AI.


What do you think of the DeepSeek OCR approach, where they say that vision tokens might compress a document better than its pure text representation?

https://news.ycombinator.com/item?id=45640594

I've spent some time feeding LLMs with scraped web pages, and I've found that retaining some style information (text size, visibility, decoration, image content) is non-trivial.
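
For example, just knowing which text is actually visible and how large it is already means reading computed styles out of the live page. A rough sketch of what that involves (Playwright here, not my actual code):

    # Rough sketch: pull text together with a bit of computed style, which is
    # the non-trivial part (the raw HTML alone doesn't tell you what is visible
    # or how prominent it is). Assumes Playwright; not my actual code.
    from playwright.sync_api import sync_playwright

    COLLECT_JS = """
    () => Array.from(document.querySelectorAll('h1, h2, h3, p, li, a')).map(el => {
      const style = window.getComputedStyle(el);
      return {
        text: el.innerText.trim(),
        fontSize: style.fontSize,
        visible: style.display !== 'none' && style.visibility !== 'hidden',
      };
    }).filter(item => item.text.length > 0)
    """

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto("https://example.com")
        for item in page.evaluate(COLLECT_JS):
            if item["visible"]:
                print(item["fontSize"], item["text"][:80])
        browser.close()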


Keeping some kind of style information is definitely important for understanding the semantics of the webpage.


Wow this looks like a stricter, more sane Markdown! Great, will try it... sometimes.


Markdown is a horrible, horrible format to parse; there are so many ambiguities. CommonMark is so complex because of that, and it still has so many ambiguities.

It's like YAML: it looks so simple at first, and then the horrors start once you try to use it seriously.

In both cases, most of the horrors lie in the spaces/tabs/newlines.


> Markdown is a horrible, horrible format to parse...

I agree entirely. But it's a lovely format to use. Programming as a profession is entirely about making things easier for our users, even if it means making things harder for ourselves.

After all, that's the whole ethos around the web as a platform. Throw some broken HTML soup at a browser and it'll still try its best to render it.


That is true; modern HTML is also (from what I heard!) hard to parse.

