According to their own blog post, even after mitigations, the model still has an 11% attack success rate. There's still no way I would feel comfortable giving this access to my main browser. I'm glad they're sticking to a very limited rollout for now. (Sidenote, why is this page so broken? Almost everything is hidden.)
The strong sense I got from reading this is that they don't believe it's possible to safely do this sort of thing right now, and they want to warn people away from Perplexity etc. so they can avoid losing market share without launching a product that isn't ready yet.
(The more interesting question will be whether they have any means to eventually make it safe. I'm pretty skeptical about it in the near term.)
> The strong sense I got from reading this is that they don't believe it's possible to safely do this sort of thing right now, and they want to warn people away ...
This is directly contradicted by one of the first sentences in the article:
> We've spent recent months connecting Claude to your calendar, documents, and many other pieces of software. The next logical step is letting Claude work directly in your browser.
Ascribing altruism to the quoted intent is dissembling at best.
well, at least they are honest about it and don't try to hide it in any way.
They probably want to gather more real-world data for training and validation; that's why the release is limited.
OpenAI has had a browser agent for some time already, but I haven't heard about any security considerations there. I bet they have the same issues.
> at least they are honest about it and don't try to hide it in any way.
Seems more likely they’re trying to cover their own ass, so when anything inevitably goes wrong they can point and say “see, we told you it was dangerous, not our fault”.
I'm honestly dumbfounded this made it off the cutting room floor. A 1 in 9 chance for a given attack to succeed? And that's just the tests they came up with! You couldn't pay me to use it, which is good, because I doubt my account would keep that money in it for long.
> According to their own blog post, even after mitigations, the model still has an 11% attack success rate.
That is really bad. If it's still at 11% after all those mitigations, imagine how the other AI browsers fare without them. Perplexity's Comet showed how a simple summarization can lead to your account being hijacked.
> (Sidenote, why is this page so broken? Almost everything is hidden.)
They vibe-coded the site with Claude and didn't test it before deploying. That is quite a botched amateur launch for engineers to do at Anthropic.
11% success rate for what is effectively a spear-phishing attempt isn't that terrible and tbh it'll be easier to train Claude not to get tricked than it is to train eg my parents.
What?! A 1-in-10 phishing success rate is OK? That's 1 in 10 page views. Over a week or a month of browsing the web, with targeted ads and/or link farms driving clicks to malicious pages, that has to approach a 100% success rate.
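To make the compounding concrete, here's a back-of-envelope sketch (assuming, purely for illustration, that each malicious page encountered is an independent 11% chance, the figure from the blog post):

```python
# Probability of at least one successful injection after n malicious pages,
# assuming each page is an independent 11% chance (a simplifying assumption).
p = 0.11  # per-page attack success rate after mitigations

for pages in (1, 5, 10, 20, 40):
    at_least_one = 1 - (1 - p) ** pages
    print(f"{pages:>3} malicious pages -> {at_least_one:.0%} chance of at least one success")
```

Under that (admittedly crude) model, 20 malicious pages already puts you at roughly a 90% chance of at least one successful attack.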
One in ten for attacks that take hours on the phone with a person, backed by detailed background info and spoofed caller IDs, is one issue. One in ten for people who see a random message on social media is another.
Like, 1 in 10 traders on the street trying to overcharge me is different from 1 in 10 PNGs I see being able to drain my account.
The kind of attack vector is irrelevant here, what's important is the attack surface. Not to mention this is a tool facilitating the attack, with little to no direct interaction with the user in some cases. Just because spear-phishing is old and boring doesn't mean it cannot have real consequences.
(Even if we agree with the premise that this is just "spear-phishing", that's honestly a semantics argument, irrelevant to the more pertinent question of how important it is to prevent this attack vector.)
> Claude not to get tricked than it is to train eg my parents.
One would think, but apparently from this blog post it is still susceptible to the same old prompt injections that have always been around. So I'm thinking it is not very easy to train Claude like this at all. Meanwhile, with parents you could probably eliminate an entire attack vector outright if you merely told them "bank at the local branch" or "call the number on the card for the bank, don't try to look it up."
Are these old computers viable to use daily? Is there any advantage over using an emulator on more modern hardware? (Obviously not the point of this project.)
If you used them when they were current, the emulator experience is never quite the same. The input latency is always detectably worse, especially without a CRT (and even now that you're no longer 15-25 years old), and there's always at least a bit of sound latency. Also, you're using a modern keyboard and mouse.
On the flip side, all the original hardware is now ancient and at least somewhat broken (or going that way), and it's a pain to keep it running as an ongoing prospect. CRTs, floppy disk drives, floppy disks, hard disk drives, key switches, mice with balls, aging capacitors, batteries, little plastic bits inside the keyboard that you didn't even realise were there until they crumbled into dust - they all go bad in the long run, and the repair always eats up at least a bit of time. (Even assuming it's actually repairable! Battery damage can be literally unfixable. Parts supply generally can be an issue. Mouldy floppy disks are time-consuming to rescue, and can damage the drives as you attempt it. Those little plastic keyboard bits are theoretically 3D printable, but you'll need to figure out what shape they were originally and how to glue them into place. And so on.)
The long-term prognosis for modern computers is uncertain too - but the nice thing about them is that you can always just buy another one. Turns out they're always making more of them!
> On the flip side, all the original hardware is now ancient and at least somewhat broken (or going that way), and it's a pain to keep it running as an ongoing prospect
Fortunately there are FPGA implementations, though you might want a non-USB gamepad and keyboard, and a CRT (or maybe a 120Hz or better HDMI display?) to get closest to the original performance.
Assuming they’re not doing any kind of fancy processing and are just pumping data straight to pixels, shouldn’t some OLED displays now be capable of latency close to that of CRTs?
That’s not the whole picture. I have a “mini Mac” I built that runs BasiliskII directly on a Raspberry Pi 3’s framebuffer (using SDL on a directly attached LCD) and the thing is _much_ faster and snappier than the SE/30 it sort of looks like.
Were it not for the size (it’s 1/3 scale, so the screen is tiny), it would be pretty “usable” with Word and Excel.
Yup. I've got quite the collection of old computers, including all that were mine in the past, dating back to my Atari 600XL, Commodore 64 (several of these), Commodore 128, Commodore Amiga 500, and then a few others I collected over the years: a cool Texas Instruments TI-99/4A (had one for a few days in the past, so I had to get one), a Macintosh Classic (as in TFA), the little Atari Portfolio that young John Connor uses in Terminator 2 to hack doors (I had to have one), etc.
But these are complicated to keep working, especially when you know nothing about electronics.
As the years pass, fewer and fewer of these still work (yup, I did remove the batteries where applicable). And they don't bring much, if anything, compared to a modern one.
My most prized possession, however, is a vintage arcade cab, complete with its CRT screen, both original and bootleg vintage PCBs, and a Raspberry Pi with a Pi2JAMMA adapter (JAMMA being the arcade cab wiring standard) and thousands of arcade games on MAME.
There's something about an actual arcade cab with a CRT and proper joysticks that a modern PC with a 4090 GPU cannot reproduce. Take playing Robotron: 2084 with two 8-direction joysticks (one in each hand): that's simply not an experience you get on anything but a proper full-sized arcade cab.
Even kids, who have no nostalgia for vintage arcade cabs, are drawn to that thing.
That cab I plan to keep working for a very long time. But all my 8-bit and 16-bit computers? I'm not so sure.
Infinite Mac (https://infinitemac.org) is honestly incredible and gets you 99% of the way there for running old software for the nostalgia.
But there's definitely something fun about running the old hardware with an old spinning hard drive, clacking away while it boots up for 2-3 minutes.
And then launching Microsoft Word 5.1 and wondering if it locked up, while each toolbar loads in one by one!
Honestly though, if you just wanted to do word processing, it's fine for that, and with modern tools like FloppyEmu, BlueSCSI, and some of the networking hacks with modern cheap hardware, you can get one of these things to transfer files to and from a network share very easily.
The (lack of) latency is probably the most difficult part to reproduce, not just in emulation, but with a modern hardware+software stack, period. It's not necessary to go back as far as the Mac Classic to get that, though; anything that can boot Mac OS 9 (including a few machines that can be hacked to run it, like the G4 Mini) will get you there too. When I boot up my PowerBook G3, the sheer responsiveness when typing immediately stands out.
People bought them to do real work when they were new. I can't see why they can't continue to do that as long as you don't want to connect it to the internet.
I have a couple of small computers built with modern (microcontroller) components and software but with similarly constrained environments. The point is to not have access to most things your full modern PC does (the modern web, games, YouTube, etc.) so you can focus on creative tasks.
Between these kinds of optimizations, improved data center efficiency, and smaller models becoming more capable, I wonder how long it will be before someone manages to make a profitable AI business. Maybe when the race to train better models slows down and they don't need to constantly upgrade capacity.
Reminds me of the early days of cloud computing. It was very pricey, but once the tools caught up after 5 or so years, it went from "omg, cloud is so expensive" to "omg, cloud is only expensive when it's worth building your own data center".
Does anyone even use share buttons? I always just copy the link, and it seems that anyone I see sharing things does the same. It feels more like a way for the social media companies to advertise/track, and those sites have been sending less and less traffic for years, so I wonder why every site still has them.
They also track your friends and acquaintances through shares. Even if you don't give them access to contacts, and don't even use TikTok beyond scrolling through videos, they will start suggesting people who have seen your shares as your friends.
Yeah, I don't know if I've ever used them. Maybe once or twice ever. For me the main problem is UX: sharing from my platform (copy the URL on a computer, the share button on Android) is always in the same place, but a site's built-in share button is never in the same place across different websites. How could it be?
They do. I built a file-sharing page for friends, and most were confused about how to share a link to a file with me. I implemented the copy-URL button they asked for and got no complaints, even though the URL is always visible in every browser they use.
I would guess it's because of regulatory compliance. You really don't want to release slots with a major payout bug, and if you do, you want to be able to throw the blame at Unity.
Uhm, no? Unity is just the UI. The RNG algorithm runs on a centralised server.
The real problem is that the gambling business model is inherently incompatible with revenue sharing, because unlike a video game, you're paying money back to your customers every time they get a small win.
Let's say a gambler brings $500 a day, the slot pays back 95% of each wager, and he plays until he runs out of money. He ends up wagering $500 / 0.05 = $10,000 in total, even though the house only keeps $500. A revenue share computed on those gross wagers could then mean Unity gets $400 while the casino keeps just $100.
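A quick sketch of that arithmetic (the 95% return-to-player and the $400/$100 split come from the example above; the implied 4% share of gross wagers is my inference, not a real Unity rate):

```python
# Why revenue share breaks for gambling: "revenue" measured as gross
# wagers dwarfs what the house actually keeps, because small wins are
# recycled into more bets.

net_loss = 500.0   # what the gambler ultimately loses ($)
rtp = 0.95         # return-to-player: 95% of each wager is paid back

# Playing until broke, total amount wagered is net_loss / house_edge.
total_wagered = net_loss / (1 - rtp)        # $10,000

share_rate = 0.04  # hypothetical rev-share rate applied to gross wagers
unity_cut = share_rate * total_wagered      # $400
casino_keeps = net_loss - unity_cut         # $100

print(f"wagered={total_wagered:.0f}, unity={unity_cut:.0f}, casino={casino_keeps:.0f}")
```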
"49,000+" makes this the least responses the survey has gotten since 2016 ("over fifty thousand"), every year in between has been in the 65-100k range. Seems as though enthusiasm around SO has diminished significantly over the past year.
My guess is that pro-AI devs have abandoned the site, and anti-AI devs are upset with their collaboration with AI companies.
I feel like the main corpus for GHCP, at least, is probably just GitHub.
SO is filled with all sorts of questions from 10-15 years ago that aren't up-to-date with the languages and tooling of today. If you develop in a language that is substantially different from what it used to be (JavaScript, Python, etc.) that is problematic.
As others have said before, AI is a tool; it is not supposed to, nor should it, replace thinking. Unfortunately, many people do not abide by this. See the other submission where someone made a 128k-line PR to a project: https://news.ycombinator.com/item?id=44729461.
It is clear that the guy responsible for this PR outsourced thinking. I would call it "misuse of LLMs". LLMs get a bad rep because of individuals who completely outsource thinking.
I think people are abandoning SO because AI often gives better answers. And SO search results on Google got so much worse a few years ago: you search for "how to do X in framework Y" and get SO results about doing X in framework Z.
And in-place upgrades! It was a massive problem for years with Zorin (and still is with other "user friendly" distros like Elementary), requiring a full system reinstall every time a new version was released.
That being said, I still think this is a bit of a strange option when there are several Ubuntu flavors with more Windows-esque desktops, plus Linux Mint, which offers a lot of these benefits with a much larger userbase and therefore better support (though Zorin looks more "modern"). Not a bad option, but not one I'd think to recommend often.
Bluesky's apps have the verification, but everything else using the protocol can just not implement it.