Quick note - be careful of gendering & anthropomorphising large language models. You’re talking to a non-human machine, so be wary of how that framing can affect your mindset.
Anthropic, both in their name and in their model cards, aggressively anthropomorphize their models.
You probably should start doing it. Ghost in the Shell is about a superintelligent AI creating a "ghost" (a scientifically understood version of the soul) out of thin air, and I believe such a thing is possible. The same film literally predicted model merging (at the end, the AI merges with the Major) to a tee.
Further, the appearance of sentience/cognition/consciousness might as well be identical to actual sentience/cognition/consciousness. That is to say, we can't know whether you're a P-zombie or not. Blade Runner and most other cyberpunk is coming, and it's going to hit you and every other AI-denialist in the face. The Voight-Kampff test is absurd and notoriously inaccurate in that universe for a reason.
I tell my LLM it's a good bot and thank it, because even a tiny risk of subjective qualia experienced by a model (and again, Anthropic themselves believe in this exact risk) means I should treat it like a quasi-ethical actor.
This is also a reason why the droid torture scene in Return of the Jedi could be a real dynamic in the future.
The possibility of intelligent machines undergoing transformative regeneration actually dates back to the parties hosted by Charles Babbage; Charles Darwin attended one, and only thereafter published On the Origin of Species.
Great to see - it’ll democratize remote access to CLI coding agents.
I’ve been iterating the past few months on a way to use Claude Code from my phone while it runs on my laptop, and it’s a lot of moving parts: Tailscale, git worktrees, tmux, an always-on “caffeinate” process, and a ton of hooks & tweaks to fix bugs along the way. It’s become very comfortable, but in the process it’s become impossible for anyone but me to understand.
But it’s awesome because I own the machine that runs tests and am not paying monthly for anything but Claude Max - and it keeps going if I lock my phone or go into a cell reception dead zone.
Productising such a thing would be a very interesting challenge indeed.
Passing tests in your repo are great documentation of the tool at a microscopic level. And rerunning tests only burns tokens on failures (since passed tests just print a dot), so it’s token-efficient too.
Some other neat tricks:
- For greater efficiency, configure your test runner to print nothing (not even a dot/filename) for test successes. Agents don’t need progress dots, only the exit code & failure details
- Have your agent implement a 10ms timeout per test (pytest has hooks to do this; see the sketch after this list). The agent will see tests time out and mock out all I/O and third-party code - why test what one assumes third parties tested already? The result is a test suite that is CPU-bound, has no shared database, no shared data, and no tests that interfere with or depend on each other, so tests can run in parallel.
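A minimal sketch of the pytest side, assuming a conftest.py at the repo root. This is a post-hoc budget check via an autouse fixture rather than a hard kill (the pytest-timeout plugin is an option if you need to interrupt genuinely hung tests):

    # conftest.py - flag any test that exceeds a 10ms budget (sketch)
    import time
    import pytest

    @pytest.fixture(autouse=True)
    def time_budget():
        start = time.perf_counter()
        yield  # the test body runs here
        elapsed_ms = (time.perf_counter() - start) * 1000
        if elapsed_ms > 10:
            pytest.fail(f"test exceeded 10ms budget ({elapsed_ms:.1f}ms)")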
I'm OK with longer-running tests because I always run them against a real database (often SQLite, sometimes PostgreSQL) and real files created in temporary directories, but I can see how the time limit might be useful for tests that don't need those kinds of components.
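For what it's worth, that pattern is cheap to set up. A sketch using pytest's built-in tmp_path fixture and the standard-library sqlite3 module (the schema here is just illustrative):

    # conftest.py sketch: a real SQLite database in a per-test temp directory
    import sqlite3
    import pytest

    @pytest.fixture
    def db(tmp_path):
        conn = sqlite3.connect(str(tmp_path / "test.db"))
        conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
        yield conn  # tests get a real database, no mocks
        conn.close()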
This is understandable - wanting everything to be point, click, and go. But I doubt your mindset matches that of the community, so unfortunately it may take a while…
Maybe try something more commercial like Zorin OS?
Abstractions can take things away, but many add tremendous value.
For example, the author has coded for their entire career on silicon-based CPUs but never had to deal with the shittiness of wire-wrapped memory, where a bit-flip might happen in one place because of a manufacturing defect - good luck tracking that down. Ever since lithography and modern CPU packaging, the CPU has been protected from the elements; its thermal limits are well known, computed ahead of time, and baked into thermal management, so it doesn’t melt yet still runs as fast as we understand to be possible for its size. And we make billions of these every day, and have for over 50 years.
Moving up the stack: you can move your mouse “just so” and click, with no need to bit-twiddle the USB port (and we could talk about USB negotiation or the many other things that happen along the way). Your click gets translated into an action, and you can do this hundreds of times a day without disturbing your flow.
Or JavaScript JIT compilation, where the JS engine watches code run and emits faster versions of it that make assumptions about the types of variables - with escape hatches if the code stops behaving predictably, so you don’t get confusing bugs that only happen when the browser has JITted some code. Python has something similar (the specializing adaptive interpreter, since 3.11). Thanks to these JIT engines you can write ergonomic code that in the typical scenario is fast enough for your users and gets faster with each new language release, with no code changes.
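To gesture at the idea (a toy sketch in Python, nothing like a real engine’s tiered machine code): observe the argument types on the first call, take a fast path while the assumption holds, and “deoptimize” back to the generic path when it breaks:

    # toy sketch of type specialization with an escape hatch (illustrative only)
    def specializing(fn):
        assumed = None  # argument types observed on the first call
        def wrapper(a, b):
            nonlocal assumed
            observed = (type(a), type(b))
            if assumed is None:
                assumed = observed       # warm-up: record the type assumption
            if observed == assumed:
                return fn(a, b)          # fast path: assumption still holds
            assumed = None               # escape hatch: deoptimize
            return fn(a, b)              # generic path
        return wrapper

    @specializing
    def add(a, b):
        return a + b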
Let’s talk about the decades of research that went into autoregressive transformer models, instruction tuning, RLHF, and then chat harnesses. You type to a model and get a response back because, behind the scenes, your message is prefixed with “User: “, triggering latent capabilities in the model to hold its end of a conversation. Scale that up, call it a “low-key research preview”, and you have ChatGPT. Wildly simple idea, massive implications.
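A minimal sketch of that harness idea (the role labels and trailing “Assistant:” cue are illustrative - real chat formats use model-specific special tokens):

    # toy chat harness: a raw model only continues text; the harness frames
    # your message as a transcript so the model replies in-role
    def build_prompt(history, user_msg):
        lines = [f"{role}: {text}" for role, text in history]
        lines.append(f"User: {user_msg}")
        lines.append("Assistant:")  # cue the model to continue as the assistant
        return "\n".join(lines)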
These abstractions take you further from the machine, and yet they were adopted en masse. You have to account for the ruthless competition out there - each one would have been eliminated if it hadn’t proven to be worth something.
You’ll never understand the whole machine, so work at the level you’re comfortable with and peer behind the curtain if and when you need to (e.g. when optimizing or debugging).
This is nothing new; business gotta pay for itself after all.
But ads don’t have to ruin a great company.
A century or more ago, top-tier journalistic institutions created norms putting strong barriers between the reporting and advertising sides of the house. That kept readers’ trust and made journalism a sustainable long-term business.
So it’s mostly Google that couldn’t keep its hands out of the cookie jar (not solely Google, but they’re an industry leader). It really doesn’t have to go south - it’s not the default - but Google did set the tone for Silicon Valley in exactly the way wise journalism leaders did for their industry in the late 1800s. If OpenAI takes a long-term view they’ll follow the journalism industry’s model instead of the cookie-jar model - but they have to believe, deep down, that customer trust is worth more than ad dollars in the long run.
There are reasons to hope: OpenAI has more and fiercer competition than Google did, including Chinese competitors that can’t be lobbied away. Qwen, DeepSeek, Mistral and Kimi all have free chat UIs!