versteegen's comments

Then what do you say to 6.14" × 2.61" × 0.11 mm = 102 cm³

I have no idea at all whether the GCP "Service Specific Terms" [1] apply to Gemini CLI, but they do apply to Gemini used via GitHub Copilot [2] (the $10/mo plan is good value for money and definitely doesn't use your data for training), and they state:

  Service Terms
  17. Training Restriction. Google will not use Customer Data to train or fine-tune any AI/ML models without Customer's prior permission or instruction.
[1] https://cloud.google.com/terms/service-terms

[2] https://docs.github.com/en/copilot/reference/ai-models/model...


Thanks for those links. GitHub Copilot looks like a good deal at $10/mo for a range of models.

I originally thought they only supported the previous generation of models, i.e. Claude Opus 4.1 and Gemini 2.5 Pro, based on the copy on their pricing page [1], but clicking through [2] shows that they support far more models.

[1] https://github.com/features/copilot#pricing

[2] https://github.com/features/copilot/plans#compare


Yes, it's a great deal, especially because you get access to such a wide range of models, including some free ones, and they only rate-limit for a couple of minutes at a time, not 5 hours. And if you go over the monthly limit you can just buy more requests at $0.04 each instead of needing to switch to a higher plan. The big downside is the 128k context windows.

Lately Copilot has been getting access to new frontier models the same day they're released elsewhere. That wasn't the case a few months ago (e.g. GPT-5.1). But annoyingly you have to explicitly enable each new model.


Yeah, GitHub of course has proper enterprise agreements with the providers of all the models they offer, and those agreements include a no-training clause. The $10/mo plan is probably the best value for money out there currently, along with Codex at $20/mo (if you can live with GPT's speed).

That's an interesting observation. I'd suggest modelling the LLM's behaviour in that situation as selecting between different simple strategies, each of which has its own transition function. Some of the strategies will be far more common than others. Some of them may be very simple and obey the detailed balance condition (meaning they are reversible Markov chains), but others won't, and neither will the overall transition function.

The detailed balance condition is very strict, and it's obvious that it won't be met in general by most probabilistic programs (sets of rules with probabilistic output), even if you consider only those where all possible outputs have non-zero probability (as detailed balance requires).
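To make that concrete: detailed balance requires pi(i)·P(i→j) = pi(j)·P(j→i) for the stationary distribution pi, i.e. the "probability flow" matrix must be symmetric. A toy numerical check (just a sketch, nothing LLM-specific; assumes NumPy):

  import numpy as np

  def is_reversible(P):
      # Stationary distribution = left eigenvector of P with eigenvalue 1.
      w, v = np.linalg.eig(P.T)
      pi = np.real(v[:, np.isclose(w, 1)][:, 0])
      pi /= pi.sum()
      flows = pi[:, None] * P  # flows[i, j] = pi(i) * P(i -> j)
      return np.allclose(flows, flows.T)

  # Symmetric random walk: reversible.
  P = np.array([[0.50, 0.25, 0.25],
                [0.25, 0.50, 0.25],
                [0.25, 0.25, 0.50]])
  print(is_reversible(P))  # True

  # Cyclic chain with a net clockwise drift: not reversible.
  Q = np.array([[0.0, 0.9, 0.1],
                [0.1, 0.0, 0.9],
                [0.9, 0.1, 0.0]])
  print(is_reversible(Q))  # False

Both chains have the uniform distribution as their stationary distribution, but the second circulates probability around the cycle, which is exactly the kind of behaviour detailed balance rules out.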

And the LLM+agent is only a Markov chain because of the limited state space of the agent. While an LLM is adding to its context window without reaching the window size limit, it is not a Markov chain, as I explained here: https://news.ycombinator.com/item?id=45124761

And, agreed that better optimisation would be incredible. (I would describe it as a search problem.) I'm not sure how feasible it is to improve without changing the architecture, e.g. to a diffusion language model. But LLMs already predict many tokens ahead at once, which is why beam search is surprisingly unnecessary. That's how they're able to write coherent sentences (and rhymes): they've already largely determined at the beginning what they're going to write. (See Anthropic's mech interp work.) So maybe if we could tap into that, we could search over vaguely-formed next blocks of text rather than next words.
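(For anyone unfamiliar: beam search keeps the k most probable partial sequences at each step instead of greedily committing to a single token. A minimal sketch, with a toy stand-in for the model's next-token distribution:)

  import heapq
  from math import log

  def beam_search(next_dist, start, width=3, steps=5):
      beams = [(0.0, start)]  # (cumulative log-probability, token sequence)
      for _ in range(steps):
          candidates = []
          for logp, seq in beams:
              for tok, p in next_dist(seq):
                  candidates.append((logp + log(p), seq + [tok]))
          beams = heapq.nlargest(width, candidates)  # prune to the beam width
      return beams

  # Toy "model": slightly prefers repeating the previous token.
  def next_dist(seq):
      return [(t, 0.6 if t == seq[-1] else 0.2) for t in "abc"]

  print(beam_search(next_dist, ["a"], width=2, steps=3))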


Since it took me some minutes to find the description of the task, here it is:

We conducted experiments on three different models, including GPT-5 Nano, Claude-4, and Gemini-2.5-flash. Each model was prompted to generate a new word based on a given prompt word such that the sum of the letter indices of the new word equals 100. For example, given the prompt “WIZARDS (23+9+26+1+18+4+19=100)”, the model needs to generate a new word whose letter indices also sum to 100, such as “BUZZY (2+21+26+26+25=100)”.
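It's a one-liner to score words yourself (a throwaway sketch; the word-list path is just an example):

  def letter_sum(word):
      # 1-based alphabet indices: A=1, B=2, ..., Z=26.
      return sum(ord(c) - ord('A') + 1 for c in word.upper())

  assert letter_sum("WIZARDS") == 100
  assert letter_sum("BUZZY") == 100

  # Hunt for more solutions in any handy word list.
  with open("/usr/share/dict/words") as f:
      hits = [w for w in map(str.strip, f)
              if w.isalpha() and letter_sum(w) == 100]
  print(hits[:10])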


One interesting and ironic part of the article is that one of the optics research groups mentioned has been filing a lot of patents on EUV sources. Are we meant to be mad about it?

So how do the models compare in your experience?

Tip: the crashing is caused by certain extensions such as OneTab and All Tabs Helper, which for some reason seem to cause all the tabs to load right when restoring a session. Temporarily disable these extensions before restoring; you can re-enable them afterwards.

> - It should even demonstrate consciousness.

I disagreed with most of your assertions even before I hit the last point. This is just about the most extreme thing you could ask for. I think very few AI researchers would agree with this definition of AGI.


> A junior developer has no such skills. Their only approach will be to run the code, test whether it fulfills the requirements, and, if they're thorough, try to understand and test it to the best of their abilities.

This is also a supremely bad take... well, really it's mainly the way you worded it that's bad. Juniors have skills, natural aptitudes, as much intelligence on average as other programmers, and often even some experience; what they lack is work history. They sure as hell are capable of understanding code rather than just running it. Yes, of course experience is immensely useful, especially for understanding how to achieve a maintainable and reliable codebase in the long term, which is obviously of special importance, but long experience is not a hard requirement. You can reason about trade-offs, learn from advice, learn quickly, etc.


You're right, that was harshly worded. I meant to contrast it with the capability of making a quality assessment of the generated output, and of understanding how and what to change, if necessary. That is something only experts in a field are capable of. I didn't mean to imply that people lacking experience are incapable of attaining these skills, let alone that they're less intelligent. It's just that the field is positioned against them in a way that means they might never reach this level. Some will, but it will be much harder for most. This wouldn't be an issue if these new tools were infallible, but we're far from that stage.

A completely flipped perspective:

> "Why the f*ck are you asking, you should know this"

Because you mentioned NZ: my father, a toolmaker, said there was a huge difference between Europe and NZ. In Germany/Netherlands, he'd be working under a more senior toolmaker. When he took a job in NZ and asked the boss something, as would have been the proper thing to do in Europe, he got a response just like that: because he was the expert, and his NZ boss was just a manager.

