The math is obvious on this one. It's super well-documented that model performance on complex tasks scales (to some asymptote) with the amount of inference-time compute allocated.
LLM providers must dynamically scale inference-time compute based on current load because they have limited compute. Thus it's impossible for traffic spikes _not_ to cause some degradation in model performance (at least until/unless they acquire enough compute to saturate that asymptotic curve for every request under all demand conditions -- it does not seem plausible that they are anywhere close to this).
Umm. I run multiple benchmarks using APIs for my work, and the inference-time compute allotted has a clear correlation with the metrics. But time of day certainly doesn't. If it were that straightforward, people could prove it very easily rather than relying on anecdotes.
They either overprovision servers during low demand, or they dynamically provision servers based on load.
Yes, every time I see some variant of this come up (and believe me, this has been coming up since before the GPT3.5 days) there’s never any actual data demonstrating that it’s the case. As you say, it should be completely trivial to run the exact same prompt multiple times per day and capture the output to demonstrate this.
But no one ever seems to do that; they are content to “feel” that this is the case instead.
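The test proposed above is easy to sketch: score the same prompt repeatedly at different times of day and compare the means. The numbers below are invented placeholders, not real benchmark data.

```rust
// Hypothetical sketch of the proposed experiment: run an identical
// benchmark prompt at peak and off-peak hours, then compare mean scores.
// The score arrays are invented placeholders, not real measurements.
fn mean(xs: &[f64]) -> f64 {
    xs.iter().sum::<f64>() / xs.len() as f64
}

fn main() {
    // Scores from running the same prompt during high-traffic hours.
    let peak_hours = [0.78, 0.80, 0.79, 0.81];
    // Scores from the same prompt during low-traffic hours.
    let off_hours = [0.80, 0.82, 0.81, 0.79];
    let gap = (mean(&peak_hours) - mean(&off_hours)).abs();
    // A real analysis would use many more samples and a proper
    // significance test; a large, consistent gap would support the
    // degradation claim, a near-zero one would undercut it.
    println!("mean gap: {:.3}", gap);
}
```

With enough samples this turns "feel" into data either way, which is the point both commenters are making.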
For what it's worth, I have lived in, and currently spend a lot of time in, both places. You're both very obviously wrong.
There is a serious problem in the US. There is also a serious (though different) problem in the UK. The problem in the US is the chilling effect of the vindictiveness and lawlessness of the current regime. I will not elaborate on this because it's too complicated to communicate effectively in a forum post.
The problem in the UK is a set of vaguely and arbitrarily specified-and-enforced laws that enable the criminalization of "grossly offensive" speech. There is no statutory definition of what constitutes a "grossly offensive" communication -- all enforcement is arbitrary and thus can be abused. Whether it is actually abused in any widespread fashion is irrelevant.
- Communications Act 2003 (Section 127): Makes it an offense to send messages via public electronic networks (internet, phone, social media) that are "grossly offensive," indecent, obscene, or menacing, or to cause annoyance/anxiety.
- Malicious Communications Act 1988 (Section 1): Applies to sending letters or electronic communications with the purpose of causing distress or anxiety, containing indecent or grossly offensive content.
I'm still not quite sure how UK law impacts the US. I was hoping for explicit examples of someone actually being removed from power because they were critical of the president. I think that would be pretty big news and the closest I have heard was one of the ex-military standing congresspeople being threatened with reduced military benefits, or legal action, but not actually anyone being removed from a position.
Another (higher profile) example is the baseless threats of criminal indictment against Jerome Powell -- it is impossible to argue that these threats were made for any reason other than that he, as a nonpartisan official, defied the president's demands to execute his duties as Fed chair in such a way (that is, poorly) as to put a temporary thumb on the scale for the current admin.
The more important question, I think, is how many folk in explicitly nonpartisan functions are choosing not to break step with the current admin for fear of some sort of (likely professional) reprisal. I'm not alleging that they're disappearing dissenters or anything that inflammatory, but it would be intellectually dishonest to contend that there isn't a long, well-documented trail of malfeasance here.
The sycophancy is obviously intentional. People are vulnerable to it, and addiction is profitable. It has nothing to do with the nature of LLMs and everything to do with user engagement metrics.
You can certainly do it with RAII. However, what if a language lacks RAII because it prioritizes explicit code execution? Or simply wants to retain simple C semantics?
Because that is the context. It is the constraint that C3, C, Odin, Zig, etc. maintain, where RAII is out of the question.
Ok then I understand what you mean (I couldn't respond directly to your answer, maybe there is a limit to nesting in HN?).
Let me respond in some more detail then, to at least answer why C3 doesn't have RAII: it tries to follow the principle that data is inert. That is – data doesn't have behaviour in itself, but is acted on by functions. (Even though C3 has methods, they are more of a namespacing detail, allowing methods that derive data from a value or mutate it. They are not intended as organizational units.)
To simplify what the goal is: it should be possible to create or destroy data in bulk, without executing code for each individual element. If you create 10000 objects in a single allocation, it should be as cheap to free (or create) as a single object.
We can imagine things built into the type system, but then we will need these unsafe constructs where a type is converted from its "unsafe" creation to its "managed" type.
I did look at various cheap ways of doing this through the type system, but it stopped resembling C and seemed to put the focus on resource management rather than the problem at hand.
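The "inert data" goal above can be illustrated in Rust rather than C3: when element types carry no destructor, creating and freeing a large batch costs one allocation and one deallocation, with no per-element code. The `Particle` type is invented for illustration.

```rust
// Sketch of the "inert data" model: plain-old-data values have no
// destructor, so 10_000 of them are created with one allocation and
// freed with one deallocation -- no per-element code runs.
#[derive(Clone, Copy)]
struct Particle {
    x: f32,
    y: f32,
}

fn main() {
    // One allocation covers all 10_000 elements.
    let particles = vec![Particle { x: 0.0, y: 0.0 }; 10_000];
    assert_eq!(particles.len(), 10_000);
    // Dropping the Vec is a single free: Particle has no Drop impl,
    // so destroying the batch is as cheap as destroying one value.
    drop(particles);
    println!("freed in bulk");
}
```

The moment elements gain destructors (RAII), that bulk free turns into 10,000 destructor calls, which is exactly the cost model these languages are trying to avoid.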
The idea is, you could have a language like Rust, but with linear rather than affine types. Such a language would have RAII-like idioms, but no implicit destructors; instead, it'd be a compile-time error to have a non-Copy local variable whose value is not always moved out of it before its scope ends (i.e., to write code that in Rust could include an implicit destructor call). So you would have explicit deallocation functions like in C, but unlike in C you could not have resource leaks from forgetting to call them, because the compiler would not let you.
To the extent that you subscribe to a principle like "invisible function calls are never okay", this solves that without undermining Rust's safety story more broadly. I have no idea whether proponents of "better C" type languages have this as their core rationale; I personally don't see the appeal of that flavor of language design.
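Rust itself cannot express this linearity, but it can approximate it at runtime. The hypothetical `Resource` type below (invented for illustration) has a consuming `close()` as its only sanctioned destructor; a `Drop` impl turns a forgotten call into a panic, where a truly linear type system would reject the program at compile time instead.

```rust
// Runtime approximation of a linear type: the only sanctioned way to
// destroy a Resource is the consuming close() method. Forgetting it
// panics at scope exit -- a stand-in for the compile-time error a
// linear type system would give. This type is hypothetical.
struct Resource {
    id: u32,
    closed: bool,
}

impl Resource {
    fn open(id: u32) -> Self {
        Resource { id, closed: false }
    }

    // Explicit deallocation function: consumes self by value.
    fn close(mut self) {
        self.closed = true;
        // Release the underlying resource here.
    }
}

impl Drop for Resource {
    fn drop(&mut self) {
        // With linear types this branch would be unreachable: the
        // compiler would reject any path that drops without close().
        if !self.closed {
            panic!("Resource {} leaked: close() was never called", self.id);
        }
    }
}

fn main() {
    let r = Resource::open(1);
    r.close(); // deleting this line panics when r goes out of scope
    println!("ok");
}
```

The deallocation call stays explicit and visible, as the "better C" camp wants, but forgetting it is caught rather than silently leaking.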
It is about types that can't be copied and can't go out of scope; the only way to destroy them is to call one of their destructors. This is compile-time checkable.
In theory they can solve a lot of problems easily, mainly resource management. It also generalizes C++'s RAII and is similar to Rust's ownership model.
In practice they aren't supported in any mainstream programming language yet.
I'd keep in mind that internet usage of '96 (I was there) bears no resemblance whatsoever to internet usage today. The level of predatory sophistication of today's attention economy makes any sort of comparison between the two misguided at best.
Yes, but complaints about my generation sitting in front of computers were not that much different from my generation's complaints now about the next generation being on social media.
As opposed to taking like 30 seconds to install cargo and rust?
I get that the elegant thing to do would be to bootstrap this, but in practice does this actually cost you anything, or is this a purely aesthetic concern?
> As opposed to taking like 30 seconds to install cargo and rust?
I think you're oblivious to the problem domain. C and C++ projects are tightly coupled with build systems. If you are not smack in the middle of the happy path, you will experience problems. Having to onboard an external language and an obscure toolset just to be able to start a hello world is somewhere between a hard sell and an automatic rejection.
I recently tried Cursor for about a week and I was disappointed. It was useful for generating code that someone else has definitely written before (boilerplate etc), but any time I tried to do something nontrivial, it failed no matter how much poking, prodding, and thoughtful prompting I tried.
Even when I tried to ask it for stuff like refactoring a relatively simple Rust file to be more idiomatic or organized, it consistently generated code that did not compile and was unable to fix the compile errors after 5 or 6 repromptings.
For what it's worth, a lot of SWE work is technically trivial -- it makes that work much quicker, so there's obviously some value there, but if we're comparing it to a pair programmer, I would definitely fire a dev who had this sort of extremely limited complexity ceiling.
It really feels to me (just vibes, obviously not scientific) like it is good at interpolating between things in its training set, but is not really able to do anything more than that. Presumably this will get better over time.
If you asked a junior developer to refactor a rust program to be more idiomatic, how long would you expect that to take? Would you expect the work to compile on the first try?
I love Cline and Copilot. If you carefully specify your task, provide context for uncommon APIs, and keep the scope limited, then the results are often very good. It’s code completion for whole classes and methods or whole utility scripts for common use cases.
> If you asked a junior developer to refactor a rust program to be more idiomatic, how long would you expect that to take? Would you expect the work to compile on the first try?
The purpose of giving that task to a junior dev isn't to get the task done, it's to teach them -- I will almost always be at least an order of magnitude faster than a junior for any given task. I don't expect juniors to be similarly productive to me, I expect them to learn.
The parent comment also referred to a 'competent pair programmer', not a junior dev.
My point was that for the tasks I wanted to use the LLM for, frequently there was no amount of specificity that could help the model solve them -- I tried for a long time, and if the task wasn't obvious to me, the model generally could not solve it. I'd end up in a game of trying to do nondeterministic/fuzzy programming in English instead of just writing some code to solve the problem.
Again I agree that there is significant value here, because there is a ton of SWE work that is technically trivial, boring, and just eats up time. It's also super helpful as a natural-language info-lookup interface.
I (like a very large plurality, maybe even a majority, of devs) do not work for a consulting firm. There is no client.
I've done consulting work in the past, though. Any leader who does not take into account (at least to some degree) relative educational value of assignments when staffing projects is invariably a bad leader.
All work is training for a junior. In this context, the idea that you can't ethically train a junior "on a client's dime" is exactly equivalent to saying that you can't ever ethically staff juniors on a consulting project -- that's a ridiculous notion. The work is going to get done, but a junior obviously isn't going to be as fast as I am at any task.
What matters here is the communication overhead, not how long between responses. If I’m indefinitely spending more time handholding a jr dev than they save me, eventually I just fire ’em; same with code gen.
A big difference is that the jr. dev is learning, compared to the AI, which is stuck at whatever competence was baked in at the factory. You might be more patient with the jr if you saw positive signs that the handholding was paying off.
That was my point, though I may not have been clear.
Most people do get better over time, but for those who don’t (or LLMs) it’s just a question of whether their current skills are a net benefit.
I do expect future AI to improve. My expectation is it’s going to be a long slow slog just like with self driving cars etc, but novel approaches regularly turn extremely difficult problems into seemingly trivial exercises.
Without commenting on the (important) political or reputational considerations here, I want to talk a bit about the operational risk presented by this practice. There is a somewhat sizable "So what? Signal is e2e encrypted. Nothing bad happened and you're all overreacting." narrative floating around. (not so much in this thread, but in the general discourse)
If this operation was planned in Signal, then so were countless others (and presumably so would countless others be in the future).
If not for this journalist, this would likely have continued indefinitely. We have high confidence that at least some of the officials were doing this on their personal phones. (Gabbard refused to deny this in the congressional hearing -- it does not stand to reason that she'd refuse unless she was, in fact, using her personal phone.)
At some point in the administration, it's likely that at least one of their personal phones will be compromised (Pegasus, etc). E2E encryption isn't much use if the phone itself is compromised. This is why we have SCIFs.
There was no operational fallout from this particular screwup, but if this practice were to continue, it's all but certain that an adversary would, at some point, compromise these communications. Not through being accidentally invited to the chat rooms, but through compromise of the participants' hardware. An APT could have advance notice of all manner of confidential and natsec-critical plans.
In all likelihood this would lead to failed operations and casualties. The criticism/pushback on this is absolutely justified.
Or not even the device: The other reason we have SCIFs is they provide a secure location. These personal devices could have been in use anywhere, including places where they were subject to observation. Including but not limited to Moscow. :)
Something I haven't seen discussed is that you can get the information from Signal without compromising the phone or the person. Just reading the texts "over the shoulder" would be enough of a leak. Being in Moscow is bad, but even a Starbucks has security cameras good enough to read text on a phone. A SCIF would fix that.
I agree with all of this, my only quibble is that I would bet there have already been costs associated with this idiocy. Hostile powers knew going in that this would be an incompetently run administration and I'm sure were looking at gaining access to personal devices out of the gate. It's possible that a great many highly sensitive conversations have already been read by adversaries. I also expect that similar sloppiness like adding the wrong person to a Signal chat has already happened without being reported on.
Yes, this was one of the main points on infosec Mastodon today. While everyone is aware enough to be concerned with encryption over the wire, it's the endpoints that matter. Personal Android devices capable of running Signal are going to be some of the easiest to compromise for a sufficiently motivated attacker. I've seen n00b cops do it for drug gangs here. There's no question that Russia, China, et al. can do it just as well, and we have as good as confirmation that that's what's going on in at least Tulsi Gabbard's case.
Not on Android. You can set your Signal PIN, which is a recovery code for if you lose your phone and are locked out of your Signal account. You cannot change the lock screen PIN, which is the same as that of your phone.
I suspect we won't know the true damage until all these people are gone, kind of like how Apollo 13 didn't know the true damage to the service module until they jettisoned it.
My prediction is, given the way the narrative is shifting to digging in their heels and insisting they did nothing wrong, the lesson they are learning from all this is that they should have hid their activity better. Nothing will happen to them, they will continue with impunity, and they'll just be more careful about not inviting outsiders. I suspect this isn't the last leaked top-secret group chat we'll see.
This assertion is sharply undercut by the facts. I have an incredibly hard time believing that you're engaging in good faith here.
There is literally zero evidence whatsoever that Russia cares about 'equality for ordinary people' and a mountain of conclusive proof that it does not.
Ukraine did not owe Russia anything at all, so these 'negotiations' were nothing more than theater. Russia gave Ukraine the choice between either surrendering their sovereignty (for literally zero benefit in exchange) or being invaded. That is not a negotiation, that's state-sponsored terrorism.
For example, it is clear that some Ukrainian nationalists committed bloody crimes before the war, even if Russian media exaggerates them. Even the European Court of Justice has acknowledged crimes on the Ukrainian side.
I'm Russian, but this is my real opinion. I don't get paid anything for it. And I understand that not all Russian (government) actions are good; some were incorrect or questionable. Russia just doesn't want NATO expansion to the East, even without transparent referendums. It's all very complicated in reality; in war, no side is perfectly correct and right and clean... :(
Why does Russia have any right to say whether sovereign countries on its borders join NATO or not?
The only reason Russia cares is because it wants to continue controlling them -- not because it's worried about the mythical NATO invasion of Russia its news and leader trumpet.
And in contrast, the only reasons those countries want to join NATO is because they're scared of Russia invading them, which it historically has. (See: Finland and eastern Europe)
Why are the US and EU worried about a nuclear weapon in Iran? (I've exaggerated a bit here for an example.)
NATO has more troops and equipment than Russia; it does not need to be afraid of Russia, and it seeks to expand even more.
To be sure about even the majority opinion in Finland on joining NATO, for such serious questions you need referendum data, but there was no such referendum. Even supporters of the West are not always in favor of joining NATO, a purely military and not solely defensive alliance.
Yes, the USSR invading Finland in the Soviet-Finnish war was bad; the USSR offered Finland territory in exchange before the war, but unfortunately it did not seem very profitable. But then, during WW2, Finland fought for most of the time on the side of the German Axis coalition. And Finland did not fight entirely cleanly either: it committed crimes, creating concentration camps to isolate peoples not ethnically related to Finns ("non-indigenous peoples") and to move them out of the territories where they had lived all their lives; many people died in these camps, and there is some evidence of crimes committed in them. If someone wants to take something away from you, for example part of your territory, is it adequate to ask for help from a notorious bandit (Hitler) who burns people? Such a question has no good answer.
I'm not a one-sided propagandist. I just want more people to try to see things from all sides and analyze more information. Maybe I'm wrong.
In countries where a very significant part of the population is Russian-speaking and sympathetic to Russia, Russia wants the opinion of those Russian-speaking people to be taken into account: that they not be forbidden to speak Russian and study it in schools. Yes, sometimes these reasonable demands get exaggerated. But I recognize that such countries have the right to require that all official documents be in the main language and that officials know the main language. I don't think Russia wants full control of these countries. Russia wants to trade and interact economically with them, and not simply have all Russian goods blocked or subjected to huge duties without reason.
Sorry for the wall of text. And I may be mistaken on some points.
You need to rethink your information environment, you are repeating many false claims that I recognise from past propaganda.
For instance your view of NATO membership is fundamentally flawed as it assumes a NATO push to take on more members, when in reality even the most shallow research shows that it was actually based on a pull from countries who lobbied to be able to join NATO and had to jump through hoops to qualify.
Why did those countries want to join NATO? Because they recognised that, alone, they were vulnerable to what’s clearly a revanchist Russia looking to annex or otherwise control other countries in the region. By being part of a broad security alliance like NATO those countries made themselves safer from Russian attacks.
As for Russian speakers in Ukraine, I know many Ukrainians, most of them from the east, who learnt Russian as a first language. All but one of them absolutely detest Russia, have nothing good to say about Russians in general, whom they see as complicit, and have become even more fiercely pro-Ukrainian and patriotic than they were before the war. Many have chosen to speak Ukrainian primarily, despite it being their second language.
And why wouldn’t they? Russia’s invasion destroyed their homes and their way of life, levelling entire cities, and killed tens of thousands of Ukrainians. The idea that all of this was done in their name or to their benefit is insulting.