So is gpt-5.2 with reasoning set to 'none' maybe identical to gpt-5.2-chat-latest in capabilities, but perhaps with a different system prompt? I notice chat-latest doesn't accept temperature or reasoning parameters (which makes sense), so something is certainly different underneath?
> ChatGPT is widely used for practical guidance, information seeking, and writing, which together make up nearly 80% of usage. Non-work queries now dominate (70%). Writing is the main work task, mostly editing user text. Users are younger, increasingly female, global, and adoption is growing fastest in lower-income countries
That's funny, the way I interpreted this sentence is that usage was already high among older, male users in high-income countries, so most of the new users are coming from outside these demographics. Which, ironically, is the exact opposite of what you're saying.
You read "Users are younger, increasingly female, global, and adoption is growing fastest in lower-income countries" and gathered that "Young moms with no money in poor countries use this product the most". Do I really need to spell out the fact that you completely failed to understand basic English here?
If they mostly ask how to raise their children and follow the received advice... Then yeah, in some 20 years we'll see what kind of return we get. People raised on social media are one thing; people raised by (with the assistance of) ChatGPT may be even worse off because of it.
My initial assumption would be that there are a lot, likely a majority, of parents who have had next to no advice on how to raise kids. Furthermore, I would posit that many of them were not raised in particularly nurturing circumstances themselves.
As such, I would expect that the advice ChatGPT gives (i.e. an average of parenting advice blogs and forums) would, on average, result in better parenting.
That's obviously not to say that ChatGPT gives great advice, but that the bar is very low already.
You're right, as much as I'd like not to be aware of it. Indeed, the bar is very low.
Whether heeding ChatGPT advice would be better or worse than no advice at all, I honestly cannot say. On the one hand, getting some advice would probably help in many, many cases - there's a lot of low-hanging fruit here; on the other, low-quality advice has the potential to ruin the lives of multiple people at any moment. This is like medical or legal advice: very high stakes in many cases. Should we rely on a model that doesn't really understand the underlying logic for advice on such matters? The "average" of parenting blogs can be a mish-mash of different philosophies or approaches glued together, making up something that sounds plausible but leads to catastrophic results years or decades later.
I don't know. Parenting is a complex problem in itself; then you have people generally not looking for advice or being unable to recognize good advice. It doesn't look like adding a hallucinating AI model to the mix would help much, but I may be wrong on this. I guess we'll find out the hard way: through people trying it out (or not) and then living with the consequences (if any).
No amount of LinkedIn speak can fix the poor part of it.
In 2025, it's abundantly clear that the mask is off. Only the whales matter in video games. Only the top donors matter in donation funding. Modern laptops with GPUs are all $2k+ machines. Luxury condos are everywhere. McDonald's revenues and profits are up despite pricing out a lot of low-income people.
The poor have less of the nothing they already have. You can make a hundred affordable cars, or get as much profit, if not orders of magnitude more, with just one luxury vehicle sale.
Most political donors are $25/month Actblue donations, and it doesn't matter because the campaigns with the most donations regularly lose.
> McDonald's revenues and profits are up despite pricing out a lot of low-income people.
They didn't really raise prices, they just put coupons in the app.
> Luxury condos are everywhere.
Houses don't cost more because they have "luxury" features. A nicer countertop doesn't hypnotize people into paying more for a house. Prices are negotiated between buyer and seller and most of the development cost is the land price.
> The poor have less of the nothing they already have.
Wage inequality in the US is lower than it was in 2019. In general income inequality hasn't increased since 2014.
You have no idea if they’re ambitious or educated. Absolutely no idea. Is it just commonplace to inject “facts” into conjecture? Comes off as desperate.
I get a lot of productivity out of LLMs so far, which for me is a simple good sign. I can get a lot done in a shorter time and it's not just using them as autocomplete. There is this nagging doubt that there's some debt to pay one day when it has too loose a leash, but LLMs aren't alone in that problem.
One thing I've done with some success is use a Test Driven Development methodology with Claude Sonnet (or recently GPT-5): moving the feature forward in discrete steps, writing initial tests and working within the red/green loop. I don't see a lot written or discussed about that approach so far, but then reading Martin's article made me realize that the people most proficient with TDD are not really in the Venn diagram intersection of those wanting to throw themselves wholeheartedly into agentic coding with LLMs. The 'super clippy' autocomplete is not the interesting way to use them; it's with multiple agents and prompt techniques at different abstraction levels that you can really cook with gas. Many TDD experts take great pride in the art of code, communicating like a human and holding the abstractions in their head, so we might not get good guidance from the same set of people who helped us before. I think there's a nice green field of 'how to write software' lessons coming up with these tools, with many cautionary tales and lessons being learnt right now.
It feels like the TDD/LLM connection is implied — "and also generate tests". Though it's not canonical TDD, of course. I wonder if it'll turn the tide towards tech that's easier to test automatically, like maybe SSR instead of React.
Yep, it's great for generating tests, and so much of that is boilerplate that it feels like great value. As a super lazy developer, having all that mechanical 'stuff' spat out for me takes a real burden off. Test code feels like lighter baggage when it's just churned out as part of the process, as in there's no guilt in deleting it all when what you want to do changes. That in itself is nice. Plus of course MCP tools (Playwright etc.) are great for integration testing.
But like you said, I meant it more as TDD in the 'test first' sense: a sort of 'prompt-as-spec' that produces the test/spec code first, which you then iterate on. The code design itself comes out different, influenced by how it is prompted to be testable. So rather than going 'prompt -> code', there's an in-between stage of prompting the test first and then evolving it, making sure the agent plays the game of only writing testable code, and automating the 'gate' of passing tests before expanding something. A 'prompt -> spec -> code' loop, repeated until shipped.
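To make that loop concrete, here's a minimal sketch of the 'spec first' step. The function, file names and Vitest setup are all hypothetical, just to show the shape of what the agent produces before any implementation exists:

```typescript
// Hypothetical 'spec first' file the agent writes at the red stage;
// slugify and the file layout are made-up examples, not a real codebase.
import { describe, it, expect } from "vitest";
import { slugify } from "./slugify"; // not implemented yet at this point

describe("slugify", () => {
  it("lowercases and hyphenates words", () => {
    expect(slugify("Hello World")).toBe("hello-world");
  });

  it("drops characters that are not URL-safe", () => {
    expect(slugify("Hello, World!")).toBe("hello-world");
  });
});
```

The agent only gets to touch the implementation once a spec like this exists and is red, and the gate is simply "make these pass without editing the spec".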
The only thing I dislike is what it chooses to test when asked to just "generate tests for X": it often builds those "straitjacket for your code" style tests which aren't actually useful for catching bugs; they just act as "any change now makes this red".
As a simple example, a "buildUrl" style function that used one particular host for prod and a different host for staging (based on an "environment" argument) had that argument "tested" by exactly comparing the function's entire return string, encoding all the extra functionality into it (which was tested earlier anyway).
A better output would be to check startsWith(prodHost) or similar, which is what I changed it into, but I'm still trying to work out how to get coding agents to do that on the first or second attempt.
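Roughly, the difference looks like this (buildUrl, the hosts and the extra query params here are all made up for illustration, not the actual code):

```typescript
import { it, expect } from "vitest";

// Made-up stand-in for the real buildUrl: picks a host per environment,
// then appends the path plus some defaults that other tests already cover.
function buildUrl(env: "prod" | "staging", path: string): string {
  const host =
    env === "prod" ? "https://www.example.com" : "https://staging.example.com";
  return `${host}${path}&locale=en&v=2`;
}

it("uses the prod host for the prod environment", () => {
  // The "straitjacket" style the agent tends to generate: any change to the
  // unrelated locale/version defaults turns this red too.
  expect(buildUrl("prod", "/search?q=x")).toBe(
    "https://www.example.com/search?q=x&locale=en&v=2"
  );

  // The narrower assertion I rewrite it into: only checks what this test
  // is actually about.
  expect(
    buildUrl("prod", "/search?q=x").startsWith("https://www.example.com")
  ).toBe(true);
});
```

The second assertion survives unrelated changes to the defaults; the first one turns red on any of them.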
But that's also not surprising: people write those kinds of too-narrow, not-useful tests all the time; the codebase I work on is littered with them!
LLMs (Sonnet, Gemini from what I tested) tend to “fix” failing tests by either removing them outright or tweaking the assertions just enough to make them pass. The opposite happens too - sometimes they change the actual logic when what really needs updating is the test.
In short, LLMs often get confused about where the problem lies: the code under test or the test itself. And no amount of context engineering seems to solve that.
I think part of the issue is that the LLM does not have enough context. Whether the bug is in the test or in the implementation depends purely on the requirements, which are often not in the source code but stored somewhere else (ticket system, documentation platform).
Without providing the actual feature requirements to the LLM (or the developer), it is impossible to determine which is wrong.
Which is why I also think it is sort of stupid to have the LLM generate tests by just giving it access to the implementation. That is at best testing the implementation as it is, but tests should be based on the requirements.
Oh, absolutely, context matters a lot. But the thing is, they still fail even with solid context.
Before I let an agent touch code, I spell out the issue/feature and have it write two markdown files - strategy.md and progress.md (with the execution order of changes) inside a feat_{id} directory. Once I’m happy with those, I wipe the context and start fresh: feed it the original feature definition + the docs, then tell it to implement by pulling in the right source code context. So by the time any code gets touched, there’s already ~80k tokens in play. And yet, the same confusion frequently happens.
Even if I flat out say “the issue is in the test/logic”, even if I point out _exactly_ what the issue is, it just apologizes and loops.
At that point I stop it, make it record the failure in the markdown doc, reset context, and let it reload the feature plus the previous agent’s failure. Occasionally that works, but usually once it’s in that state, I have to step in and do it myself.
Not sure. If the Flash image output is $30/M [1] then that's pretty similar to gpt-image-1 costs. So a faster and better model perhaps but not really cheaper?
Since I can't edit: it seems like Flash image is about 23% (4 cents vs 17 cents) of the cost of OpenAI's gpt-image-1, if you're putting an image and prompt in and getting out, say, a 1024x1024 generated image. With the quicker generation time, that makes it interesting. I expect OpenAI to respond at least on pricing, e.g. with a flat-rate output price cap or something comparable.
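For anyone checking the maths (and treat the per-image token figure as my assumption from reading the pricing pages): at $30/M output tokens with an image billed as roughly 1290 tokens, that's 30 × 1290 / 1,000,000 ≈ $0.039 per image, versus roughly $0.17 for a high-quality 1024x1024 gpt-image-1 output, and 0.039 / 0.17 ≈ 0.23, hence the ~23%.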
Maybe if there was some political will for building stuff but there isn't. Canada should be an absolute AI and energy powerhouse, but our politicians are some of the most incompetent buffoons on the planet.
I don't know enough about Canada to know if this is a reasonable take or not, but I think you'd get downvoted less if you took a few sentences to articulate what the politicians' main failings are.
That would take more than a few sentences, but in general there is a lack of willingness to build new infrastructure. Canada has endless opportunities to both export energy (and not just oil!) and use it domestically - we should be utilising this untapped potential to build datacenters and invest in AI companies and research. Instead we can't even build new houses or hospitals for our exploding population.
So the question becomes whether these countries truly want to move off of the platform, or if this is all more of a bargaining chip in the trade negotiations.
JD Vance pretty much single-handedly destroyed most trust in the US with his speech at the Munich Security Conference. Europe (and probably Canada and Australia) was shaken for days after it and realized that the US is not a reliable ally (or even an ally at all) anymore. This was confirmed by the disastrous meeting with Zelensky in the White House and by the US halting intelligence sharing with Ukraine and F-16 updates (F-16s which were provided by European countries, not the US).
The pathetic little show you saw at the White House last week (with Macron, Merz, etc.) is just a strategy to appease the US as long as needed so that Europe can speed up its own weapons production, increase independence, etc. It's damage control. The reason countries have stopped buying the F-35 is that nobody trusts the US anymore. And one or two sane presidents are not going to fix it (the US elected Trump a second time, after all).
It is interesting how it is basically an indictment of the American people's ability to manage their hard and soft power and military capability. That being said, populist right-wing movements are taking root in Europe as well. This threatens long-term strategic planning in general, not just with the US, when critical positions of world power are replaced every few years by a subset of the population increasingly susceptible to propaganda amplified by technology. In some ways, regimes like North Korea are the most stable on earth due to careful control of the reins of power and the lack of any possible inroads for third-party influence.
It's crazy that you're acting like this is some kind of policy failure for the US, when this administration has been telling Europe it shouldn't rely on the United States at this level. This isn't some "gotcha" that you're describing; it's exactly what the administration wanted Europe to do. Wake up and start innovating instead of being the Disneyland for American tourists.
Us Europeans are just baffled by the fact that this ‘administration’ wants this. The EU is a big economy that’s relatively easy to deal with. Why would you alienate us?
But yeah, so far Trump has been relatively true to his word, as far as it goes. Not so much in practice, but in going further down the road of a, dare I say, fascist outlook. I think Europeans still can’t believe it’s happening, much less that it's intentional.
You are effectively saying that Europe should be a vassal state to the US and cannot have its own laws. Europe has a different vision of privacy and competition. The regulatory asks for e.g. Apple are peanuts compared to what China asks. Apple bends over backwards to please China, but if Europe has some requirements for doing business, parts of the US trot out the tired trope "US innovates, Europe regulates".
We have too many problems at home to be daddy with a credit card.
First, this is rich for a country living on borrowed money (that they can only get away with because the rest of the world uses it as the default currency).
Second, a lot of the problems of the US are caused by the lack of proper wealth redistribution and the lack of efficient health care (no, the US doesn’t subsidize European healthcare; European countries spend far less on healthcare with better outcomes). None of this is solved by throwing lifelong allies under the bus and trading them for some dictator friends.
Finally, the security situation also arose because the US did not want European militaries to become too powerful and pushed hard for them to be dependent on the US and on US tech. For instance, countries have to buy US fighters for nuclear sharing, etc. The primary exception is France, because they never wanted to be reliant and have their own nuclear force, etc.
Also, let’s not forget that Article 5 was only invoked once (by the US) and we were happy to help, because that’s what friends do. We were in Afghanistan for over 20 years as a result, and a lot of our soldiers were killed or injured.
> The fact that you can't understand this is exactly why this administration is doing this.
> because you believe that excellence is not worthy of being rewarded. Your culture has the mindset that excellence is not a product of hard work and determination, it's a product of luck and nepotism, so any hint of excellence gets taken away and diminished.
This administration does not believe in rewarding excellence, hard work, or determination. It’s an administration run by the most malicious, incompetent people who have ever led this country.
I don't know why Europe wants so badly to be reliant on the US. It's bad for them, it's bad for us. It's embarrassing for Europe that Ukraine is relying on the US instead of Europe for defense. It's embarrassing for Europe how little they contribute to NATO. The US isn't a partner, it's a caretaker. And as they say, if someone provides what you need, they also have the power to take it away.
Outsourcing your defense is stuupiiid.
Europe should be thanking Trump for waking them up to the reality that has always been the case through his boorish negotiation.
Defense is a bit like advertising or finance. It has some aspects of a zero-sum game and a negative-sum game. All the money you invest in it is wasted. But if your enemy/competitor chooses to waste more money, you may be in trouble.
From a European perspective, the entire purpose of NATO from 1992 to 2022 was to avoid wasting too much money on defense. Because, for some reason, the Americans were willing to do it instead.
Then Russia invaded Ukraine, and the calculus changed. Now European countries are rebuilding their defensive capabilities, while Russia is still bogged down in Ukraine. Given the lack of credible short-term threats, limiting defense spending was clearly the right choice until 2022.
Also, it makes sense to have a capability only once within an alliance. If the US has the command, space, and air capabilities, why would anyone else need to have them? You can add to their capability by buying F-35s and hosting their air bases.
Now that we are not allies anymore, we need to wastefully build up our own command, space, and air capabilities, resulting in duplicated effort.
>> Ukraine is relying on the US instead of Europe for defense.
Is it? Especially in 2025.
It is embarrassing how little (very old) heavy equipment the USA provided to Ukraine. North Macedonia provided the same number of main battle tanks as the USA; Poland provided ten times more. And zero fighter jets.
Anyway, the people of Ukraine are thankful for any support, and the USA was the biggest donor during the first years of the war.
So it’s true that Europe/Canada spent less, but it comes with a big fat asterisk: the US also wants to project power in the Pacific/Asia, whereas European defense is primarily focused on deterring Russian aggression (+ peace missions + supporting the US in various operations to give them more legitimacy).
> Europe should be thanking Trump for waking them up to the reality that has always been the case through his boorish negotiation.
That credit should go to Putin; European spending has grown rapidly since the annexation of Crimea.
The credit Trump should get: pushing us to stop buying US weapons as quickly as we can and to focus on non-US alternatives. It’s going to take a while, but US materiel has certainly become less attractive.
The most fun one I've used is as my home environment in VR. In 3D it's a weird feeling to walk around and see how all the old sight lines are. I still duck a bit walking past mid doors :)
I had a mild addiction to this game about 7 or so years ago. Purely casual but lots of hours. I found it sort of a stress relief.
On the upside it gave me all sorts of free items as in-game 'drops'. I ignored them all at the time as I didn't care about buying keys or cosmetics. Last year I saw that they're worth a bunch of money now (!), about $1500 if sold on the Steam marketplace. I got a Steam Deck with money from some of them, and it's basically my C:S 401k for Steam games. What a weird world.
Can you pick thinking models with this or is that implied?
GPT-5 seems a bit slow so far (in terms of deciding and awareness). I’ve gone from waiting for a compiler, to waiting for assets to build to now waiting for an agent to decide what to do - progress I guess :)
For me, most of the performance issues with MSFS24 now come down to being VRAM limited. When they went to MSFS 2024 they rewrote for DX12 and, while doing that, upgraded a few things to look nicer. The texture management still seems to need some work.
This means that my 9800x3D/3080Ti 12GB sort of runs out of VRAM and pages when used in VR or on a 4K desktop. I'm in the position where, for the same visuals (scenery/aircraft etc.), the newer MSFS2024 just looks generally worse and runs at a lower framerate than MSFS2020 (using DX11). In VR a bad framerate makes things unplayable. For desktop use you have DLSS, which helps a lot, but in VR that blurry movement really impacts clarity.
DLSS also blurs the cockpit displays quite badly when there's anything moving on them (airspeed/altitude tape, etc.). It looks like temporal blur, which is interesting because the same blur doesn't happen with their TAA (*temporal* anti-aliasing) implementation.
Yeah, DLSS looks great outside the window and for people enjoying GA and VFR it's all good. For airliners with digital displays it is harder to use, as like you said it blurs. They did talk about using some form of stencil/exclusion around cockpit displays but I think that didn't go anywhere as yet.
As a team they have a pretty tough job because the audience is all over the place for a title like that, as in making the 'I can see my house!' people happy vs 'My pressurization gauge cross-bleed reading is below the Boeing B738 manual official figure, unplayable'.
That describes what I've seen. When I first compared 2020 and 2024 in as apples-to-apples a way as I could, it seemed like 2024's frame rate was about a third lower than 2020's. This was on a 7900 XTX with 24 gigs of VRAM.