
Are gpt-5.2 and gpt-5.2-chat-latest the same token price? Isn't the latter non-thinking and more akin to -nano or -mini?


No. It is the same model without reasoning.


So is gpt-5.2 with reasoning set to 'none' maybe identical to gpt-5.2-chat-latest in capabilities, just with a different system prompt? I notice chat-latest doesn't accept temperature or reasoning parameters (which makes sense), so something is certainly different underneath.


Not going to read all that.. ;)

> ChatGPT is widely used for practical guidance, information seeking, and writing, which together make up nearly 80% of usage. Non-work queries now dominate (70%). Writing is the main work task, mostly editing user text. Users are younger, increasingly female, global, and adoption is growing fastest in lower-income countries


> Users are younger, increasingly female, global, and adoption is growing fastest in lower-income countries

Young moms with no money in poor countries use this product the most. I bet that was fun news to deliver up the chain.


That's funny, the way I interpreted this sentence is that usage was already high among older, male users in high-income countries, so most of the new users are coming from outside those demographics. Which, ironically, is the exact opposite of what you're saying.


That’s funny, you miscomprehended English.


You read "Users are younger, increasingly female, global, and adoption is growing fastest in lower-income countries" and gathered that "Young moms with no money in poor countries use this product the most". Do I really need to spell out the fact that you completely failed to understand basic English here?


Surely this user base can make back the hundreds of billions of dollars they invested in it.


If they mostly ask how to raise their children and follow the received advice... Then yeah, in some 20 years we'll see what kind of return we get. People raised on social media are one thing; people raised by (with the assistance of) ChatGPT may be even worse off because of it.


Interesting. Do you really think that?

My initial assumption would be that there are a lot, likely a majority, of parents who have had next to no advice on how to raise kids. Furthermore, I would posit that many of them were not raised in particularly nurturing circumstances themselves.

As such, I would expect that the advice ChatGPT gives (i.e. an average of parenting advice blogs and forums) would, on average, result in better parenting.

That's obviously not to say that ChatGPT gives great advice, but that the bar is very low already.


You're right, as much as I'd like not to be aware of it. Indeed, the bar is very low.

Whether heeding ChatGPT advice would be better or worse than no advice at all, I honestly cannot say. On the one hand, getting some advice would probably help in many, many cases - there's a lot of low-hanging fruit here; on the other, low-quality advice has the potential to ruin the lives of multiple people at any moment. This is like medical or legal advice: very high stakes in many cases. Should we rely on a model that doesn't really understand the underlying logic for advice on such matters? The "average" of parenting blogs can be a mish-mash of different philosophies or approaches glued together, making up something that sounds plausible but leads to catastrophic results years or decades later.

I don't know. Parenting is a complex problem in itself; then you have people generally not looking for advice or being unable to recognize good advice. It doesn't look like adding a hallucinating AI model to the mix would help much, but I may be wrong on this. I guess we'll find out the hard way: through people trying (or not) it out and then living with consequences (if any).


A strong foothold among an ambitious, educated, technologically-connected cohort in emerging economies? Yes please.


No amount of LinkedIn-speak can fix the poor part of it.

In 2025, it's abundantly clear that the mask is off. Only the whales matter in video games. Only the top donors matter in donation funding. Modern laptops with GPUs are all $2k+ machines. Luxury condos are everywhere. McDonald's revenues and profits are up despite pricing out a lot of low income people.

The poor have less of the nothing they already have. You can make a hundred affordable cars, or get as much profit, if not orders of magnitude more, from just one luxury vehicle sale.


> Only the top donors matter in donation funding.

Most political donations are $25/month ActBlue contributions, and it doesn't matter because the campaigns with the most donations regularly lose.

> McDonalds revenues and profits are up despite pricing out a lot of low income people.

They didn't really raise prices, they just put coupons in the app.

> Luxury condos are everywhere.

Houses don't cost more because they have "luxury" features. A nicer countertop doesn't hypnotize people into paying more for a house. Prices are negotiated between buyer and seller and most of the development cost is the land price.

> The poor have less of the nothing they already have.

Wage inequality in the US is lower than it was in 2019. In general income inequality hasn't increased since 2014.

https://www.nber.org/papers/w31010


The distribution of wealth and disposable income needs correction. It's an urgent political issue.


You have no idea if they’re ambitious or educated. Absolutely no idea. Is it just commonplace to inject “facts” into conjecture? Comes off as desperate.


Meghan Markle?

Is that you?


user: hey hermes, why is your website scroll bar ungrabbable, I can't go up the page anymore? I'm stuck but want to read something higher up the page?

hermes4: We're all just stupid atoms waiting for inevitable entropy to plunge us into the endless darkness, let it go.


I get a lot of productivity out of LLMs so far, which for me is a simple good sign. I can get a lot done in a shorter time and it's not just using them as autocomplete. There is this nagging doubt that there's some debt to pay one day when it has too loose a leash, but LLMs aren't alone in that problem.

One thing I've done with some success is use a Test Driven Development methodology with Claude Sonnet (or recently GPT-5), moving the feature forward in discrete steps, with initial tests, inside the red/green loop. I don't see a lot written or discussed about that approach so far, but reading Martin's article made me realize that the people most proficient with TDD are not really in the Venn diagram intersection of those wanting to throw themselves wholeheartedly into agentic coding with LLMs. The 'super clippy' autocomplete is not the interesting way to use them; it's with multiple agents and prompt techniques at different abstraction levels that you can really cook with gas. Many TDD experts take great pride in the art of code, communicating like a human and holding the abstractions in their head, so we might not get good guidance from the same set of people who helped us before. I think there's a nice green field of 'how to write software' lessons coming up with these tools, with many cautionary tales and lessons being learnt right now.

edit: heh, just saw this now, there you go - https://news.ycombinator.com/item?id=45055439


It feels like the TDD/LLM connection is implied — “and also generate tests”. Though it's not canonical TDD, of course. I wonder if it'll turn the tide towards tech that's easier to test automatically, like maybe SSR instead of React.


Yep, it's great for generating tests, and so much of that is boilerplate that it feels like great value. As a super lazy developer, having all that mechanical 'stuff' just spat out takes the burden off nicely. Test code feels like lighter baggage when it's churned out as part of the process, as in no guilt in deleting it all when what you want to do changes. That in itself is nice. Plus of course MCP things (Playwright etc.) for integration testing are great.

But like you said, I meant TDD more as 'test first' - a sort of 'prompt-as-spec' that produces the test/spec code first, and then you iterate on that. The code design itself comes out different because it's influenced by how it is prompted to be testable. So rather than going 'prompt -> code', there's an in-between stage of prompting the test initially and then evolving it, making sure the agent plays the game of only writing testable code and automating the 'gate' of passing tests before expanding anything. 'prompt -> spec -> code', repeat the loop until shipped.
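As a minimal sketch of what one turn of that loop can look like (the slugify module and its behaviour here are made up for illustration, not from a real project): the test/spec file gets written or generated from the prompt first, run red, and only then is the agent asked to write just enough implementation to turn it green.

    // slugify.test.ts - written before any implementation exists (hypothetical example)
    import { test } from "node:test";
    import assert from "node:assert/strict";
    import { slugify } from "./slugify"; // hypothetical module the agent implements in the next step

    test("lowercases and hyphenates words", () => {
      assert.equal(slugify("Hello World"), "hello-world");
    });

    test("drops characters that are not URL-safe", () => {
      assert.equal(slugify("Rock & Roll!"), "rock-roll");
    });

Run it, watch it fail, prompt the agent to implement slugify until it passes, and only then move on to the next behaviour.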


The only thing I dislike is what it chooses to test when asked to just "generate tests for X": it often builds those "straitjacket for your code" style tests which aren't actually useful in terms of catching bugs; they just act as "any change now makes this red".

As a simple example, a "buildUrl" style function that used one particular host for prod and a different host for staging (via an "environment" argument) had that argument "tested" by exactly comparing the function's entire return string, encoding all the extra functionality into it (which was tested earlier anyway).

A better output would be to check startsWith(prodHost) or similar, which is what I changed it into, but I'm still trying to work out how to get coding agents to do that in the first or second attempt.
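To make that concrete, here's roughly the difference (buildUrl, the hosts and the signature are placeholders based on the description above, not the actual code):

    import { test } from "node:test";
    import assert from "node:assert/strict";
    import { buildUrl } from "./buildUrl"; // hypothetical function from the example above

    // The "straitjacket" version the agent tends to write: any change to path or
    // query handling turns this red, even though those are covered by other tests.
    test("prod environment (brittle)", () => {
      assert.equal(
        buildUrl("prod", "/search", { q: "x" }),
        "https://prod.example.com/search?q=x"
      );
    });

    // The narrower version: asserts only the one thing this test is about,
    // i.e. that the prod environment selects the prod host.
    test("prod environment picks the prod host", () => {
      assert.ok(buildUrl("prod", "/search", { q: "x" }).startsWith("https://prod.example.com"));
    });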

But that's also not surprising: people write those kinds of too-narrow not-useful tests all the time, the codebase I work on is littered with them!


> It feels like Tdd/llm connection is implied — “and also generate tests”.

Getting LLMs to generate tests for you when you don't know what to test for sounds like an anti-pattern, not true TDD.

It also reduces your confidence in knowing if the generated test does what it says. Thus, you might as well write it yourself.

Otherwise you will get this sort of nasty incident. [0] Even when 'all tests passed'.

[0] https://sketch.dev/blog/our-first-outage-from-llm-written-co...


LLMs (Sonnet, Gemini from what I tested) tend to “fix” failing tests by either removing them outright or tweaking the assertions just enough to make them pass. The opposite happens too - sometimes they change the actual logic when what really needs updating is the test.

In short, LLMs often get confused about where the problem lies: the code under test or the test itself. And no amount of context engineering seems to solve that.


I think part of the issue is that the LLM does not have enough context. Whether the bug is in the test or in the implementation depends purely on the requirements, which are often not in the source code but stored somewhere else (ticket system, documentation platform).

Without providing the actual feature requirements to the LLM(or the developer) it is impossible to determine which is wrong.

Which is why I think it is also sort of stupid to have the LLM generate tests by just giving it access to the implementation. That is, at best, testing the implementation as it is, but tests should be based on the requirements.


Oh, absolutely, context matters a lot. But the thing is, they still fail even with solid context.

Before I let an agent touch code, I spell out the issue/feature and have it write two markdown files - strategy.md and progress.md (with the execution order of changes) inside a feat_{id} directory. Once I’m happy with those, I wipe the context and start fresh: feed it the original feature definition + the docs, then tell it to implement by pulling in the right source code context. So by the time any code gets touched, there’s already ~80k tokens in play. And yet, the same confusion frequently happens.

Even if I flat out say “the issue is in the test/logic”, even if I point out *exactly* what the issue is, it just apologizes and loops.

At that point I stop it, make it record the failure in the markdown doc, reset context, and let it reload the feature plus the previous agent’s failure. Occasionally that works, but usually once it’s in that state, I have to step in and do it myself.


Not sure. If the Flash image output is $30/M [1] then that's pretty similar to gpt-image-1 costs. So a faster and better model perhaps but not really cheaper?

[1] https://developers.googleblog.com/en/introducing-gemini-2-5-...


Since I can't edit: it seems like Flash image is about 23% (4 cents vs 17 cents) of the cost of OpenAI's gpt-image-1, if you're putting an image and prompt in and getting out, say, a 1024x1024 generated image. With the quicker production time that makes it interesting. Expecting OpenAI to respond at least in terms of pricing, e.g. a flat-rate output cap price or something comparable.


Up here in Canada it's a question of trust, or rather the lack of it. Things are unlikely to ever go back to the way things were.

Buy, make and domestically develop drones, lots and lots of drones.


Canada should build their own air superiority fighter, with hookers and blackjack. They can call it the Avro Arrow.


With how generous Saab's JAS 39 bid is, I doubt there's much of a case for our own design: https://www.saab.com/markets/canada/gripen-for-canada/built-...



Maybe if there was some political will for building stuff but there isn't. Canada should be an absolute AI and energy powerhouse, but our politicians are some of the most incompetent buffoons on the planet.


I don't know enough about Canada to know if this is a reasonable take or not, but I think you'd get downvoted less if you took a few sentences to articulate what the politicians' main failings are.


That would take more than a few sentences, but in general there is a lack of willingness to build new infrastructure. Canada has endless opportunities to both export energy (and not just oil!) and use it domestically - we should be utilising this untapped potential to build datacenters and invest in AI companies and research. Instead we can't even build new houses or hospitals for our exploding population.


Also, there's Alberta.


I wish Alberta would diversify their industry but at least they have the right idea re. building and expanding our energy exports.


Indeed.

So the question becomes whether these countries truly want to move off of the platform, or if this is all more of a bargaining chip in the trade negotiations.

JD Vance pretty much single-handedly destroyed most trust in the US with his speech at the Munich Security Conference. Europe (and probably Canada and Australia) was shaken for days after it and realized that the US is not a reliable ally (or even an ally) anymore. This was confirmed by the disastrous meeting with Zelensky in the White House and by the US halting intelligence sharing and F-16 updates for Ukraine (F-16s which were provided by European countries, not the US).

The pathetic little show you saw at the White House last week (with Macron, Merz, etc.) is just a strategy to appease the US for as long as needed so that Europe can speed up its own weapons production, increase independence, etc. It's damage control. The reason countries have stopped buying the F-35 is that nobody trusts the US anymore. And one or two sane presidents are not going to fix it (the US elected Trump a second time after all).


It is interesting how it is basically an indictment of the American people's ability to manage their hard and soft power and military capability. That being said, populist right-wing movements are taking root in Europe as well. This threatens long-term strategic planning in general, not just with the US, when critical positions of world power are handed over every few years by a subset of the population increasingly susceptible to propaganda amplified by technology. In some ways regimes like North Korea are the most stable on earth, due to careful control of the reins of power and the lack of any possible inroads for third-party influence.


It's crazy that you're acting like this is some kind of policy failure for the US, when this administration has been telling Europe it shouldn't rely on the United States at this level. This isn't some "gotcha" that you're describing, it's exactly what the administration wanted Europe to do. Wake up and start innovating instead of being the Disneyland for American tourists.


Us Europeans are just baffled by the fact that this ‘administration’ wants this. The EU is a big economy that’s relatively easy to deal with. Why would you alienate us?

But yeah, so far Trump has been relatively true to his word, as far as it goes. Not so much in practical terms, but in going further down the road of a, dare I say, fascist outlook. I think Europeans still can't believe it's happening, much less that it's intentional.


[flagged]


You are effectively saying that Europe should be a vassal state of the US and cannot have its own laws. Europe has a different vision on privacy and competition. The regulatory demands on e.g. Apple are peanuts compared to what China asks. Apple bends over backwards to please China, but if Europe has some requirements for doing business, parts of the US trot out the tired trope "US innovates, Europe regulates".

> We have too many problems at home to be daddy with a credit card.

First, this is rich coming from a country living on borrowed money (which it can only get away with because the rest of the world uses its currency as the default).

Second, a lot of the problems of the US are caused by the lack of proper wealth redistribution and the lack of efficient health care (no, the US doesn't subsidize European healthcare; European countries spend far less on healthcare with better outcomes). It's not solved by throwing lifelong allies under the bus and trading them for some dictator friends.

Finally, the security situation also arose because the US did not want European militaries to become too powerful and pushed hard for them to be dependent on the US and US tech. For instance, countries have to buy US fighters for nuclear sharing, etc. The primary exception is France, because they never wanted to be reliant and have their own nuclear force, etc.

Also let’s not forget Article 5 was only invoked once (by the US) and we were happy to help, because that’s what friends do. We have been in Afghanistan for over 20 years as a result and a lot of our soldiers died and were injured.


To be fair, the GP claimed to have a credit card, not a salary.


> The fact that you can't understand this is exactly why this administration is doing this.

> because you believe that excellence is not worthy of being rewarded. Your culture has the mindset that excellence is not a product of hard work and determination, it's a product of luck and nepotism, so any hint of excellence gets taken away and diminished.

This administration does not believe in rewarding excellence, hard work, or determination. It's an administration run by the most malicious, incompetent people who have ever led this country.


I don't know why Europe wants so badly to be reliant on the US. It's bad for them, it's bad for us. It's embarrassing for Europe that Ukraine is relying on the US instead of Europe for defense. It's embarrassing for Europe how little they contribute to NATO. The US isn't a partner, it's a caretaker. And as they say, if someone provides what you need, they also have the power to take it away.

Outsourcing your defense is stuupiiid.

Europe should be thanking Trump for waking them up to the reality that has always been the case through his boorish negotiation.


Defense is a bit like advertising or finance. It has some aspects of a zero-sum game and a negative-sum game. All the money you invest in it is wasted. But if your enemy/competitor chooses to waste more money, you may be in trouble.

From a European perspective, the entire purpose of NATO from 1992 to 2022 was to avoid wasting too much money on defense. Because, for some reason, Americans were willing to do it instead.

Then Russia invaded Ukraine, and the calculus changed. Now European countries are rebuilding their defensive capabilities, while Russia is still bogged down in Ukraine. Given the lack of credible short-term threats, limiting defense spending was clearly the right choice until 2022.


Also, it makes sense to have a capability only once within an alliance. If the US has the command, space and air capabilities, why would anyone else need to have them? You can add to that capability by buying F-35s and hosting their air bases.

Now that we are not allies anymore, we need to wastefully build up our own command, space and air capabilities, resulting in duplicated effort.


After Russia took Crimea and part of the east of Ukraine in 2014, I'm not sure that calculus was valid.


Yeah, spending already started increasing steeply in 2014; 2022 only accelerated it.

https://www.nato.int/nato_static_fl2014/assets/pdf/2024/6/pd...


>> Ukraine is relying on the US instead of Europe for defense.

Is it? Especially in 2025.

It is embarrassing how little (very old) heavy equipment the USA provided to Ukraine. North Macedonia provided the same number of main battle tanks as the USA; Poland provided ten times more. And zero fighter jets.

Anyway, the people of Ukraine are thankful for any support, and the USA was the biggest donor during the first years of the war.


The US promised to protect Ukraine in the Budapest Memorandum, for which Ukraine had to give their nukes to Russia.

> It's embarrassing for Europe that Ukraine is relying on the US instead of Europe

Europe has spent more on military aid to Ukraine than the US now.

https://www.ifw-kiel.de/publications/news/ukraine-support-tr...

Even though the US vowed to protect Ukraine in the Budapest Memorandum.

> It's embarrassing for Europe how little they contribute to NATO

Before Trump, non-US NATO spent 425 billion and the US 654 billion:

https://www.nato.int/nato_static_fl2014/assets/pdf/2024/6/pd...

So it's true that Europe/Canada spent less, but it comes with a big fat asterisk: the US also wants to project power in the Pacific/Asia, whereas European defense is primarily focused on deterring Russian aggression (+ peace missions + supporting the US in various operations to give them more legitimacy).

> Europe should be thanking Trump for waking them up to the reality that has always been the case through his boorish negotiation.

That credit should go to Putin; European spending has grown rapidly since the annexation of Crimea.

The credit Trump should get: we'll stop buying US weapons as quickly as we can and focus on non-US alternatives. It's going to take a while, but US materiel has certainly become less attractive.


> Before Trump, non-US NATO spent 425 billion and the US 654 billion:

And I bet a significant proportion of that 425 billion was spent on US weapons. I wonder if anyone has that number


Who is reliant on whom? The USA is the only member that has actually used NATO's help.


The most fun use I've found is having it as my home environment in VR. In 3D it is a weird feeling to walk around and see how all the old sight lines are. I still duck a bit walking past mid doors :)

https://steamcommunity.com/sharedfiles/filedetails/?id=21021...


I had a mild addiction to this game about 7 or so years ago. Purely casual but lots of hours. I found it sort of a stress relief.

On the upside it gave me all sorts of free items as in-game 'drops'. I ignored them all at the time as I didn't care about buying keys or cosmetics. Last year I saw that they're worth a bunch of money now (!) - about $1500 if sold on the Steam marketplace. I got a Steam Deck with money from some of them, and the rest is basically my C:S 401k for Steam games. What a weird world.


Can you pick thinking models with this or is that implied?

GPT-5 seems a bit slow so far (in terms of deciding and awareness). I’ve gone from waiting for a compiler, to waiting for assets to build to now waiting for an agent to decide what to do - progress I guess :)


For me, most of the performance issues with MSFS24 now come down to being VRAM limited. When they went to MSFS 2024 they rewrote for DX12 and, while doing that, upgraded a few things to look nicer. The texture management still seems to need some work.

This means that my 9800x3D/3080Ti 12GB sort of runs out of VRAM and pages when used in VR or at 4K on the desktop. I'm in the position where, for the same visuals (scenery/aircraft etc), the newer MSFS2024 just generally looks worse and runs at a lower framerate than MSFS2020 (using DX11). In VR a bad framerate makes things unplayable. For desktop use you have DLSS, which helps a lot, but in VR that blurry movement really impacts clarity.


DLSS also blurs the cockpit displays quite badly when there's anything moving on them (airspeed/altitude tape, etc.). It looks like temporal blur, which is interesting because the same blur doesn't happen with their TAA (*temporal* anti-aliasing) implementation.


Yeah, DLSS looks great outside the window and for people enjoying GA and VFR it's all good. For airliners with digital displays it is harder to use, as like you said it blurs. They did talk about using some form of stencil/exclusion around cockpit displays but I think that didn't go anywhere as yet.

As a team they have a pretty tough job because the audience is all over the place for a title like that, as in people with 'I can see my house!' being made happy vs 'My pressurization gauge cross-bleed reading is below the Boeing B738 manual official figure, unplayable'.


> They did talk about using some form of stencil/exclusion around cockpit displays but I think that didn't go anywhere as yet.

They've been talking about that since the launch of msfs2020 and there's not been any movement on it as far as I know.


Yep, that's the flight sim community in a nutshell.


I found DLSS4 Preset J much, much improved; I prefer it to K.


That describes what I've seen. When I first compared 2020 and 2024 in as apples-to-apples a way as I could, it seemed like 2024's frame rate was about a third lower than 2020's. This was on a 7900 XTX with 24 gigs of VRAM.

I'm waiting for SU4 before I get back into it...


The VR situation was much worse. On hardware where 2020 was OKish, 2024 was unplayable.

