Interesting. What LLM model? 4o, o3, 3.5? I had horrible performance with earlie...

unsupp0rted · 2025-05-12T20:25:18 1747081518

Whichever the default free model is right now. I stopped paying for it when Gemini 2.5 came out in Google's AI lab.

4o, o4? I'm certain it wasn't 3.5

Edit: while logged in

icelancer · 2025-05-12T20:46:56 1747082816

> Whichever the default free model is right now

Sigh. This is a point in favor of not allowing free access to ChatGPT at all, given that people are getting mad at GPT-4o-mini, which is complete garbage for anything remotely complex... and garbage for most other things, too.

Just give 5 free queries of 4o/o3 or whatever and call it good.

pants2 · 2025-05-12T20:36:05 1747082165

If you're logged in, 4o; if you're not logged in, 4o-mini. Neither scores well on the benchmark!

askafriend · 2025-05-12T20:42:15 1747082535

This gets at the UX issue with AI right now. How's a normie supposed to know and understand this nuance?

unsupp0rted · 2025-05-12T21:12:51 1747084371

Or a non-normie. Even while logged in, I had no idea which ChatGPT model it was using, since it doesn't label it. All the label says is "great for everyday tasks".

And as a non-normie, I obviously didn't take its analysis seriously, and compared it to Grok and Gemini 2.5. The latter was the best.

unsupp0rted · 2025-05-12T21:07:20 1747084040

Added context: While logged in

maliker · 2025-05-12T20:41:40 1747082500

Might be worth trying again with Gemini 2.5. The reasoning models like that one are much better at health questions.

unsupp0rted · 2025-05-12T21:07:52 1747084072

Gemini 2.5 in AI Studio gave by far the best analysis

dgfitz · 2025-05-12T21:21:57 1747084917

I can’t believe you’re getting downvoted for answering the question about the next-token-predictor model you can’t recall using.

What is happening?