> This is a trivial question. There's one correct answer and the reasoning to get there takes one step: the car needs to be at the car wash, so you drive.
I don’t think it’s that easy. An intelligent mind will wonder why the question is being asked, whether they misunderstood the question, or whether the asker misspoke, or some other missing context. So the correct answer is neither “walk” nor “drive”, but “Wat?” or “I’m not sure I understand the question, can you rephrase?”, or “Is the vehicle you would drive the same as the car that you want to wash?”, or “Where is your car currently located?”, and so on.
Yep, just a little more context and all/most of the models would do much better. And sure, most average+ intelligence adults whose first language is English (probably) don't need this, but they're not the target audience for the instructions :)
"The 'car wash' is a building I need to drive through."
or
"The 'car wash' is a bottle of cleaning fluid that I left at the end of my driveway."
The reason that those questions are asked, though, is that the answer to the actual question is obvious, so a human will start to wonder if it's some kind of trick.
Maybe that's a bias from training data. I would assume that most documents skip the "clarifying the question/scope" part of reasoning. Imagine a scientific text or even a book. Most will start with a clear context/scope. Either with a thesis or a well defined question or (in case of a book) with a story. Texts that start with a question that first needs to be refined are probably rare.
I wonder if anyone has done research in this field. I've seen this myself often (too often), where LLMs make assumptions and run off with the wrong thing.
"This is how you do <absolutely unrelated thing>" or "This is why <thing that actually exists already> is impossible!". Ffs man, just ask for info! A human wouldn't need to - they'd get the context - but LLMs apparently don't?
I think most people would say "drive?" and wonder when the punchline is coming, but (IMO) I don't think they'd start asking for clarification right away.
It feels more like a question about English linguistic conventions than about logic.
If someone asked me the same question and I wanted to give a smartass reply, I'd tell them "You want to wash your car, good to know. Now, about your question, unless you tell me where you wanna go I can't really help you".
I agree. If the LLM were truly an intelligence, it would be able to ask about this nonsense question. It would be able to ask "Why is walking even an option? Can you please explain how you imagine that would work? Do you mean hand-washing the car at home, instead?" (etc, etc)
Real people can ask for clarification when things are ambiguous or confusing. Once something is clarified, they can work that into their understanding of how someone communicates about a given topic. An LLM can't.
Gemini's responses come very close to doing that when they make fun of the question (see other posts in the thread). If the model had been RL'ed to ask follow-up questions, it seems likely that it would meet your criterion.
This reminds me of a Uni exam that was soooo broken that answering “correctly” entailed guessing how exactly the professor designing the questions misunderstood the topic of his own lectures.
An interesting parallel to that is the "What's the next number in this sequence?" sort of questions.
If four numbers are provided, one can calculate the coefficients of a cubic polynomial through them, for x values of 0, 1, 2 and 3, and then evaluate it at x=4. That does indeed provide a defensible "next number". And by similar reasoning, there are an infinite number of answers to this question.
Even worse. You could in fact provide any number as an answer, because there is always a quartic polynomial that fits the four initial numbers AND your arbitrary fifth number.
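The two claims above are easy to check numerically. A minimal sketch using NumPy, where the sequence 1, 2, 4, 8 and the arbitrary fifth value 42 are illustrative choices of mine, not anything from the thread:

```python
import numpy as np

# Four terms of a sequence that "looks like" powers of two.
seq = [1, 2, 4, 8]

# Fit the unique cubic through the points (0,1), (1,2), (2,4), (3,8)...
cubic = np.polyfit(np.arange(4), seq, deg=3)

# ...and evaluate it at x=4. The "obvious" guess is 16, but the
# polynomial answer is 15 -- both are defensible next numbers.
next_val = np.polyval(cubic, 4)
print(round(next_val))  # 15

# Now justify an arbitrary fifth value: a quartic (degree 4) passes
# exactly through any five points, including our made-up 42.
quartic = np.polyfit(np.arange(5), seq + [42], deg=4)
print(round(np.polyval(quartic, 4)))  # 42
```

Since a degree-n polynomial interpolates any n+1 points exactly, every possible fifth number has a polynomial "rule" behind it, which is exactly why these puzzles are really about guessing the question-setter's intent.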
So these questions are actually not about what the next number is, but trying to imagine what the person who set the question thought was a "cool" answer, for some curious definition of "cool", for some person who isn't smart enough to realize that the premise on which the question is based is flawed.
Are you not allowed to ask the professor questions? We are, and it's not that seldom that the professor then walks to the blackboard and updates the question.
It was an examination with 300 students in a giant hall, overseen by university staff, not the individual professors.
So many people complained that they did eventually fetch him to come and clarify (correct) the questions.
I didn’t have the patience to wait for him to turn up, so I simply provided a matrix of solutions for every possible combination of potential original intent… with a note next to it saying that anything other than a 100% mark would be met with official complaints about his lack of due diligence.
Agreed. It's also possible that "car wash" merely refers to soap they might use to do it themselves, and they're only going to buy it and then wash the car themselves at home. Imagine the same question with "wax" substituted for "wash" and it makes even more sense IMO.
That's a fair point, but if you were to see it as a riddle, which I don't really think it is, and you had to answer one or the other, I'd still assume it's most logical to choose "drive", isn't it?
I don’t agree that the question as written would qualify as a riddle. If anything, the riddle is what the intention of the asker is. One can always ask stupid questions with an artificially limited set of answering options; that doesn’t mean it makes sense.
It is TOTALLY a stupid question, because OBVIOUSLY you should drive. It is based on the false premise that there is actually a choice. If somebody were to sincerely ask me this question, actually believing that walking was an option, I'm not sure I could resist the temptation to say "walk", just to see what happens next.
Only slightly evil, because the worst-case consequences are an unnecessary 100m walk. I think I could get that past an ethics committee, if I wanted to run an experiment to see what percentage of human responders would ACTUALLY walk to the car wash.
Thank you for saying this. It reminds me of class tests where you always had to wonder if something was a trick question and you never really knew; it always depended on the teacher. Which frankly is fine in open-ended questions where you can explain your rationale or how different interpretations would lead you down different paths, but a terrible situation when it comes to multiple choice. I remember being very frustrated by those.