A sidenote on that last question – it's a riddle and the correct answer is a cent, but I found the output from GPT-3 quite interesting nonetheless.
I'm not sure if GPT-3 is inherently able to make the connections necessary to solve riddles, since riddles are designed to require unusual associations and GPT-3 is more of a predictor of the most likely output. But I'm sure it can be prompted to be better at answering common riddles (I found it really good at finding the right information within its training data).
If someone could set GPT-3 up with a bunch of known riddles as a prompt, and then ask it an entirely novel riddle which it would have never trained on, I would love to see the result.
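For what it's worth, here's a rough sketch of the kind of setup I have in mind, using the openai Python client's Completion endpoint. The example riddles, engine name, and sampling parameters are just my own guesses for illustration, not a tested recipe:

```python
# Rough sketch: few-shot riddle prompting for GPT-3 (illustrative only).
# Assumes the pre-1.0 openai Python client; the engine name and sampling
# parameters are assumptions, not anything verified in this thread.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

KNOWN_RIDDLES = [
    ("What has keys but can't open locks?", "A piano."),
    ("What gets wetter the more it dries?", "A towel."),
    ("What has a head and a tail but no body?", "A coin."),
]

def build_prompt(novel_riddle):
    # Prime the model with solved riddles so it (hopefully) stays in
    # "riddle mode" instead of answering literally.
    lines = [f"Q: {q}\nA: {a}" for q, a in KNOWN_RIDDLES]
    lines.append(f"Q: {novel_riddle}\nA:")
    return "\n\n".join(lines)

response = openai.Completion.create(
    engine="davinci",  # assumed engine name
    prompt=build_prompt("What has a head, a tail, is brown, and has no legs?"),
    max_tokens=20,
    temperature=0.7,
    stop=["\n"],
)
print(response.choices[0].text.strip())
```

The interesting part would be whether a genuinely novel riddle, absent from the training data, still gets a sensible answer after that priming.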
As another data point, I don't think those riddles are particularly good; their answers seem nonsensical, or, in the case of the first two, even possess qualities entirely opposite to those the riddles require.
I tried running the riddle many more times out of curiosity, to see if it'd give any snake answers. While it does seem to parse the sentence correctly, the logic in its output is often incoherent:
A: (A headless, tailless, brown creature that doesn't have legs probably isn't a valid concept, so I can't answer this.) I'm afraid that doesn't make any sense.
A: (A headless animal can't have a tail. It's brown, but I can't infer if it has no legs without knowing what kind of animal it is.) A headless brown animal has no legs.
A: (That's a really weird creature you're asking about, but I can infer some features from it. For example, it's small and has fur, but does not have any eyes.) That's an imaginary creature.
A: (I don't know what you're asking me. A head has a face, a tail has a butt and legs.) I'm afraid that doesn't make any sense.
A: (That's just nonsense. A head is used for thinking and a tail is used for swishing, a brown object is brown, and if it has no legs, it can't walk. Sorry, I can't answer that.) I don't understand.
A: (It's a fish, and also a brown fish with a tail and no legs.) It's a fish.
A: (A coffee cup doesn't have legs.) It's a coffee cup.
What struck me more about the last answer wasn't that GPT-3 didn't get the riddle, but that it couldn't interpret "no legs" as "zero legs" or similar, even in attempting to solve the riddle.
Since GPT-3 transforms sentences into high-dimensional vectors that connect all the words, with internal weights trained on English and many other languages, I find that its ability to interpret ordinary language is basically flawless.
I tested a prompt where the input was English and the output was Swedish, and it did a really good (though not always perfect) job of translating into Swedish. I would assume the amount of Swedish in its training set is relatively small, so I find its ability to do that kind of translation, on top of everything else, really impressive. This also shows in its ability to translate really well between English and bash shell commands, JSON data structures, or XML.
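To give a concrete idea of the English-to-bash prompts I mean, here's a sketch of the prompt text I'd use (the example commands are my own, purely illustrative); you'd send it to the Completion endpoint and read the line generated after the final "bash:":

```python
# Few-shot English -> bash prompt sketch (my own examples, for illustration).
PROMPT = """Translate English to a bash command.

English: list all files in the current directory, including hidden ones
bash: ls -a

English: show how much disk space is left
bash: df -h

English: find every .py file under the current directory
bash: find . -name '*.py'

English: count the lines in every .c file in src
bash:"""
```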
This is a wild guess: when GPT-3 got confused by "no legs", I don't think it's because it would have any trouble correctly interpreting "no legs" in other contexts. In this context it was searching a semantic space set up by "a head, a tail, is brown" and the general form of a question looking for an animal. GPT-3 didn't find anything in that semantic space matching "no legs", so it assumed that was an error and corrected it. That's actually fairly close to how a human would solve the riddle, except that when the human noticed nothing matched the explicit and implicit prompts, they would correct the implicit request for an animal and then reinterpret the rest of the riddle.
GPT-3 generates output word by word, and it doesn't distinguish between prompt words and words it has generated. Generated words are selected somewhat randomly (if temperature > 0), so GPT-3's train of thought can easily be derailed by the unfortunate random selection of a single word.
When it generated "I don't know what you mean by", it had to continue with something relevant. In this case "no legs" was selected.
In other words, GPT-3 doesn't have the machinery to stop and think, "What am I rambling about?"
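To make the "one unlucky word" point concrete, here's a toy sketch of softmax sampling with temperature. This is not GPT-3's actual code; the vocabulary and scores are made up:

```python
# Toy illustration of temperature sampling: with temperature > 0 the model
# sometimes picks a lower-probability token, and every later token is then
# conditioned on that choice -- there is no step that goes back and asks
# "what am I rambling about?".
import numpy as np

def sample_next_token(logits, temperature=0.8, rng=np.random.default_rng()):
    scaled = np.asarray(logits, dtype=float) / max(temperature, 1e-8)
    scaled -= scaled.max()                      # numerical stability
    probs = np.exp(scaled) / np.exp(scaled).sum()
    return rng.choice(len(probs), p=probs)

# Hypothetical next-token scores: "cent" is most likely, "legs" is not,
# but at temperature 0.8 "legs" still gets picked now and then.
vocab = ["cent", "fish", "legs", "coffee"]
logits = [2.0, 1.0, 0.2, 0.1]
print([vocab[sample_next_token(logits)] for _ in range(10)])
```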
I wonder if "brown" would have been an issue even without the animal context, as a search for an object with these properties wouldn't have turned up the cent, which is copper-colored.
It'd be interesting to ask GPT-3 what color a cent is and then redo the riddle with that.
Riddles have common features that make them readily identifiable as riddles, prompting a certain mode of thinking. Identifying and solving riddles doesn't seem inherently beyond the abilities of GPT-3.
Riddles use technically defensible but deeply unintuitive, counter-common-sense interpretations of words, grouped together to reach "clever", equally non-commonsensical conclusions. Honestly, if an AI came up with them instead of a human, we'd just write them off as errors.
I have been playing with GPT-2 (while I wait for GPT-3 access) to generate quotes, and I was surprised by how many keep the symmetry, the wordplay, and the plot-twist effect at the end. I would not be surprised if riddles generated with it were very good.
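For anyone curious, this is roughly the sort of thing I was doing, via the Hugging Face transformers text-generation pipeline; the prompt and sampling settings here are just illustrative, not exactly what I used:

```python
# Minimal GPT-2 quote-generation sketch (settings are illustrative only).
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
outputs = generator(
    'Inspirational quote: "The hardest part of any journey',
    max_length=40,
    num_return_sequences=3,
    do_sample=True,
    temperature=0.9,
)
for out in outputs:
    print(out["generated_text"])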