You can ask a model to provide an analysis of its answer, including a probability that it is correct, as part of the prompt; it helps a lot with double-checking.
The ratings are consistent from the model's point of view, particularly if you ask the model to rationalize its rating. You will get plenty of hallucinated answers that the model can recognize as hallucinations, and give a low rating to, in the same response.
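A minimal sketch of what that kind of prompt can look like. The call_llm() helper is a placeholder for whatever model/API you're actually calling, and the JSON output format is just one convenient way to get the answer, rationale, and self-rating back in a single response:

```python
import json


def call_llm(prompt: str) -> str:
    """Hypothetical helper: send the prompt to whatever model/API you use
    and return the raw text of its reply."""
    raise NotImplementedError


def answer_with_confidence(question: str) -> dict:
    # Ask for the answer, a brief rationale, and a self-rated probability of
    # correctness all in one response, so the model has to review what it
    # just said before rating it.
    prompt = (
        f"Question: {question}\n\n"
        "Answer the question. Then give:\n"
        "- rationale: a brief review of your own answer\n"
        "- p_correct: your estimated probability (0.0-1.0) that the answer is correct\n"
        "Reply only with JSON using the keys: answer, rationale, p_correct."
    )
    return json.loads(call_llm(prompt))
```

Parsing will obviously fail if the model doesn't return clean JSON, so in practice you'd want some retry or fallback handling around it.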
Models can get caught by what they start to say early. If the model goes down a path that seems like a likely answer early on, and that turns out to be a false lead or dead end, it will end up making up something plausible-sounding to try to finish that line of thought, even if it's wrong. This is why chain of thought and other "pre-answer" techniques improve results.
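A rough illustration of the difference, reusing the hypothetical call_llm() helper from the sketch above. The second prompt makes the model lay out its reasoning before committing to an answer, instead of locking onto whatever it started saying first:

```python
def answer_direct(question: str) -> str:
    # One-shot: the model commits to an answer immediately, with no room
    # to back out of an early wrong turn.
    return call_llm(f"{question}\nAnswer concisely.")


def answer_with_reasoning(question: str) -> str:
    # Chain-of-thought style: reasoning comes first, the final answer last,
    # so the answer can condition on the worked-through steps.
    prompt = (
        f"{question}\n"
        "Think through the problem step by step first. "
        "Only after your reasoning, write 'FINAL ANSWER:' followed by the answer."
    )
    return call_llm(prompt)
```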
Because of the way transformers work, they have very good hindsight, so they can realize that they've just said things that are incorrect much more often than they can avoid saying incorrect things.
Does that extra information come from a process separate from the LLM network? If not, then, given that the same output is not guaranteed from the same input as usual, all bets are off, correct?
Sorry for the late reply, but if you read this: there is research showing that prompting an LLM to take a variety of perspectives on a problem (IIRC it was demonstrated with code) and then finding the answer with the most common ground improved benchmark scores significantly. So, for example, if you ask it to provide a brief review and likelihood of the answer, and repeat that process from several different perspectives, you can get some very solid data; a sketch of that loop follows below.
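A sketch of that kind of multi-perspective loop, again assuming the hypothetical call_llm() helper from earlier. The specific personas and the "last line is the final answer" convention are my placeholders, not from the research mentioned:

```python
from collections import Counter

# Placeholder perspectives; swap in whatever roles make sense for the task.
PERSPECTIVES = [
    "a careful code reviewer",
    "a tester looking for edge cases",
    "a domain expert double-checking facts",
]


def consensus_answer(question: str) -> str:
    answers = []
    for persona in PERSPECTIVES:
        prompt = (
            f"From the perspective of {persona}, answer the following.\n"
            f"{question}\n"
            "Give a brief review and the likelihood that your answer is correct, "
            "then on the last line write only the final answer."
        )
        reply = call_llm(prompt)
        # Keep just the final-answer line for the vote.
        answers.append(reply.strip().splitlines()[-1])
    # Return the answer the perspectives most agree on.
    return Counter(answers).most_common(1)[0][0]
```

Exact string matching is a crude way to find common ground; for code or free-form answers you'd want something fuzzier (normalizing whitespace, running tests, or asking the model itself to judge which answers agree).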