I get the reasoning but I’m not sure you’ve successfully contradicted the point.
Most prompts are written in the form “you are a helpful assistant, you will do X, you will not do Y”
I believe that inclusion of instructions like “if there are possible answers that differ and contradict, state that and estimate the probability of each” would help knowledgeable users.
But for typical users and PR purposes, it would be a disaster. It is better to tell 999 people that the US Constitution was signed in 1787 and 1 person that it was signed in 349 B.C. than it is to tell 1000 people that it was probably signed in 1787 but it might have been 349 B.C.
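For concreteness, here is a minimal sketch of what that extra instruction could look like in practice, assuming an OpenAI-style chat-completions call; the model name and exact wording are illustrative, not a recommendation:

    # Hypothetical system prompt carrying the "surface contradictions and
    # estimate probabilities" instruction discussed above.
    from openai import OpenAI

    client = OpenAI()

    system_prompt = (
        "You are a helpful assistant. "
        "If there are possible answers that differ and contradict, say so "
        "and estimate the probability of each."
    )

    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model name
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": "When was the US Constitution signed?"},
        ],
    )
    print(resp.choices[0].message.content)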
Why does the prompt intro take the form of a role/identity directive "You are a helpful assistant..."?
What about the training sets or the model internals responds to this directive?
What are the degrees of freedom of such directives?
If such a directive is helpful, why wouldn't more demanding directives be even more helpful: "You are a domain X expert who provides proven solutions for problem type Y..."
If you don't think the latter prompt is more helpful, why not?
What aspect of the former prompt is within bounds of helpful directives that the latter is not?
Are training sets structured in the form of roles? Surely, the model doesn't identify with a role?!
Why is the role directive typically used with NLP but not image generation?
Do typical prompts for Stable Diffusion start with an identity directive "You are an assistant to Andy Warhol in his industrial phase..."?
Why can't improved prompt directives be generated by the model itself? Has no one bothered to ask it for help?
"You are the world's most talented prompt bro, write a prompt for sentience..."
If the first directive observed in this post is useful and this last directive is absurd, what distinguishes them?
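On the "has no one bothered to ask it for help" question: people do ask models to critique and rewrite prompts. A minimal sketch, assuming the same OpenAI-style chat-completions call as above (model name and wording are illustrative, and nothing guarantees the revised prompt is actually better):

    # Ask the model to critique a draft system prompt and propose a revision.
    # This only shows the mechanics; it does not settle whether the result is better.
    from openai import OpenAI

    client = OpenAI()

    draft = "You are a helpful assistant. Answer accurately and concisely."

    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative
        messages=[{
            "role": "user",
            "content": "Critique this system prompt and propose a revised version "
                       "that reduces confidently wrong answers:\n\n" + draft,
        }],
    )
    print(resp.choices[0].message.content)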
Surely there's no shortage of expert prompt training data.
BTW, how much training data is enough to permit effective responses in a domain?
Can a properly trained model answer this question? Can it become better if you direct it to be better?
Why can't the models rectify their own hallucinations?
To be more derogatory: what distinguishes a hallucination from any other model output within the operational domain of the model?
Why are hallucinations regarded as anything other than a pure effect, and as a pure effect, what is the cusp of hallucination? That a human finds the output nonsensical?
If outputs are not equally valid in the LLM, why can't it sort for validity?
OTOH if all outputs are equally valid in the LLM, then outputs must be reviewed by a human for validity, so what distinguishes an LLM from the world's greatest human time-wasting device? (After Las Vegas)
Why would a statistical confidence level help avoid having a human review every output?
The questions go on and on...
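For the statistical-confidence question a few lines up, the usual mechanical answer is token log-probabilities: flag answers whose weakest tokens are low-probability and route only those to a human. A rough sketch, assuming the OpenAI chat-completions logprobs option (model name and threshold are made-up examples), with the obvious caveat that token probability is not the same thing as factual validity:

    import math
    from openai import OpenAI

    client = OpenAI()

    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model name
        messages=[{"role": "user", "content": "When was the US Constitution signed?"}],
        logprobs=True,
        top_logprobs=1,
    )

    tokens = resp.choices[0].logprobs.content           # per-token logprob records
    weakest = min(math.exp(t.logprob) for t in tokens)  # probability of the least certain token

    needs_review = weakest < 0.5                        # threshold is an arbitrary example
    print(resp.choices[0].message.content)
    print("weakest token probability:", round(weakest, 3), "| route to human:", needs_review)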
—
Parole Board chairman: They've got a name for people like you H.I. That name is called "recidivism."
Parole Board member: Repeat offender!
Parole Board chairman: Not a pretty name, is it H.I.?
H.I.: No, sir. That's one bonehead name, but that ain't me any more.
Parole Board chairman: You're not just telling us what we want to hear?
H.I.: No, sir, no way.
Parole Board member: 'Cause we just want to hear the truth.
H.I.: Well, then I guess I am telling you what you want to hear.
Parole Board chairman: Boy, didn't we just tell you not to do that?