There are a few improvements I'd suggest to that prompt if you want to maximise its performance.
1. You're really asking for hallucinations here. Asking for factual data is very unreliable, and not what these models are strong at. I'm curious how close/far the results are from ground truth.
I would definitely bet that outside of the top 5, numbers would be wobbly and outside of top... 25?, even the ranking would be difficult to trust. Why not just get this from a more trustworthy source?[0]
2. Asking in French might, in my experience, give you results that are not as solid as asking in English. Unless you're asking for a creative task where the model might get confused with EN instructions requiring an FR result, it might be better to ask in EN. And you'll save tokens.
3. Providing the model with a rough example of your output JSON seems to perform better than describing the JSON in plain language.
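To illustrate point 3, here is a minimal sketch of the two prompt styles (the schema and field names are purely illustrative, not from the original prompt):

```python
import json

# Hypothetical output schema for a ranking-type task; names are illustrative.
example_output = {
    "results": [
        {"rank": 1, "name": "Example City", "population": 123456},
    ]
}

# Style A: describe the JSON in plain language (tends to be less reliable).
prompt_described = (
    "Return a JSON object with a 'results' key holding a list of objects, "
    "each with 'rank' (int), 'name' (string) and 'population' (int)."
)

# Style B: show a rough example of the expected output (often performs better).
prompt_with_example = (
    "Return JSON matching this shape exactly:\n"
    + json.dumps(example_output, indent=2)
)
```

In practice the example-based prompt also doubles as documentation of the schema you validate against downstream.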
For some context, this snippet is just an educational demo to show what can be done with regard to structured output & data type validation.
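The validation side of such a demo can be sketched with the standard library alone (a library like Pydantic would do this more robustly; the field names here are hypothetical):

```python
import json

# Expected field types for the model's JSON output (illustrative schema).
EXPECTED_TYPES = {"rank": int, "name": str, "population": int}

def validate_city(raw: str) -> dict:
    """Parse the model's raw JSON and check each field's type."""
    data = json.loads(raw)
    for field, expected in EXPECTED_TYPES.items():
        if not isinstance(data.get(field), expected):
            raise ValueError(f"field {field!r} is not a valid {expected.__name__}")
    return data

# A well-formed response passes; a type mismatch raises ValueError.
city = validate_city('{"rank": 1, "name": "Paris", "population": 2102650}')
```

The point of the demo is exactly this: the model's output is never trusted as-is, it has to round-trip through a typed schema first.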
Re 1: for more advanced cases (using the exact same stack), I am using ensemble techniques & automated comparisons to double-check, and so far this has protected the app from hallucinations quite effectively. I am definitely careful with this (but point well taken).
2/3: agreed overall! Apart from this example, I am using French only where it makes sense. It makes sense when the target audience is directly French students, for instance, or when the domain model (e.g. French literature) makes it really relevant (and translating would be worse than directly using French).
Ah, I understand your use case better! If you're teaching students this stuff, I'm in awe. I would expect it would take several years at many institutions before these tools became part of the curriculum.
https://gist.github.com/thbar/a53123cbe7765219c1eca77e03e675...