We benchmarked various structured decoding providers quite thoroughly in one of our papers: https://arxiv.org/abs/2501.10868v3 , measuring providers on performance, constraint flexibility, downstream task accuracy, etc.
Happy to chat more about the benchmark. Note that the results are a bit out of date, though; I'm sure many of the providers we tested have made improvements (and some have switched wholesale to using llguidance as a backend).
I think @dcreater was asking how these various structured decoding providers compare with how Pydantic AI handles structured output, i.e. via tool calling: forcing the LLM to call a tool whose arguments conform to a JSON schema, then reading the tool call arguments back as the structured output.
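For concreteness, here's a minimal sketch of that tool-calling mechanism using the raw OpenAI SDK and Pydantic v2, not Pydantic AI's actual implementation; the model name and tool name are placeholders:

```python
from openai import OpenAI
from pydantic import BaseModel


class CityInfo(BaseModel):
    name: str
    country: str
    population: int


client = OpenAI()  # reads OPENAI_API_KEY from the environment

resp = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user", "content": "Tell me about Paris."}],
    tools=[{
        "type": "function",
        "function": {
            "name": "return_city_info",  # hypothetical tool name
            "description": "Return structured information about a city.",
            "parameters": CityInfo.model_json_schema(),
        },
    }],
    # Force the model to call this specific tool instead of replying in free text.
    tool_choice={"type": "function", "function": {"name": "return_city_info"}},
)

# The "structured output" is just the tool call's JSON arguments.
args = resp.choices[0].message.tool_calls[0].function.arguments
city = CityInfo.model_validate_json(args)  # raises ValidationError if malformed
print(city)
```

The trade-off vs. constrained decoding is that nothing at the token level guarantees the generated arguments satisfy the schema, so validation can still fail and typically has to be handled with a retry.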