You could have a multi-agent harness that constrains each agent role to only the capabilities it needs. If an agent reads untrusted input, it can only run read-only tools and communicate results back to the user. Or run all code in a sandbox, and only when needed have the user make the important decision of affecting the real world.
A system that tracks the integrity of each agent and knows the moment it becomes tainted seems like the right approach.
With forking of LLM state you can maintain multiple states with different levels of trust, and you can choose which branch gets pruned depending on what task needs to be accomplished. I see it like a tree: always maintain an untainted "trunk" that shoots off branches to do operations. Tainted branches are constrained to strict output schemas, focused actions, and limited tool sets.
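The trunk/branch taint model above could be sketched roughly like this (all names here — `AgentBranch`, the tool lists, `ingest` — are hypothetical illustrations, not any real framework's API):

```ruby
# Hypothetical sketch of taint tracking across forked agent branches.
READ_ONLY_TOOLS = [:read_file, :search].freeze
FULL_TOOLS      = (READ_ONLY_TOOLS + [:write_file, :run_shell]).freeze

class AgentBranch
  attr_reader :name, :tainted

  def initialize(name, tainted: false)
    @name    = name
    @tainted = tainted
  end

  # Forking preserves the trunk: the child inherits the parent's taint,
  # while the parent's own state is untouched.
  def fork(name)
    AgentBranch.new(name, tainted: @tainted)
  end

  # Any branch that ingests untrusted input is marked tainted from then on.
  def ingest(input, trusted:)
    @tainted ||= !trusted
    input
  end

  # Tainted branches are constrained to the read-only tool set.
  def allowed_tools
    @tainted ? READ_ONLY_TOOLS : FULL_TOOLS
  end

  def can_use?(tool)
    allowed_tools.include?(tool)
  end
end

trunk  = AgentBranch.new("trunk")
branch = trunk.fork("web-research")
branch.ingest("<untrusted page content>", trusted: false)

branch.can_use?(:run_shell)  # false: tainted branches lose write access
trunk.can_use?(:run_shell)   # true: the trunk stays untainted
```

The key property is that taint is one-way and sticky: a branch can only gain taint, never shed it, so the decision of what to prune or constrain is always made from the clean trunk.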
LLMs are already a powerful tool for serious math researchers, just not at the level of "fire and forget", where they would completely replace mathematicians.
I've used Sorbet a lot but don't really count it. I understand why others would, but I find the type system extremely shallow and limited, and the overhead it adds to development (and even performance) is substantial.
Also, Ruby now has RBS, which is not inline and... much maligned, to say the least. I think the entire ecosystem is at a crossroads right now with respect to typed Ruby.
No worries, I also think it deserves a bit more highlighting, especially for those who are against having RBS in a separate file and those who despise the Sorbet DSL in Ruby. The plan is for rbs-inline to be merged into the rbs gem, so it will come included with RBS!
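For anyone who hasn't seen it: rbs-inline lets you write RBS signatures as comments in the Ruby source itself, so there's no separate `.rbs` file and no Sorbet-style runtime DSL. A small sketch, to the best of my understanding of the current annotation syntax (the `Greeter` class is just an illustrative example):

```ruby
# rbs_inline: enabled

class Greeter
  # The #: comment attaches an RBS method type to the following def.
  #: (String) -> void
  def initialize(name)
    @name = name
  end

  #: () -> String
  def greet
    "Hello, #{@name}!"
  end
end

puts Greeter.new("Ada").greet  # => "Hello, Ada!"
```

Since the annotations are plain comments, the file stays valid Ruby with zero runtime cost; the rbs-inline tooling extracts them into RBS for the type checker.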