1. Yes! This is useful for parsing unstructured data or inferring an argument, though sometimes we can simply define a static data transformation with jq instead.
2. Anything too complex (e.g. 5+ steps) tends to be unreliable. The other category is any workflow where a failure or unexpected behavior is too risky to leave up to an LLM.
3. The only actions we take are with our users' tools, so many workflows simply organize information between their apps. That said, some actions reach outside, e.g. Gmail messages can be sent externally, so we have guardrails/sanity checks to mitigate risk there.
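To make the guardrail idea in point 3 concrete, here is a rough sketch of one possible sanity check: flagging recipients outside the user's own domain before an email step runs. This is only an illustration, not our actual implementation; the function name `external_recipients` and the `internal_domain` parameter are made up for the example.

```python
from typing import Iterable, List


def external_recipients(recipients: Iterable[str], internal_domain: str) -> List[str]:
    """Return the addresses whose domain differs from internal_domain.

    Hypothetical helper: a workflow could pause for confirmation
    whenever this list is non-empty.
    """
    domain = internal_domain.lower().lstrip("@")
    flagged = []
    for address in recipients:
        # Everything after the last "@" is treated as the domain.
        # Addresses with no "@" never match, so they are flagged too.
        _, _, addr_domain = address.strip().lower().rpartition("@")
        if addr_domain != domain:
            flagged.append(address)
    return flagged


if __name__ == "__main__":
    print(external_recipients(["ana@acme.com", "bob@other.org"], "acme.com"))
```

A step that sends email could call this first and ask the user to confirm if anything comes back.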
We do a fixed number of retries, including redoing any AI arguments. We've thought about making it atomic/more durable, but it's tricky: most steps interact with external systems, e.g. Google Sheets, and while they're not typically "destructive" (Google Sheets has version history), undoing is often difficult.
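Roughly, the retry behavior described above looks like the sketch below. It is a simplified illustration under assumed names: `run_step_with_retries`, `infer_args`, `execute`, and the default of 3 attempts are placeholders, not our real code or settings. The key point is that the AI-inferred arguments are regenerated on every attempt rather than reused.

```python
from typing import Any, Callable, Dict, Optional


def run_step_with_retries(
    infer_args: Callable[[], Dict[str, Any]],
    execute: Callable[[Dict[str, Any]], Any],
    max_attempts: int = 3,  # assumed value, for illustration only
) -> Any:
    """Run one workflow step, re-inferring its arguments on each attempt."""
    last_error: Optional[Exception] = None
    for _ in range(max_attempts):
        try:
            args = infer_args()  # redo the AI arguments, do not reuse old ones
            return execute(args)
        except Exception as exc:
            last_error = exc
    raise RuntimeError(f"step failed after {max_attempts} attempts") from last_error
```

Note that this only retries forward. It does nothing to undo a partially applied step, which is the hard part when the step has already touched an external system.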