If you don't keep old code versions around, you are forced to either drop pending requests that started on them, or try to replay them on the updated code, which can't be known to be safe. So it's better to keep the old versions around, but this brings in a new set of infra and security challenges, e.g. where will they run, what will they cost, will they have a vulnerable dependency, etc.
If you replay the request on the new version, you might encounter new steps that don't match what you have in the journal. Temporal users know this pain well...
Rolling a task over to a new version should be "safe" in that you can detect conflicts and roll back if the sequence of calls does not match the old version.
For a post about "solving" durable execution I would expect both a scale-to-zero way to keep older versions around indefinitely - I guess the Lambda-based approach does qualify - and a safe and controlled way to upgrade task versions iff the execution history is compatible.
Each execution, by design, has a record of all side-effecting calls, including their inputs and outputs.
If you replay history up to the newest call and all calls are identical, that specific execution instance is compatible with the new code and can be upgraded. If not, it should be rolled back, and you can either deploy a fixed version of the code with backwards compatibility, or delete the executions that cannot be upgraded.
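A minimal sketch of that compatibility check in Python, assuming a journal of recorded side-effect calls and a dry run of the new code that lists the calls it would make (all names here are hypothetical, not any particular framework's API):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class JournalEntry:
    step: str    # name of the side-effecting call
    input: str   # serialized input to that call

def is_compatible(journal: list[JournalEntry], dry_run: list[JournalEntry]) -> bool:
    """Compare the recorded journal against the calls a dry run of the new
    code would make, position by position; every recorded call must be
    matched by an identical call in the new code."""
    if len(dry_run) < len(journal):
        return False  # new code drops recorded steps
    return all(old == new for old, new in zip(journal, dry_run))

# Example: the new code inserts a fraud check before charging the card.
recorded = [JournalEntry("reserve_stock", "sku-1"), JournalEntry("charge_card", "$10")]
new_code = [JournalEntry("reserve_stock", "sku-1"), JournalEntry("check_fraud", "$10"),
            JournalEntry("charge_card", "$10")]
print(is_compatible(recorded, new_code))  # False -> roll back instead of upgrading
```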
Backwards-compatible code can be written as
if (workflowVersion() >= FIX_VERSION) new_way() else old_way()
There should be two ways to get the version for backwards compatibility: workflowVersion() is recorded and replayed and can change between side-effect calls, e.g. an execution will use the old retry logic while replaying steps up to the current point in time, then switch over to the new one.
originalWorkflowVersion() is constant for the whole execution, e.g. all executions that started before NEW_TAX_RULE will keep using the old tax rules for all calculations.
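As a rough illustration of how the two accessors would behave (names and mechanics are hypothetical, following the proposal above rather than any existing framework):

```python
FIX_VERSION = 7     # version that introduced the new retry logic
NEW_TAX_RULE = 12   # version that introduced the new tax rules

class WorkflowContext:
    """Hypothetical context exposing both version accessors."""
    def __init__(self, current_version: int, original_version: int):
        self._current = current_version
        self._original = original_version

    def workflow_version(self) -> int:
        # Recorded and replayed per side-effect call: already-recorded steps
        # replay the old value, steps executed after the deploy see the new one.
        return self._current

    def original_workflow_version(self) -> int:
        # Pinned at the version the execution started on; never changes.
        return self._original

def retry_logic(ctx: WorkflowContext) -> str:
    return "new_retry" if ctx.workflow_version() >= FIX_VERSION else "old_retry"

def tax_rules(ctx: WorkflowContext) -> str:
    return "new_tax_rules" if ctx.original_workflow_version() >= NEW_TAX_RULE else "old_tax_rules"

# An execution that started on version 5 and has caught up to the current deploy (version 12):
ctx = WorkflowContext(current_version=12, original_version=5)
print(retry_logic(ctx))  # "new_retry"     -> switches over at the current point in time
print(tax_rules(ctx))    # "old_tax_rules" -> keeps the rules it started with
```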
I'd love a deep technical comparison, but it would also be great to understand whether Restate is better than Temporal for specific use cases and vice versa, i.e. when someone should choose one over the other.
This is really cool, nice work! How does this differ from Apollo Federation? I'm a bit confused here, because you have integrations with AF too - is this competitive, or is this a more vertically-integrated solution? Cheers!
Grafbase Federated Graphs are spec-compliant with Apollo Federation. We've invested a lot in the developer experience of building and deploying GraphQL APIs. Local development, the Grafbase SDK, and the Grafbase dashboard were built from the ground up to be easy and efficient to use.
GraphQL APIs deployed to Grafbase run on the edge by default, but can now also be deployed in your own infrastructure.
Thanks - what do you think is your sweet-spot use case? I've built a few GraphQL projects in the past, so I'm curious where you fit into the ecosystem?
This is really cool! Am I right in thinking that the cost for running this program is equivalent to all of the dependency execution durations? i.e. no busy waiting?
To a first approximation, yes. There is some small cost for the workflow function itself, but since it doesn't wait on responses and only really executes the side effects, it is not that much. Especially given that this has comparable semantics to the Standard mode (not the Express mode) of Step Functions, which is charged by the number of state transitions and is not super cheap.
Yep, no busy waiting for calls and sleeps within the system. However, 'sideEffects', where you basically just commit the result of an external operation, still block the Lambda. But we're thinking about exposing a `fetch` API directly from the runtime (i.e., we do the call on your behalf) that could in theory sort that out.
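To illustrate why a journaled side effect still blocks the invocation, here is a toy sketch (plain Python with hypothetical names, purely for illustration - not the actual SDK, which is TypeScript):

```python
import time

def slow_external_call() -> str:
    time.sleep(2)              # stands in for a real HTTP request
    return "charge-123"

def side_effect(journal: dict, key: str, fn):
    """Journaled side effect: the handler itself runs fn and commits the result,
    so the invocation stays alive (and billed) while fn is in flight. A
    runtime-brokered fetch would instead let the handler suspend and be
    re-invoked once the result has been committed."""
    if key in journal:         # on replay the result is already committed
        return journal[key]
    result = fn()              # blocks this invocation until the call returns
    journal[key] = result
    return result

journal: dict = {}
print(side_effect(journal, "charge", slow_external_call))  # blocks ~2s, then commits
print(side_effect(journal, "charge", slow_external_call))  # replayed: returns instantly
```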
I'm one of the main contributors to Tanuki (formerly MonkeyPatch).
The purpose of Tanuki is to reduce the time to ship your LLM projects, so you can focus on building what your users want instead of MLOps.
You define patched functions in Python using a decorator, and the execution of the function is delegated to an LLM, with type-coercion on the response.
Automatic distillation is performed in the background, which can reduce the cost and latency of your functions by up to 10x without compromising accuracy.
The real magic, however, is alignment-as-code: you can use Python's `assert` syntax to declare the desired behaviour of your LLM functions. Because this is managed in code, and is subject to code review and the standard software lifecycle, it becomes much clearer how an LLM feature is meant to behave.
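Roughly, a patched function and its alignment look something like this (a simplified sketch; see the docs for the exact API):

```python
from typing import Literal, Optional
import tanuki  # pip install tanuki.py; needs an OpenAI API key at runtime

@tanuki.patch
def classify_sentiment(msg: str) -> Optional[Literal["Good", "Bad"]]:
    """Classify the message as having Good or Bad sentiment; None if neutral."""
    # No body needed: the docstring is the instruction the LLM executes,
    # and the return annotation drives type coercion of the response.

@tanuki.align
def align_classify_sentiment():
    # Alignment-as-code: plain asserts declare the desired behaviour and
    # live in the repo, reviewed like any other code.
    assert classify_sentiment("I love this library") == "Good"
    assert classify_sentiment("This is broken and useless") == "Bad"
    assert classify_sentiment("The sky is blue") is None
```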
Good to know, we'll make it more clear in the docs!
To answer regarding these two areas:
1) The data for fine-tuning is currently saved on disk for low-latency reading and writing. Both test statements and datapoints from function executions are saved to the dataset. We're aware that saving to disk is not the best option and limits many use cases, so we're currently working on persistence layers that allow S3 / Redis / Cloudflare to be used as external data storage.
2) Currently the fine-tuning job starts once the dataset has at least 200 datapoints from GPT-4 executions and align statements. Once fine-tuning is complete, the execution model for the function is automatically switched to the fine-tuned GPT-3.5 Turbo model. Whenever the fine-tuned model breaks the constraints, the teacher (GPT-4) is called upon to fix the datapoint, and that datapoint is saved back to the dataset for future iterative fine-tuning and improvements. We're also working on letting the user include a "test set" that can be used to evaluate whether the fine-tuned model achieves the required performance before switching it in as the primary executor of the function (a rough sketch of this flow is below).
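Here's a rough sketch of that control flow with stub functions standing in for the real model calls (illustrative only, not the actual internals):

```python
import random

FINETUNE_THRESHOLD = 200    # datapoints collected before the first fine-tune

# Stubs standing in for real model calls and checks (illustration only).
def call_model(model: str, prompt: str) -> str:
    return f"{model}:{prompt}"

def satisfies_constraints(output: str) -> bool:
    return random.random() > 0.1          # pretend ~10% of student outputs fail an align

def start_finetune(base_model: str, dataset: list) -> str:
    return f"ft:{base_model}"             # returns the id of the fine-tuned model

def execute(prompt: str, dataset: list, state: dict) -> str:
    """Use the fine-tuned student when available; fall back to the teacher
    whenever the student breaks a constraint, and keep the corrected
    datapoint for the next round of fine-tuning."""
    student = state.get("student")
    if student:
        output = call_model(student, prompt)
        if satisfies_constraints(output):
            return output
    output = call_model("gpt-4", prompt)  # teacher execution (or fix-up)
    dataset.append((prompt, output))
    if not student and len(dataset) >= FINETUNE_THRESHOLD:
        state["student"] = start_finetune("gpt-3.5-turbo", dataset)
    return output
```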
Hope this makes it clearer - if you have any additional questions, let me know!
IDEs shouldn't complain if the function has a docstring (which all the patched functions should have, as that's the instruction that gets executed) and the @patch decorator - at least the ones we've tried so far have been happy with that syntax. But adding a "pass" is also permissible if the IDE does complain.
The big one is a TypeScript implementation. Other than that, the plan is to support other models (e.g. Llama) that can be fine-tuned.
Finally, other persistence layers like S3 and Redis, to support running on execution targets (like AWS Lambda and Cloudflare Workers) that don't have persistent storage.
I think it could be really interesting to support Vercel more tightly too. We currently support Vercel with Python, but I think TypeScript + Redis would really enable serverless AI functions - which is where I think this project should go!
I understand the point. Ideally I'd like the name to keep an association with monkey-patching, as that is relevant to the behaviour of the package, but not be so similar that it shadows the technique of monkey-patching itself!