I like to use uuid5 for this. It produces unique keys within a given namespace (itself defined by a UUID), and it always produces the same output ID for the same input key.
This has a number of nice properties:
1. You don’t need to store keys in any special way. Just make them a unique column in your db and the db will detect duplicates for you (and you can provide logic to handle them as required, e.g. ignoring the duplicate if the other input fields are the same, or raising an error if a message has the same idempotency key but different fields).
2. You can reliably generate new downstream keys from an incoming key without the need for coordination between consumers, getting an identical output key for a given input key regardless of consumer.
3. In the event of a replayed message it’s fine to republish downstream events, because the system is now deterministic for a given input: you’ll get identical output (including generated messages) for identical input, and generating duplicate outputs is not an issue because they will be detected and ignored by downstream consumers.
4. This parallelises well because consumers are deterministic and don’t require any coordination except by db transaction.
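A minimal sketch of the derivation, assuming Rust and the `uuid` crate (with the v4/v5 features enabled); the namespace name is made up:

```rust
use uuid::Uuid;

/// Derive a downstream event id deterministically from the incoming
/// idempotency key: same input key, same output id, on every consumer,
/// with no coordination needed.
fn downstream_event_id(namespace: &Uuid, incoming_key: &Uuid) -> Uuid {
    Uuid::new_v5(namespace, incoming_key.as_bytes())
}

fn main() {
    // Fixed namespace for this consumer's output events (hypothetical name).
    let ns = Uuid::new_v5(&Uuid::NAMESPACE_DNS, b"orders.example.com");

    // The idempotency key that arrived on the message.
    let incoming = Uuid::new_v4();

    // A replay of the same message derives the same downstream id, so a
    // UNIQUE constraint on that column is all the de-duplication you need.
    assert_eq!(
        downstream_event_id(&ns, &incoming),
        downstream_event_id(&ns, &incoming)
    );
}
```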
This doesn't sound good at all. It's quite reasonable in many applications to want to send the same message twice, e.g. "Customer A buys N units of Product X".
If you try to disambiguate those messages using, say, a timestamp or a unique transaction ID, you're back where you started: how do you avoid collisions in those fields? Better to use a random UUIDv4 in the first place.
You don’t generate the key based on the message contents; rather, you use the incoming idempotency id.
Customer A can buy N units of product X as many times as they want.
Each unique purchase you process will have its own globally unique id.
Each duplicated source event you process (due to “at least once” guarantees) will generate the same id as the other duplicates, without needing to coordinate between consumers.
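Roughly, under the same assumptions as the sketch above (`uuid` crate, made-up namespace), the derived id follows the source event’s idempotency key, not the message contents:

```rust
use uuid::Uuid;

fn main() {
    let ns = Uuid::new_v5(&Uuid::NAMESPACE_DNS, b"orders.example.com");

    // Two separate purchases with identical contents ("Customer A buys N of X")
    // arrive with different idempotency keys, so they derive different ids.
    let first_purchase = Uuid::new_v4();
    let second_purchase = Uuid::new_v4();
    assert_ne!(
        Uuid::new_v5(&ns, first_purchase.as_bytes()),
        Uuid::new_v5(&ns, second_purchase.as_bytes())
    );

    // A redelivery of the first purchase carries the same key, so it derives
    // the same id and downstream consumers drop it as a duplicate.
    assert_eq!(
        Uuid::new_v5(&ns, first_purchase.as_bytes()),
        Uuid::new_v5(&ns, first_purchase.as_bytes())
    );
}
```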
I used an LLM to help write the following up, as I’m still pretty scattered about the idea and I’m on mobile.
——
Something I’ve been going over in my head:
I used to work in a pretty strict Pivotal XP shop. PM ran the team like a conductor. We had analysts, QA, leads, seniors. Inceptions for new features were long, sometimes heated sessions with PM + Analyst + QA + Lead + a couple of seniors. Out of that you’d get:
- Thinly sliced epics and tasks
- Clear ownership
- Everyone aligned on data flows and boundaries
- Specs, requirements, and acceptance criteria nailed at both high- and mid-level
At the end, everyone knew what was talking to what, what “done” meant, and where the edges were.
What I’m thinking about now is basically that process, but agentized and wired into the tooling:
- Any ticket is an entry point into a graph, not just a blob of text:
  - Epics ↔ tasks ↔ subtasks
  - Linked specs / decisions / notes
  - Files and PRs that touched the same areas
- Standards live as versioned docs, not just a random Agents.md:
  - Markdown (with diagrams) that declares where it applies: tags, ticket types, modules.
  - Tickets can pin those docs via labels/tags/links.
- From the agent’s perspective, the UI is just a viewer/editor.
  - The real surface is an API: “given this ticket, type, module, and tags, give me all applicable standards, related work, and code history.”
- The agent then plays something like the analyst + senior engineer role:
  - Pulls in the right standards automatically
  - Proposes acceptance criteria and subtasks
  - Explains why a file looks the way it does by walking past tickets / PRs / decisions
So it’s less “LLM stapled to an issue tracker” and more “that old XP inception + thin-slice discipline, encoded as a graph the agent can actually reason over.”
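Roughly the shape I have in mind, as a purely hypothetical sketch (every type and name here is made up; nothing refers to a real product or library):

```rust
#![allow(dead_code)]

// Hypothetical types for the agent-facing context API described above.

struct ContextQuery {
    ticket_id: String,
    ticket_type: String, // e.g. "feature", "bug"
    module: String,      // e.g. "billing"
    tags: Vec<String>,   // labels pinned on the ticket
}

struct StandardDoc {
    version: String,
    markdown: String,        // the doc body, diagrams included
    applies_to: Vec<String>, // declared scope: tags, ticket types, modules
}

struct ContextBundle {
    standards: Vec<StandardDoc>,  // versioned standards whose scope matches
    related_tickets: Vec<String>, // epics / tasks / subtasks linked in the graph
    related_prs: Vec<String>,     // PRs and files that touched the same areas
    decisions: Vec<String>,       // linked specs / decision records / notes
}

// The one call the agent really needs: resolve a ticket into everything that
// applies to it, so it can propose acceptance criteria and subtasks.
fn resolve_context(_query: &ContextQuery) -> ContextBundle {
    // A real implementation would walk the ticket graph and filter standards
    // by their declared scope.
    unimplemented!()
}

fn main() {}
```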
Has any project tried forcing a planning layer as //TODO comments all throughout the code before making any changes? Small loops, like one //TODO at a time? What about limiting changes to one function at a time to stay focused? Or is everyone stuck with however the model was designed, and right now they're designed only for giant one-shot generations?
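To illustrate what I mean by a //TODO planning layer, a hand-written sketch (not from any existing tool):

```rust
struct Record;
struct ImportError;

// First pass: the model only writes the plan as TODOs, no behaviour yet.
// Each later loop resolves exactly one TODO, keeping every diff small.
fn import_csv(_path: &str) -> Result<Vec<Record>, ImportError> {
    // TODO: open the file and map IO errors into ImportError
    // TODO: parse the header row and validate the expected columns
    // TODO: parse each data row into a Record, collecting per-line errors
    // TODO: return the records, or the first hard error
    todo!()
}

fn main() {}
```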
Is it possible that all local models need to get better is more context, used to make simpler, smaller changes one at a time? I haven't seen enough specific comparisons of how local models fail vs the expensive cloud models.
Rust is the most defect-free language I have ever used.
I'd wager my production Rust code has 100x fewer errors than comparable JavaScript, Python, or even Java code.
Between Result<T,E>, Option<T>, match, if let, `?`, and the rest of the error handling and type system, it's very difficult to write incorrect code.
The language's design objective was to make it hard to write bugs. I'd say it succeeded with flying colors.
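A small made-up example of what that looks like in practice (a hypothetical config lookup, not from any particular codebase):

```rust
use std::collections::HashMap;
use std::num::ParseIntError;

// Look up and parse a port number from a config map. Option<T> forces the
// missing-key case to be handled, Result<T, E> the parse failure, and `?`
// propagates the error with no unchecked path left over.
fn port(config: &HashMap<String, String>, key: &str) -> Result<Option<u16>, ParseIntError> {
    match config.get(key) {
        None => Ok(None),                    // key absent: not an error here
        Some(raw) => Ok(Some(raw.parse()?)), // malformed value: propagated to the caller
    }
}

fn main() {
    let mut config = HashMap::new();
    config.insert("port".to_string(), "8080".to_string());

    // The compiler keeps every outcome explicit at the call site too.
    if let Ok(Some(p)) = port(&config, "port") {
        println!("listening on {p}");
    }
}
```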
Now try an actual functional programming language. I like Rust too, but those features all come from FP, and FP languages have even more features like that which Rust doesn't yet have, or can't have.
> having people spend time prepping good commits is potentially time wasted if nobody ever looks at the PR commit history
Good habits make good engineers.
You never know which of your commits will cause a future problem, so structuring all of them well means that when you need to reach for a tool like git bisect, your history makes it easy to find the cause of the problem.
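Concretely, with a clean history the hunt can even be automated; the tag and test command below are just placeholders:

```sh
# Mark a known-bad and a known-good commit, then let git walk the history.
git bisect start
git bisect bad HEAD
git bisect good v1.2.0      # placeholder: any commit you know was fine

# Run the test suite at each step; exit code 0 marks a commit good.
git bisect run cargo test

# Return to where you started once the offending commit is reported.
git bisect reset
```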