This is really cool! It's like Excel's goal seek but can also handle the case of arbitrary input cells. Goal seeek can only handle one input and one output cell.
But how do you handle the case where multiple variables can be changed? If multiple input cells is the key difference from Goal seek, i think some more rigor should be placed into the algorithm here
e.g. setting A1 + B1 and wanting the result to be 5. Currently it bumps both A1 and B1 equally. What's the thought process behind this?
> However, diving into a new caching approach without a deep understanding of our current system seemed premature
Love love love this - I really enjoy reading articles where people analyze existing high performance systems instead of just going for the new and shiny thing
I think it's pretty cool that the source code is all in a single file called app.js, and it's just doing simple DOM manipulations, no React, no minification, no libraries. I like to think it's just written like that too, a gigantic file that the author just iterates on.
And that's the "magic" that makes it so snappy and fast to load. I built a web-based game just like that and I am confident that my choice not to use any of the "modern web dev stack" is the reason I managed to hit my 60 FPS performance target on an iPhone 6s in 2024.
I'd have loved to, but because of its nature as a live game show with cash prizes, it required that the user signs up with their phone number, as a way to make botting and multi-account a bit harder. I know that wouldn't go over well with the audience here.
The game is no more, but maybe I could put together a little post talking about the more interesting problems I had to tackle, and showing the animations I am the most proud of. Maybe some day!
Thanks, and yes thats exactly how it happened. I just made it originally for myself and i like my own stuff to be fast. And then over the weekend i thought it could be nice to just publish it (with some win2000 theme over it)
Dhall has imports from URLs, much like Javascript. From their tutorial:
{- Need to generate a lot of users?
Use the `generate` function from the Dhall Prelude
-}
let generate = https://prelude.dhall-lang.org/List/generate
{- You can import Dhall expressions from URLs that support
CORS
The command-line tools also let you import from files,
environment variables, and URLs without CORS support.
Browse https://prelude.dhall-lang.org for more utilities
-}
let makeUser = \(user : Text) ->
let home = "/home/${user}"
let privateKey = "${home}/.ssh/id_ed25519"
let publicKey = "${privateKey}.pub"
in { home, privateKey, publicKey }
let buildUser = \(index : Natural) ->
{- `Natural/show` is a "built-in", meaning that
you can use `Natural/show` without an import
-}
makeUser "build${Natural/show index}"
let Config =
{ home : Text
, privateKey : Text
, publicKey : Text
}
in {- Try generating 20 users instead of 10 -}
generate 10 Config buildUser
Obv it depends, but one way to "solve" this it is to show an edit history, or at least the latest edit timestamp along with some visual indicator that the message was edited recently
Having dealt with this problem at work for several years now, I feel the pain of keeping different clients in sync - it's extremely difficult. Not sure if it's possible in Matrix, but consider having a message ID that increments by one on every message in a room. That lets the client know pretty quickly if there's a gap or a misordering.
Not really getting this point though:
The /sync API returns events in an order "according to the arrival time of the event on the homeserver".
The spec for /messages says it returns events "in chronological order. (The exact definition of chronological is dependent on the server implementation.)".
Why would those two return different results? When does the chronological order of two messages differ from the arrival time of the event on the homeserver?
What I think you're missing is that Matrix runs as a distributed system. There's no central authority to assign IDs to messages, and it's possible for a single group chat to run in a split-brain configuration if two homeservers lose connectivity to each other. When those homeservers reconnect, users connected to each one will see messages appear "in the past" which were sent by users on the other side of the split.
Perhaps I'm wrong about how Matrix works, but my understanding was that at least public rooms still had a "primary" homeserver, like for example I can connect to #debian:matrix.org from any number of federated servers but matrix.org is still where that room "lives".
If that understanding is correct, then IMO the answer is simply that the canonical timeline is what that server says it is. Poorly connected users or those on other servers experiencing issues or delays with federation may temporarily see a different sequence of events but once everyone's had a chance to sync back up the state should generally be what the primary server for the room saw it as.
Perhaps there should be some sort of flag for "this message has been reordered during a resync" that clients which initially had a different state due to whatever reason could store to make it clear what happened, and likewise if the central homeserver receives messages with a timestamp significantly off real time it could flag those messages as possibly having been received out of order while still displaying them in the order they were received.
AFAIK there is no primary homeserver. The human-readable name of a channel has a homeserver part, but this is only for discovery purposes and maps to an alphanumeric random ID.
Split-brain scenarios can be resolved using an odd number of nodes (or voters) to achieve a majority consensus to agree on the state of the system, stopping the services on the minority side to prevent conflicting operations. Once communication is restored, the stopped nodes can rejoin the cluster and synchronize their data. Vector clocks are a great abstraction for ensuring correct ordering as well.
yeah, having eventual consistency for messages across homeservers makes the work on the client harder. I guess they just have to accept that messages will "appear in the past" as you said.
But at least for messages sent within the same homeserver, I would think that those two apis should return the same data
I think you basically want a partial order for federated chat: messages should arrive after the messages that cause them but not necessarily after messages that didn’t cause them. In the case of a network partition, this allows people on either side of the partition to continue communicating at the cost of non-determinism when the partition is resolved.
I'd maintain that an important property is for the system to be eventually consistent with regards to history. You don't want a transient network event to potentially result in two users permanently seeing messages in a different order.
You can, but it results in the situation the article is complaining about.
During a netsplit, people chatting on opposite sides of the netsplit continue to be able to chat (by design), but will (obviously) see a different history from each other. So when the netsplit heals, you have a dilemma: either you splice the history from the other side in, giving eventual consistency at the cost of changing the history that people have already read, or you keep permanently different histories on servers that were on one side or the other.
You could put the other side into something that looks visually like a thread. Each side will have a different history. They will also have a marker that says the history was split here and click here to view the other side.
If you can come up with a good design for what a client that does that should look like, and what information it would need from the server to do that, please do write it up and publicise it. I think ultimately something like that has to be the solution, but it would have to be actually fleshed out into something that's possible to implement.
I'm pretty sure this is actually impossible in a distributed system with independent operation, and if it were possible, it would be terrible UI anyway.
Problem one is if you want to order events chronologically, you need to precisely decide what the time of the event means. Probably not the time the client hit send, because you can only measure that on the client and client clocks are at best approximately accurate. You could consider the time the server received it, and assume your server times are accurate, but that's still problematic because even in a well functioning system, if a user sends message A to server.wdc around the same time as a user sends message B to server.lax, users connected to server.wdc will get A then B, and users connected to server.lax will get B then A, and this leads to problem two:
Problem two is messages generally display in order of receipt. If you get a message that slots in earlier in the thread, you may need to scroll up to see it. In a busy theead, it's going to be hard to read all the messages because of the back and forth. If you send a message, it may need to be reordered too. If you go back to the thread later, new messages may be in different places. This is more disorienting IMHO than different message orders for different viewers.
Problem two gets even worse when you don't just have distance between servers, but also some network or other operational issues. If a server accepts a message, but is unable to forward it immediately, you probably want it to forward it whenever it can... if there's a significant delay, now the message is again going to be displayed in a place where it's difficult to see.
You can kind solve this by forcing messages to a group to go through a single queue which forces an ordering, but that makes accepting messages for a group a lot more difficult.
As long as the speed of light remains constant for all observers, who cares if everyone agrees on simultaneity? Distributed systems don't need to know what time it is, just what happened.
Well known systems are implemented this way, and the UI is great, people barely even notice.
The problem is that we don't just want to know what events happened, but also the order of events.
Between the posters in this group, we've got some ideals we'd like to meet
a) messages should be displayed as soon as possible when they arrive (this one isn't written much, but I think it's generally agreed)
b) messages should be displayed in a globally consistent order
c) message order should not change after display / newly arrived messages must be at the bottom/top
Unfortunately, we can't meet all these ideals unless instant messaging becomes actually instant instead of just pretty fast messaging subject to the speed of light and other sundry delays.
You can get all the properties if you serialize messages through some single queue somewhere, but of course that means additional delay and a spof.
You can most likely get b and c if you compromise a, and just don't display messages for long enough that you probably have everything and can order it according to the gloablly consistent ordering algorithm.
If you compromise b, you can definitely do a and c. Messages go to the end when received. Easy peasy. You can't leave placeholders for messages that will be sorted earlier but haven't arrived yet, because you generally won't know they exist until they arrive; although there are some cases where you could know. If a user receives message X and sends message Y, but due to delays and what not, you receive Y before X ... Y could indicate the presence of X, and a reciever who gets Y first could reserve a place for X... but that doesn't work for simultaneous messages.
If you compromise c, you can do a and b, messages are inserted into the ordered list on arrival based on the consensus ordering algorithm. Easy enough.
Of course, there's not widespread agreement on b and c. So half of the thread is people saying b is clearly non-negotiable, another half is saying c is clearly non-negotiable, and the other half is saying why can't we just have everything we want.
Give up on the idea of "the timestamp" of an event. There is no such thing. Clocks are unreliable, and even if every clock in a system is perfectly in-sync (via atomic transponders or whatever) they're still subject to speed-of-light discrepancies that make it impossible to define "the time" of any event.
Two nodes separated by 10000km require ~33ms to send information in one direction, and ~66ms to do a roundtrip. Send X=1 to node=A from a client that's 5ms away from A at client-local time T1, and then send X=2 to node=B from a client that's 4ms away from B at client-local time T1-1ms -- when were these values sent, and what is the value of X? There is no answer, X is both 1 and 2, depending on when and who you ask.
You can define a leader node C, which receives updates from child nodes A and B, and that leader node can serialize updates in a way that produces a single linearizable sequence of updates, sure. But then that sequence of updates as defined by C needs to be propagated to child nodes A and B, which takes (let's say) 66ms round-trip minimum. So when your client sends X=1 to node=A, it has to wait for at least 66ms before it can make a correct read from that same node -- X may actually be 2!
Logical ordering of events in a distributed system is a solved problem. The solution is vector clocks (or something like them).
> Logical ordering of events in a distributed system is a solved problem. The solution is vector clocks (or something like them).
Vector clocks don't solve this problem, as they only provide a partial ordering. When multiple messages are in flight in the same 'simultaneity window', different observers may receive them in different orders and vector clocks can't determine a consistent order of those events.
Vector clocks could be used determine the order is inconsistent. But what do you do with that information?
You might likely have follow on events where A and B are unordered, and A1 was sent only seeing A, B1 sent only seeing B, and C was sent seeing A and B without seeing A1 and B1. It only gets more complex from there.
I'm not a UX person, but I can't imagine how to show this to users without causing massive confusion and information overload. There's a very small set of people that have studied distributed communication that would get this. And I haven't seen any UI that shows similar information in a coherent way... maybe git graphs, but I don't see how you make that fit on a phone screen where you're having a group chat.
Maybe just some indicator that says other people may see these messages in a different order, but then if it's not an immediately obvious signal, it's not really going to help users understand.
> When multiple messages are in flight in the same 'simultaneity window', different observers may receive them in different orders and vector clocks can't determine a consistent order of those events.
That's right! Vector clocks only provide partial order. But partial order is the only actual truth in any distributed system. Total order is a fiction, which only exists in the context of a specific node, based on that specific node's experience of reality (message receipt).
In any non-trivial distributed system, there is no consistent (total) order of events, at least not without a consensus protocol. There are lots of ways to hack a (fake) total order, and many of those approaches are enormously successful nearly all of the time. But, still, you know.
I don't understand why you seem to agree with me but also said
> Logical ordering of events in a distributed system is a solved problem. The solution is vector clocks (or something like them).
A partial order solves the problem when messages are not simultaneous and so the partial order provides a total order. That there is in fact no total ordering of simultaneous messages doesn't solve the problem that users would like to have messages arrive in a consistent order without delay. This is unsolved, because it's unsolvable, therefore vector clocks aren't the solution.
Vector clocks provide a deterministic partial order, but partial order doesn't provide any kind of total order. That's true, yes.
But (as I'm sure you're aware) there is no single deterministic total order of events in a distributed system. The system can assert some specific total order, based on some specific criteria -- say, LWW based on node identity -- but that's system-specific and arbitrary.
That "users would like to have messages arrive in a consistent order without delay" is a nice and valid expectation, but is literally impossible, in the general case. Vector clocks solve a lower-level problem; nothing can solve the higher-level problem.
There are relatively straightforward decentralized consensus algorithms for ordering events if we assume cooperation. If we assume malicious peers, then we're in the space of the byzantine generals problem, but there are solutions to that too.
Now there's some property that you have to give up, for example an immutable ordering. You might think the message came in one order, then reconnect with the network and discover the order was flipped. So long as the UI can handle that an update, there are consensus algorithms that will deliver a consistent view even in the edge cases.
I feel like when an important message comes in out of sequence, but you had already sent a response to the chat with what was visible at the time, it will be very confusing when that gets reordered.
Ex:
A@T0: User X is abusing our service, we should send them a sternly written letter.
B@T60: Yes, I'll do it right away.
C@T2 (received later): No, we should just shadowban them.
When B sent their message, their intent was clear to them. But when they review their message after C's message is received, if the display ordering is changed, the meaning of the communication has changed, and how can B show that sending the warning was reasonable when they clearly said they were going to shadowban the user. (Maybe this group should use something else with a guaranteed ordering to track abuse and response, but that's a different question)
If C's message is displayed earlier than B's in some cases and not others, that makes for a confusing situation, but each person can look at their messages and easily see what they saw when they argue about a breakdown in communication in the aftermath.
What's mysterious here? One ordering is dictated (arrival time), another left for the consideration of the server (likely allowing for stuff like pinned messages, etc, that break the strict ordering).
If a Matrix server allows to delete messages (by the poster or by a moderator), then increasing IDs with no gaps become impossible. If the server allows editing of existing messages, then a sequence with no gaps is not sufficient to reflect all changes. Ideally a server does not do either, but uses more messages to augment existing messages, or mark some as deleted; with that, a sequence with no gaps would suffice.
I really wish there was another platform than the app formerly known as Twitter, where AI discourse is thriving. Because right now, it seems that AI papers, discussions and articles are all being shared there first. Threads might be a competitor for that but I don't see that happening anytime soon.
But how do you handle the case where multiple variables can be changed? If multiple input cells is the key difference from Goal seek, i think some more rigor should be placed into the algorithm here
e.g. setting A1 + B1 and wanting the result to be 5. Currently it bumps both A1 and B1 equally. What's the thought process behind this?