I'm an EEE, and one of the strange things I've noticed is that, for whatever reason, people from a CompSci background really, really have trouble building and understanding even simple state machines.
I suspect this unintuitiveness is the main reason why we don't see more state machines, as even in the case where they're an optimal solution many SWEs will shy away from them.
Locally-rendered UIs are probably one of the best examples of where state machines shine. State-machine based ImGuis lower complexity by orders of magnitude, yet, people still resist.
In software, it’s rare that two states can be transitioned between synchronously. Or at least, modeling a problem in terms of states that have synchronous transitions is hard. Waiting for user input, making a network request, disk IO, etc. are all asynchronous. This makes it annoying to design a state machine.
Just as an example, imagine you have two states you want to model, A and B. Let’s say the transition between A and B requires 3 async operations. You then need 8 total states: A, A_before_step_1, A_before_step_2, A_before_step_3, B, B_before_step_1, B_before_step_2, and B_before_step_3. It just becomes totally unmanageable.
> This makes it annoying to design a state machine.
Annoying, sure, but necessary, right? With or without a state machine you have to handle all the states. If you don't bother quantifying all the possible states of your model and just wing it, those extra transition states don't magically disappear. Instead you just run into weird multithreading bugs, and after banging your head against the wall for 5 hours you realize you forgot about a transitory state.
And just to clarify, I don't mean handling every single possible state, because that's most likely infinite. But if you specify a finite number of states your app can be in, and mark the invalid transitions so they take you to an invalid state, then you can detect immediately when your app is in an invalid state instead of having the weird bugs crop up randomly.
No, not necessary. The key insight here is that the transitions are the hard part. One way to solve this is to write your transitions as a series of asynchronous steps that feed into each other. The end result is an asynchronous chain of operations.
E.g. if you have a video app, playing a video might look like:
1) Read the user metadata from disk.
2) Use the user metadata to make a network request for the video metadata.
3) Deserialize the video metadata on a background thread.
4) Present the video player with the video metadata on the main thread.
You compose each step into a chain of operations. Each step in the chain contains async work as well as logic that “cleans up” the work. So when you want to play a video, your code says “start the complicated video chain.” When you want to stop the video, your code says “stop the video chain.” The chains are a generic sequence of steps, so they can be stopped in the middle. Each step contains its own cleanup logic, so the cleanup of any pending work can happen automatically upon stopping the chain.
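A minimal sketch of this chain-of-operations idea in Python's asyncio (the step names are made up for illustration; a real implementation like Rx would be richer). Each step returns its result plus a cleanup callback, and stopping the chain cancels the pending work and runs the completed steps' cleanups automatically:

```python
import asyncio

class Chain:
    """A cancellable sequence of async steps; each step returns its
    result plus a cleanup callback for undoing its work."""
    def __init__(self, steps):
        self.steps = steps
        self.cleanups = []
        self.task = None

    def start(self):
        self.task = asyncio.ensure_future(self._run())
        return self.task

    async def _run(self):
        value = None
        try:
            for step in self.steps:
                value, cleanup = await step(value)
                self.cleanups.append(cleanup)
        except asyncio.CancelledError:
            # Stopping the chain runs each completed step's cleanup,
            # in reverse order, automatically.
            for cleanup in reversed(self.cleanups):
                cleanup()
            raise
        return value

    def stop(self):
        if self.task is not None:
            self.task.cancel()

# Hypothetical steps standing in for the disk read / network fetch.
log = []

async def read_metadata(_):
    log.append("read metadata")
    return "meta", lambda: log.append("cleanup: metadata")

async def fetch_video(meta):
    await asyncio.sleep(10)  # simulated slow network request
    return meta + "+video", lambda: log.append("cleanup: fetch")

async def demo():
    chain = Chain([read_metadata, fetch_video])
    task = chain.start()
    await asyncio.sleep(0)   # let the chain reach the network step
    chain.stop()             # user navigated away mid-transition
    try:
        await task
    except asyncio.CancelledError:
        pass
    return log

result = asyncio.run(demo())
```

The caller only ever says "start the chain" or "stop the chain"; the bookkeeping about which step was in flight lives inside the chain, not in a pile of named intermediate states.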
tldr: the “state” the user is in at any given time is so complex that it’s impossible to model with a state machine. It’s easier to think in terms of cascading asynchronous steps instead. A good library for writing this kind of code is Rx.
It's been a while since I wrote any Rx in earnest, but when I did, it was useful to look at it as almost exactly what you'd end up with if you incorporated time (via ordered streams/publishers/observers/etc.) into a model representing a state machine.
That is to say, you can kinda both be right here. ;)
This is one case where, oddly, you can actually get a lot better mileage by introducing a new state machine. You have the states A and B, but you also have the state machine for the transition, made up of T1, T2, and T3. In this world, you have 5 things you need to be able to render/model: A, B, T1, T2, and T3. That the T states will be modeled/rendered inside either A or B is usually irrelevant.
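One way to sketch that nesting (all names here are illustrative): the outer machine knows only A, B, and "transitioning", and the transition itself is a small inner machine stepping through T1, T2, T3:

```python
from enum import Enum, auto

class Outer(Enum):
    A = auto()
    TRANSITIONING = auto()
    B = auto()

class Step(Enum):
    T1 = auto()
    T2 = auto()
    T3 = auto()

class Machine:
    """Outer machine knows only A / TRANSITIONING / B; the transition
    is an inner machine stepping through T1 -> T2 -> T3."""
    ORDER = [Step.T1, Step.T2, Step.T3]

    def __init__(self):
        self.outer = Outer.A
        self.step = None  # set only while TRANSITIONING

    def begin(self):
        self.outer = Outer.TRANSITIONING
        self.step = Step.T1

    def advance(self):
        i = self.ORDER.index(self.step)
        if i + 1 < len(self.ORDER):
            self.step = self.ORDER[i + 1]
        else:
            self.outer = Outer.B   # inner machine finished
            self.step = None

m = Machine()
m.begin()
m.advance()  # T1 -> T2
m.advance()  # T2 -> T3
m.advance()  # T3 done -> B
```

Note you name 3 + 3 = 6 values here instead of hand-writing all the A_before_step_N / B_before_step_N product states, and the inner Step machine is reusable for other transitions.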
Even better, if you picked the right abstraction for the T state machine, that same set of modeling can be used elsewhere. (Note, no panacea. You can still pick the wrong abstractions here. Or one that just doesn't help you later, even if it isn't wrong.)
And this is exactly how it is in the physical world. The "payment acceptor" state machine of a vending machine is independent of the vending aspect of the vending machine. It is there solely to create the signal of "payment received" to the machine. Whether that comes from coins, credit cards, phone taps, whatever.
Those regexes that are ubiquitous are called regular expressions because they describe “regular languages” (qv), which are recognized by nondeterministic finite automata, which are converted to deterministic finite automata by fun with powersets, which are compiled into software state machines behind the abstraction of the regex libraries.
So they are not at all rare but not exposed to programmers explicitly.
Almost all "regexes" in programming languages are not actually regular expressions and do not have the properties you describe. (I'd argue that this is one of the clearest demonstrations that programmers do not put a lot of value on state machines.)
Yes, it's definitely a different thing when talking about ‘extended regex’ libraries, with respect to regular languages (the language class) and their associated recognizers.
Surely only if you need to name each state and define every transition. However, a sparse representation would be useful here, right?
If some transitions are fundamentally impossible (or errors), then only define the transitions that make sense and have the rest map to an error condition if they ever come up.
Also, if the state space is factorisable (like in your example) then it is easy to represent. You never need to name every state separately if e.g. you use 2 enums (or more generally a GADT) instead of trying to fit it all into 1 variable for every possible state.
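A sketch of that factorisation (enum names invented for illustration): two small enums whose pair covers the whole product space, instead of one enum that names every combination:

```python
from enum import Enum, auto

# Factorised state: two small enums instead of one big product enum.
class Screen(Enum):
    A = auto()
    B = auto()

class Transfer(Enum):
    IDLE = auto()
    DISK_READ = auto()
    NETWORK = auto()
    DESERIALIZING = auto()

# The full state space is the pair (Screen, Transfer): 2 * 4 = 8
# combinations, but we only had to name 2 + 4 = 6 values, and adding
# a screen C costs one new name, not four new product states.
state = (Screen.A, Transfer.IDLE)
combinations = len(Screen) * len(Transfer)
```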
No, you need to represent every state. Like let’s say there are two async operations: a disk read, then a network request. When you transition, you need to know: did I start the disk read? If yes, cancel the disk read if it’s pending. Did the network request start? If it started but didn’t finish, then cancel it. So you can see that there is a huge amount of state you need to operate on that’s not directly captured by states A and B.
If you need recoverability from every state to A or B, then it’s not a 2-state (with intermediates) problem and you do need to go through the complexity of modelling all 6 states.
Also, the full state representation is not the problem; that’s easy enough to do with a reasonable type system. The problem is defining the transitions. But if you need the full transition matrix (or even any substantial subset), then you have a lot of things to do. Those are things you’d have to do with a non-state-machine-based approach too.
Realistically, the only overhead a sparse state machine representation adds is the “catch-all” error branch you need to handle any forbidden transitions. Everything else you’d need to deal with anyway for normal operation. And you probably want the catch-all branch regardless, for debuggability when it breaks.
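A minimal sketch of a sparse table with that catch-all branch (state and event names are made up; a real app would carry payloads on events):

```python
from enum import Enum, auto

class S(Enum):
    IDLE = auto()
    LOADING = auto()
    PLAYING = auto()
    ERROR = auto()

# Sparse table: only the legal transitions are listed.
TRANSITIONS = {
    (S.IDLE, "play"): S.LOADING,
    (S.LOADING, "loaded"): S.PLAYING,
    (S.LOADING, "cancel"): S.IDLE,
    (S.PLAYING, "stop"): S.IDLE,
}

def step(state, event):
    # The one extra branch: any (state, event) pair not in the table
    # is a forbidden transition and lands in ERROR immediately,
    # rather than surfacing later as a weird bug.
    return TRANSITIONS.get((state, event), S.ERROR)
```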
Yeah, I’ve wrestled with this, too. Usually I’ve compromised by having sub-states (Swift enums with associated types alleviate some of the pain), but it still doesn’t feel quite right.
Finite state machines are an ideal mechanism for dealing with concurrency if you put a concurrent queue in front of them. Such a program is easily coded with Akka's FSM trait, which even adds the ability to trigger transitions based on timers. As soon as you are in the realm of time-sensitive programs, e.g. because of network requests, this is a very powerful and easy-to-understand way to program the complicated parts of programs.
I would just consider all those intermediary states as one “loading” state. It is pretty simple to have a “loading” and “error” state in addition to your other ones.
If you do that then you can't pipeline the response handling and you can't show loaded data until you fully transition to B. Any error would equate to every error.
You could have your loading state handle every little detail but then what was the point of the state machine anyway?
Can’t you still just model A and B, and have your transition logic compute and wait for whatever is necessary to make the full transition? If you keep the ephemeral stuff in auxiliary memory, you can keep your number of states more manageable.
No, because resources are limited. Like if a user indicates they want to go from A to B, and then halfway through the transition decides they want to go back to A, then you need to cancel and clean up everything for the transition to B. Otherwise you’re wasting battery, network bandwidth, main thread cycles, etc. on useless work.
They do not scale well. It is why Elm-style explicit message passing UIs are not popular. State machines work for state localised to a widget, but the moment you scale up to something moderately complex (spreadsheets, datagrids, forms, etc.), your diagrams balloon in size. For non-safety-critical applications, it is better to have implicit state and potential bugs than to have your engineering team waste time proving the correctness of a button when they could be building features instead.
SWEs (at least those with formal CS training) understand state machines perfectly well. They are the kernel of regular expressions and parsing. Mapping them to a product UI just doesn't make sense all the time.
> but the moment you scale up to something moderately complex (spreadsheets, datagrids, forms etc.), your diagrams balloons in size
I'm no CS major. But isn't that what hierarchical state machines are supposed to handle? Or was that all hat and no cattle, and/or lost knowledge when UML-style stuff got tossed away?
A hierarchical state machine is roughly object-oriented programming in the SWE world: you'd have one state machine in each object. It's just rare to see that, because even individual objects tend to have practically non-finite states, like counters and such, and to be too complex in general for FSMs.
Maybe there's some math-ey way of saying everything we do is an FSM, but I don't think of it that way.
Having worked on a large hierarchical state machine implementation (using statecharts), I can say that nested FSMs help a bit, but are still nowhere close to capturing the necessary complexity. 80% hat, only 20% cattle.
I believe the intuition of most programmers stems from development environments being designed around ease of mutable state. If you want to "add state", you just grab some variable and branch on it, done.
Thus all the instances where you want to enumerate your state come off as unnecessarily formal, so the code drives straight towards a ball of mud, and the programmers, looking for a way out, reach for abstract techniques to eliminate the state (pure functions, dataflow, constraints, etc.) or sweep it under the rug (objects and configuration mechanisms) instead of "cleaning their room" by drawing up a giant decision table that enumerates all possible transitions.
I've made the transition table. It works. It produces a painful "rip off the bandaid" moment in that it forces out more of the specification all at once instead of letting it accumulate iteratively. I'm pretty sure this makes it politically undesirable in many orgs.
I was trained as an EE and occasionally use a state machine in my code. Every other time I've encountered a state machine it was also written by someone trained as an electrical engineer or similar.
I tend to use them a lot when dealing with hardware (I'm an embedded software dev, so that happens reasonably often). Most hardware devices have some sort of state machine you need to keep track of. So hardware drivers tend to model the state machine, to track what the hardware is doing.
This was my experience as well. I could've gone my entire CS education without even needing to think about state machines outside of a short discussion in a formal methods class and regex, but taking a digital logic class really opened my eyes to the power of state machines. They clarify so much of software design by making implicit and ad hoc parts of the design explicit. It really is a shame that most CS degrees don't seem to give state machines the credit they deserve.
I've had the same experience ... The other fun thing that blows minds is using Karnaugh maps to minimize the logic needed in complex conditional statements.
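A toy illustration of the kind of simplification a Karnaugh map gives you (the condition is made up, and the equivalence is checked by brute force rather than derived here):

```python
from itertools import product

# A tangled condition as it might appear in code:
def messy(a, b, c):
    return (a and b and not c) or (a and b and c) or (a and not b and c)

# Plotting those three minterms on a Karnaugh map shows two
# overlapping groups, giving the minimized form:
def minimized(a, b, c):
    return (a and b) or (a and c)

# Brute-force check over all 8 input combinations.
equivalent = all(
    messy(a, b, c) == minimized(a, b, c)
    for a, b, c in product([False, True], repeat=3)
)
```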
Usually your state is more complicated than just one enum. You'll have counters, buffers, etc.
The only time I've ever used a state machine in real code is when parsing text one token at a time or something like that. It's a useful skill to remember for those situations. And this kind of problem is very over-represented in SWE interviews.
Here's a real-world example from my previous job. A process makes two simultaneous requests (we're on a tight deadline here). Here are the enumerated states we have to code for:
send 1,2 -> receive 1, receive 2, process
send 1,2 -> receive 2, receive 1, process
send 1,2 -> receive 1, timeout 2, process, receive 2 (which is ignored)
send 1,2 -> receive 1, timeout 2, process (2 is lost)
send 1,2 -> receive 2, timeout 1, process, receive 1 (which is ignored)
send 1,2 -> receive 2, timeout 1, process (1 is lost)
send 1,2 -> timeout 1,2, process, receive 1,2 (which are ignored)
send 1,2 -> timeout 1,2, process, receive 2,1 (which are ignored)
send 1,2 -> timeout 1,2, process, receive 1 (which is ignored; 2 is lost)
send 1,2 -> timeout 1,2, process, receive 2 (which is ignored; 1 is lost)
send 1,2 -> timeout 1,2, process (1,2 are lost)
You might think that after the process step, we can ignore anything we receive, but we need to make sure we ignore the timed out replies, because this "state machine" is running a few dozen/hundred times a second, depending upon load.
As I was leaving, there was talk of making a third concurrent request. States start exploding here. Maybe if there were languages that made this easy to implement they would be used more often (the code in question was in a mixture of C89/C++98).
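One way to keep those cases tractable (a sketch with hypothetical names, not the original C89/C++98 code): tag each round of requests with a generation counter, so late replies from an earlier, timed-out round are recognized and ignored instead of corrupting the current round:

```python
class TwoRequestRound:
    """One round of two concurrent requests with per-request timeouts."""

    def __init__(self):
        self.generation = 0
        self.pending = set()
        self.results = {}

    def send_both(self):
        self.generation += 1
        self.pending = {1, 2}
        self.results = {}
        return self.generation   # callers tag replies with this

    def receive(self, gen, req_id, payload):
        # A reply from an earlier round, or from a request that has
        # already timed out this round, is ignored.
        if gen != self.generation or req_id not in self.pending:
            return "ignored"
        self.pending.discard(req_id)
        self.results[req_id] = payload
        return "accepted"

    def timeout(self, req_id):
        self.pending.discard(req_id)   # proceed without this reply

    def ready_to_process(self):
        return not self.pending

m = TwoRequestRound()
g1 = m.send_both()
m.timeout(2)                          # request 2 timed out
r1 = m.receive(g1, 1, "reply 1")      # normal reply
ready = m.ready_to_process()          # process with what we have
g2 = m.send_both()                    # next round begins
r_late = m.receive(g1, 2, "late 2")   # straggler from round 1
```

The eleven enumerated orderings collapse into two rules: discard a pending id on receive-or-timeout, and reject anything whose generation or id no longer matches. Adding a third concurrent request then only grows the pending set, not the case analysis.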