Local State Is Poison (2012)

im3w1l · on Dec 27, 2013

I don't understand anything this article is saying. What is RDP? What are "live programming, open extension, and metacircular staged programming. "

What, and why would you want to upgrade during a loop?

>Those concepts are replaced by discovery – potentially in an infinite graph of stateful resources

What?

>To achieve large scale, robust, resilient, maintainable, extensible, eternal systems,

So many buzzwords...

Could someone give an example with before/after-pseudocode. I understand neither what the problem is nor the how the proposed solution is expected to solve it.

stcredzero · on Dec 27, 2013

What, and why would you want to upgrade during a loop?

VisualWorks Smalltalk VMs had this ability, provided you were a bit careful about how you modified the system. The system would update atomically, and if you were doing things like adding instance variables that safely defaulted values, the system could update in the middle of transactions, then complete them with the new code base. There was the caveat the adding an instance variable to a class meant that every instance of the class basically had to be recopied with the new slot, which could potentially pause the VM for a significant span of time while it performed this operation.

There are embedded RTOS that have similar capabilities. Sometimes, you just want to get very close to zero downtime.

arethuza · on Dec 27, 2013

To be fair, there is an "About RDP" link on the page:

http://awelonblue.wordpress.com/about/

mattmanser · on Dec 27, 2013

I personally find it extremely telling that there isn't a single line of code in the buzzword filled page.

The entire tree concept he is talking about seems like a misplaced faith that you can achieve that kind of separation.

My gut says that in moderate to large sized programs so much state would end up under the root, just out of laziness, time constraints, over-complexity or lack of programmer skill, that you'd have the worst of all worlds.

I just caught the end of the global variables era and have worked on a few programs where the state of things can get modified anywhere.

It is not pretty.

jamii · on Dec 27, 2013

While David Barbour does tend towards thinking about loud a great deal before coding (which is certainly sensible), he has also previously implemented these ideas in Sirea (https://github.com/dmbarbour/Sirea) and has just released a spec and interpreter for his Awelon language - http://awelonblue.wordpress.com/2013/12/25/awelon-project-pr...

kbenson · on Dec 27, 2013

I've worked on similar code. Global state which is sometimes twiddled directly in the main loop, other times functions are called that take no arguments and return nothing, but they compute something and set global variables.

Not pretty is an understatement. It's ugly at best, and rage inducing most the time.

dllthomas · on Dec 27, 2013

It helps when you can use the type system to restrain what can touch what. I've been doing this in some of my high performance C, where some global mutable state is unavoidable - passing around empty structs to indicate context. It doesn't guarantee I don't violate my rules, but it more likely I notice when I do.

kbenson · on Dec 27, 2013

It's almost never as bad when you design it yourself. Then the logic and reasons are apparent. Coming into such a system from the outside can be tough though. An explanation of the rationale and pre and post conditions is very helpful, even if all it does is keep a future programmer from trying to circumvent the system and causing the problems it was implemented to avoid.

yalue · on Dec 27, 2013

For me, the problem wasn't only this, but also that RDP is already a well-known acronym for "Remote Desktop Protocol".

One of my personal policies is to not introduce any new acronyms in the work I do, and especially don't redefine ones that are already familiar in the programming field. Just come up with a proper descriptive name, even if it's slightly longer. Let the acronym follow once the name is already known.

stcredzero · on Dec 27, 2013

A friend of mine who worked at the EPA came from a culture where they liked to stomp on existing computer field acronyms. I even once worked on a big company system called IO.

yalue · on Dec 27, 2013

I developed my distaste for pointless acronyms after working at a large company that had little regard for sensible terminology and would tack acronyms onto everything. Plenty of acronyms not only conflicted with established acronyms in the field, but also with others within the same company.

Seeing as how your friend worked for the EPA, maybe I shouldn't be surprised that this phenomenon often seems to surface within organizations whose name is typically given in the form of an acronym...

lhgaghl · on Dec 27, 2013

> What, and why would you want to upgrade during a loop?

See Erlang for a trivial example.

Udo · on Dec 27, 2013

The alternating aversion and draw of global state seems cyclical to me, and in many case the distinction is in name only. For example, JavaScript's scoping - even in code without "global" variables - is global-like in that the parent context is always inherited by sub-objects. Also, you couldn't invoke any of the built-in objects and functions without them being global in some fashion. It's in fact one of the reasons why it's so much fun to work with function objects. Yet, many JS programmers balk at the notion of globality whenever it's expressed explicitly.

One of the reasons global state gets a bad rap is because we're always trying to minimize side effects, and that's a worthy goal. The idea not only makes sure the program's components test well individually, it also enables better component re-use (although I'm still convinced that re-use without refactoring is mostly a myth in practice). However, in striving to ban explicit global state we have developed an astonishing array of cruft and complexity which counteracts the effects we wanted to achieve in the first place.

I agree with the article that well-defined and clean global data makes a lot of code easier to handle and it also eliminates unnecessary work, both on the human and the machine side. Of course, this idea breaks down again when the global substrate becomes muddled and structurally broken. At which point we've come full circle.

In my opinion a mixed approach is probably advisable for most projects and in fact, that's the way we already do things in many cases, even if we're forbidden to use the actual phrase "global variables". Judicious use of both paradigms yields the best results in my opinion. Maybe it's time to actually start calling global state by its real name, without expecting to be stigmatized for it.

jamii · on Dec 27, 2013

I think that a big part of the problem is pervasive access to global state. Given a block of code, it's impossible to tell what effect it (or its sub-calls) might have on the global state.

A common design pattern in clojure is to have a single massive data-structure that represents all application state, but to take it apart recursively whilst updating so that each function is only passed the pieces of state that are relevant. That way you can look at a function and immediately know what it can and can't read/write out of the global state.

A nice way of doing this in an imperative language might be to pass bits of the global data structure by reference (ala lenses) so that a given function can only read/write state contained in its arguments. The data is still global in the sense that it is all accessible from the root object and is not encapsulated, but access can be restricted on a per-function basis.

Udo · on Dec 27, 2013

> Given a block of code, it's impossible to tell what effect it (or its sub-calls) might have on the global state.

That's true. The equivalent effect in the "all local" paradigm is local objects the state of which can't meaningfully be understood or manipulated by neither the programmer nor other objects, leading again to unexpected behavior that is painful to track down.

A lot of the precautions that are common sense when working with global data are already instinctively followed (or at least understood) by most programmers I think. In effect the lenses are already working when there's an understanding when and where manipulation occurs. This mostly coincides with the notion that complex data should be manipulated by a well-defined model and wherever sensible there should be only one mechanism for doing it.

mercurial · on Dec 27, 2013

> That's true. The equivalent effect in the "all local" paradigm is local objects the state of which can't meaningfully be understood or manipulated by neither the programmer nor other objects, leading again to unexpected behavior that is painful to track down.

  def foo(x):
    x_plus_one = add_one(x)
    x_plus_one_times_two = times_two(x_plus_one)
    return x_plus_one_times_two

Here is a function with some local state. Which part do you feel "can't meaningfully be understood or manipulated by [...] the programmer"? I'm not trying to bait you, but I'm struggling to grasp the argument you are making.

lomnakkus · on Dec 27, 2013

Immutable local state is (as you've observed in that example) harmless, but once you start mutating closed-over state, you've given up completely on referential transparency[1] which gives you a lot of power when reasoning about what code does (and doesn't!) do.

[1] https://en.wikipedia.org/wiki/Referential_transparency_%28co...

mercurial · on Dec 27, 2013

Certainly, but you don't even need closures for this:

  def foo():
    a = []
    append_foo(a)
    return a

  def append_foo(a):
    a.append('foo')

But then I wouldn't call this "local mutability", in the sense that you modify your list in a different part of the code. Not a problem in a trivial example like this, definitively an issue when the actual action performed is ten levels deep.

davexunit · on Dec 27, 2013

I think closed-over mutable state certainly has its place. For example, delay and force in Scheme. Mutable local state is used to memoize the result of the delayed procedure. Of course, it would be terrible for a large program to keep all of its state within a closure. You eliminate the benefits of live-coding from your REPL at that point since you can only directly affect the top-level environment.

Udo · on Dec 27, 2013

Of course it's always possible to construct an infinite number of neat little functions that are not like that. However, I remember lots of times when I had to fight with some external libraries that had exactly the problem I'm describing. I don't quite understand how your illustration is relevant. The issue is not that simple and well-defined functions can't have any internal state that is easy to follow. I didn't argue this doesn't exist. I said that many times it's not what you find out there.

Abstractly speaking, when I say "A often has the property B", you can't meaningfully counter that with "you're wrong, here's an example where !B".

mercurial · on Dec 27, 2013

I think I have misread your comment. I thought you were making sweeping statements about local state in general. That said, I'd still appreciate an example. I haven't noticed any special issues wrt to understanding programs made of immutable-from-the outside objects or local variables.

usea · on Dec 27, 2013

Hi. I'm sorry to be pedantic, but the confusion here stems from your use of "equivalent" in responding to a quote that used "impossible." You never said "often," but you did imply you were describing an also-impossible thing.

Fortunately the confusion is cleared up now.

Gravityloss · on Dec 27, 2013

In a way, traditional static memory style C does this.

You can't return locally created variables [beyond primitives].

Beyond simple primitive reading/returning functions, you often create some arrays or structs, then pass a pointer to the necessary arrays and structs to your functions, so it can read and write them.

This way the functions can stay pure.

chas · on Dec 28, 2013

Taking pure to mean referentially transparent, I'm not sure I understand your point here. If you are mutating a statically allocated struct or array passed by reference, doesn't that have a very similar effect to global variables? It makes the dependence explicit rather than implicit which is useful, but it is still effectively global state.

davexunit · on Dec 27, 2013

>For example, JavaScript's scoping - even in code without "global" variables - is global-like in that the parent context is always inherited by sub-objects. Also, you couldn't invoke any of the built-in objects and functions without them being global in some fashion.

This is just free vs. bound variables. You're absolutely right that it would be a pain to write things such that there are no free variables from parent environments. However, those functions aren't necessarily from the top-level (global) environment.

>However, in striving to ban explicit global state we have developed an astonishing array of cruft and complexity which counteracts the effects we wanted to achieve in the first place.

I think it's important to avoid global, mutable state in most cases. Relying on functions and constants from the top-level environment is fine, but storing a program's state in the top-level environment is just asking for a whole lot of painful debugging.

justincormack · on Dec 27, 2013

Well JavaScript is not necessarily the best example. Lua gives you access to the environment a function is in (_ENV in Lua 5.2), which gives you control of the global.

w_t_payne · on Dec 27, 2013

I largely agree with the poster, although I would couch the argument in different terms, since I think the global-vs-local dichotomy might be an orthogonal matter.

Our brains struggle to reason about how state evolves over time. Add in concurrency, and the problem easily becomes intractable. On top of this, testing stateful components is burdensome.

So, the state needs to be kept as separate as possible from the complex algorithmic logic of the application, so that the state-handling-parts can be kept as simple as possible, and the complex parts can be kept as easily-testable as possible. If this means that the state is handled globally, then fine, but it is not really about where the state is held, but rather about how easy is it to reason about and test.

My rule of thumb is this: we should be able to test our complex mathematical and algorithmic components as stateless (pure) functions, independently of any stateful parts of the application. The remaining stateful parts of the application should have a simple and well understood lifecycle, preferably well away from any concurrency, and with tightly controlled and documented state transitions. (OOP is handy for this, although it must be kept on a tight leash).

hexagonc · on Dec 27, 2013

This seems kinda interesting. I don't know if it would solve all the problems of global state but it would definitely make serialization of the program state simple. It wouldn't even be that difficult to implement in a language that has a LISP-like syntax.

A first stab at the "tree-shaped resource space" referred to in the article would be the abstract syntax tree of the program itself. Each node would have a unique URI, which can be a physical directory path on a filesystem or can be stored in a database structure. Every local variable would be defined by a path in a flat namespace. The "parent" directory of the local variable would be the function it is defined in. Security rules can be created that simulate many of the features of variable scoping rules. The most basic rule, that variables are only visible within the scope of their parent function simply means that the only variables that can be referenced within a function are those within the same directory. Again, none of this seems too difficult to implement especially if your language uses a LISP syntax.

I'm tempted to implement a toy version of this, if for no other reason, because I've been wondering about good ways to serialize the program state of a DSL that I've been working on. Performance seems to be the big problem with using a database or filesystem. A global map of URI's (that is easy to serialize) with some sensible access/permission strategy doesn't seem too bad and could be transparent to the developer.

notacoward · on Dec 27, 2013

I wrote about the related idea of implicit vs. explicit state back in 2005.

http://pl.atyp.us/wordpress/index.php/2005/07/explicit-state...

Briefly, the kind of global state that's needed for debugging should be easy to find "from outside" - which precludes local variables along with other common idioms. It might still be distributed, and that can still be problematic, but the key point is that many ways of avoiding global state are worse than what they avoid. Global state itself is not the problem; inadequately contained or constrained changes to it are, and there are other solutions besides elimination.

jamii · on Dec 27, 2013

I like that explanation. I may link to your post in future rather than dmbarbours as it requires less background understanding.

sebastianconcpt · on Dec 27, 2013

When you invoke a object class to create an instance, you are invoking a global shared, so all things with its merit.

“local state is good, global shared state is bad” and all those (very easily over)simplistic kind of thoughts are like pain killers. They might alleviate you in a moment of affliction but they can be also be addictive beyond the point of benefit. In that regard, yes, something could be poison.

Your line of thought here will make you converge to invigorate some kind of proceduralism. Sorry I don’t know what your domain problem is but you seem to be experiencing an object oriented overhead that you feel like starting to hurt.

You can go ahead and proceduralize things (functions against a remote datastore) but I wouldn’t be so fast in questioning the object design fundamentals. I'd try harder* to remove the original painful overheads or whatever real pain is in your design.

*by harder I don’t mean to be muscular or that you aren’t paying effort to it. Harder could mean to do something as easy as asking to hacker friends to use their fresh unbiased view for a problem/code review.

Listen all, pay attention to some, then ignore everybody (including me)

judk · on Dec 27, 2013

He isn't saying that all state should be accessible to all functions, he is saying that all state (including subtle state like call stacks) should be colocated in a data store.

The data in the store can still be protected (with access tokens or existential types or whatever) so that an item can be only accessible by parts of the code that have the "key".

davesims · on Dec 27, 2013

I think it's a bit of a category mistake to call filesystem or database data 'global state' in this context. Virtually every application has external persistence of some kind and some way of referencing that persistence. Technically those references -- variables or classes that manage connection pools or IO utilities, etc. -- can be called 'global state', but that's generally not what the CS literature is talking about when it says 'avoid global state'. The Evil Global State of CS lore generally refers to globally accessible static or class-level values that refer to in-memory structs, objects or values of some kind, in the stack or heap, rather than external persistence. Filesystem and database access is usually taken for granted, and is ceded as the unavoidable level of 'global state':

http://c2.com/cgi/wiki?GlobalVariablesAreBad

State, in general, at any scope, can make things difficult no doubt. But global state is classically bad because it's hard to reason about across large chunks of distributed code, pollutes namespaces, creates concurrency nightmares, etc. Reducing the scope of state to a manageable range of, say, less than a dozen lines of code, into short-lived references, is clearly far better than the alternative and a reasonable approach in the vast majority of cases. Calling it 'poison' ratchets the rhetoric way beyond the gravity of the problem. No, local state is not considered harmful.

What it seems OP is really talking about in practical terms is pure stateless programming, where the application has no implicit or explicit references to a value whose authoritative data resolves in main memory. If you were to tell me the only state you have in your application is Filesystem or database data, "just beyond the edges of our program logic," I'd say you'd basically achieved the fabled 'stateless' programming ideal, long held as a kind of Holy Grail of functional application development, and as OP points out, that's not often achieved even in the strictest functional environments.

I don't want to diminish the points made, the article was instructive to me as yet another anecdote about the perils of shared mutable state at any scope. But the fundamental principle, that one should avoid shared mutable state as much as possible -- which is the upshot of the essay -- has been axiomatic for quite some time.

jamii · on Dec 27, 2013

> If you were to tell me the only state you have in your application is ... "just beyond the edges of our program logic," ... that's not often achieved even in the strictest functional environments.

It's actually a pretty common design pattern in clojure to keep all application state in a single datastructure (eg http://www.chris-granger.com/2013/01/24/the-ide-as-data/ http://thinkrelevance.com/blog/2013/06/04/clojure-workflow-r... http://channel9.msdn.com/posts/Rich-Hickey-The-Database-as-a...).

> But the fundamental principle, that one should avoid shared mutable state as much as possible -- which is the upshot of the essay...

I think you missed the point. The OP is arguing that even non-shared mutable state should not be encapsulated away but should be accessible from some root data-structure. That way you can eg serialise the whole state of your program and restart it elsewhere or traverse the state with debugging and monitoring tools.

He points out that the traditional evils of global state (unrestrained mutation, non-reentrant code) have been solved in filesystems and databases and that those solutions could equally be applied to keeping state in-memory.

In other words, separate data from logic and keep all of your data in one place (whether that be a database, file-system or some well-controlled in-memory structure).

judk · on Dec 27, 2013

You missed the point, sorry. Awelon is about moving data like call stacks out of memory and into the database.

The contribution isn't the non-novel insight that mutable state is bad, it is that Awelon is attempting to actually solve the problem of how to remove "program state" from a program.

viraptor · on Dec 27, 2013

Sounds like what lots of telcos do with dap/ldap already. And there are some insanely fast in-memory implementations of it.

Maybe not to eliminate the local state itself, but definitely for the organisation / state sharing / layered security / persistence and many other things he listed.

jamii · on Dec 27, 2013

It's also the standard way to architect web apps, with stateless logic in the servers connecting to the stateful database.

What's interesting about the OP is the idea of making that the only source of state, so that every other language construct is a pure function of its inputs.

dllthomas · on Dec 27, 2013

An interesting approach might be keeping state global, but providing projections through which certain parts of the code must access that state.