Continuations, GC, and Threading

Tuesday 23 May, 2006, 12:01 PM

James Robertson posted this reply, "Trust the GC", to my recent post on continuations. Apparently he almost completely failed to understand what I wrote.

I wrote and posted an extensive reply to his blog comments. Once his site had accepted the comment I deleted the draft. So when I went back today to see if he had replied, I was a little annoyed to see my comment wasn't there. (Assuming he's not actually censoring me, this doesn't exactly speak volumes for the quality of his server implementation. I mention this because he cites his own server implementation in his arguments... Although to be fair, it's a common problem - this is one reason why I don't like blog comments.)

Anyway, since replying on his site is apparently not an option for me, I'm replying here instead. I'll take the main points one by one.

Not All Resources are Memory

I had drawn attention to the fact that when using continuations to allow function execution to span multiple web page round trips, resource handling is easier to get wrong. His reply was:

"I don't know why this would be a problem, given a halfway decent GC system. The stale session will eventually time out, and take any lingering state down with it."

By the time the session times out it's far too late.

I can only assume that for Robertson, the term 'resource' means something much narrower than it does for me. I consider things like sockets, file handles, lock ownership, database connections, and transactions all to be forms of resource. No competent developer would allow such resources to be held around until session timeout, so presumably Robertson didn't realise I was talking about this sort of thing. So James, if you're reading this, please re-read my original article substituting whatever term you deem fit for 'resource' and see if you still disagree with me.

Garbage collection is a notoriously bad way of dealing with these kinds of resources. And just to be clear, this is not a quality issue: any readers tempted to offer a patronising comment along the lines of "well if you used a half-decent GC instead of this crappy .NET stuff you wouldn't have this problem" would be missing the point. It doesn't matter how good your GC is, and Robertson nails the reason: the GC can't do anything about the resources until the session is torn down. (Or they become unreachable. But remember, my point here is that because continuations enable scopes to span multiple user requests, it becomes all too easy to peg the reachability of objects to the session lifetime.)

I hope it's obvious that it's almost always a design error to try and keep hold of scarce or unique resources across multiple end user round trips. I think it's also clear that the continuation style doesn't force you into making this mistake. My point was that it makes it easy to make these mistakes - so much so that you could do it by accident. In the traditional web app model, you would never accidentally start a transaction in one request, carefully store it in the session, and then retrieve it in the next request. But with the continuation model, you do that with no effort whatsoever, and in a way that doesn't stand out as obviously wrong.
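
To make the accidental case concrete, here is a minimal Python sketch that uses a generator as a stand-in for a continuation. Everything in it is hypothetical and invented for illustration: `FakeTransaction` plays the part of a scarce resource, `checkout_flow` is the function that spans two requests, and the `sessions` dictionary stands in for session state. No real framework is implied.

```python
class FakeTransaction:
    """Hypothetical stand-in for a scarce resource such as a database transaction."""
    open_count = 0  # how many transactions are currently open

    def __enter__(self):
        FakeTransaction.open_count += 1
        return self

    def __exit__(self, exc_type, exc, tb):
        FakeTransaction.open_count -= 1
        return False


def checkout_flow():
    """One 'function' whose execution spans two web requests."""
    with FakeTransaction():
        # Each yield sends a page to the browser and suspends until the next request.
        address = yield "page 1: enter address"
        yield f"page 2: confirm delivery to {address}"


sessions = {}  # session id -> suspended flow (the 'continuation')

# Request 1: start the flow and park it in the session.
flow = checkout_flow()
first_page = next(flow)
sessions["abc"] = flow
open_between_requests = FakeTransaction.open_count  # still held while the user thinks

# The user wanders off. Nothing is released until the session is discarded.
stale = sessions.pop("abc")
stale.close()  # raises GeneratorExit inside the flow, which unwinds the with block
open_after_teardown = FakeTransaction.open_count
```

The `with` block looks like textbook resource handling, and within a single request it would be. The trouble is the `yield` sitting inside it: the transaction stays open for as long as the suspended flow is reachable from the session.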

In short, certain mistakes are much easier to make, and are much less visible. This is bad. Not terminal, but definitely bad.

Threading

The threading example Robertson describes here appears to have absolutely nothing to do with the issue I raised. He seems to be talking about forking off a separate process to do work subordinate to the current request, blocking completion of the current request until that subordinate work is done.

The issue I raised was something else entirely. I'm talking about a scenario involving multiple user requests one after another. Each request could be handled on a different thread. (Unless you've taken the drastic scalability-limiting step of pegging a session to a thread.) If you've used the continuation model to have one function's execution spanning multiple requests, a single logical invocation of what was written as a single function can end up switching threads several times during the course of execution.

It is possible to write code that will tolerate this, of course. My point is that it adds a kind of complexity you never normally encounter within a single method. Once again, it makes certain kinds of bugs much less visible. And it's a style of programming most people are not used to - even people who are accustomed to writing heavily multithreaded code will not be used to this. It's just not normal to have your function flit from one thread to another during a single invocation.
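
The thread hopping can be sketched in a few lines of Python, again using a generator as an assumed stand-in for a continuation. The names `order_flow` and `handle_request` are hypothetical, and a real server would use a thread pool rather than starting a thread by hand.

```python
import threading


def order_flow(log):
    """One logical invocation; each yield marks the end of one web request."""
    log.append(threading.get_ident())  # thread handling request 1
    yield "page 1"
    log.append(threading.get_ident())  # thread handling request 2
    yield "page 2"


def handle_request(flow, pages):
    # Resume the suspended flow on whichever thread is serving this request.
    pages.append(next(flow))


thread_ids = []
pages = []
flow = order_flow(thread_ids)

# Request 1 is served on the main thread.
handle_request(flow, pages)

# Request 2 is served on a different worker thread.
worker = threading.Thread(target=handle_request, args=(flow, pages))
worker.start()
worker.join()

switched_threads = thread_ids[0] != thread_ids[1]
```

Two adjacent statements in `order_flow` run on different threads, so anything the function had stashed in thread-local storage during the first request would not be visible when it resumes for the second.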

The Back Button

Robertson says that handling the back button was one of the things that this continuation approach is designed to solve. I'm prepared to believe that - well-intentioned but misguided attempts to solve problems are pretty common in the computer industry. And when they wrap essential complexity in an inappropriate but appealingly simple abstraction, these bad ideas are often rather popular.

I'm not claiming that it's impossible to handle the back button this way. I'm just saying it's a bad idea, because you need to be able to write code that can tolerate being wound back and forth at the user's whim.

As someone emailed me to point out, this is not a problem if you're writing pure functional code. Code that has no side effects can easily tolerate being executed multiple times, and it doesn't matter if execution proceeds some distance down a certain path and then backtracks and goes elsewhere. As long as your code never actually does anything, this continuation-based approach may actually make your life easier. (Although perhaps not your users' lives, as we'll see shortly.)

However, consider the example I gave - some sort of web-based ordering system. When processing orders, it's usually considered important to keep some kind of persistent record whenever an order is placed. Side effects are a fact of life in any interesting web application. So purely functional code might not actually be an option. Moreover, it might not be desirable for the end user.
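
As a rough illustration of why side effects and rewinding don't mix, the Python sketch below models a continuation as a closure that can be invoked more than once. The `orders` list is a hypothetical stand-in for a persistent order log, and `make_confirm_step` is an invented name.

```python
orders = []  # hypothetical persistent order log


def make_confirm_step(basket):
    """Return a re-invocable continuation for the 'confirm order' page."""
    def confirm():
        orders.append(list(basket))  # side effect: record the order
        return f"order placed for {len(basket)} item(s)"
    return confirm


confirm = make_confirm_step(["book"])

first = confirm()   # user clicks 'confirm'
second = confirm()  # user presses back, then clicks 'confirm' again

duplicate_orders = len(orders)
```

Resuming the same continuation twice records the order twice. Guarding against that is perfectly possible, but the guard is extra code that the continuation abstraction does nothing to remind you to write.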

There's a problem with this continuation-based approach that's more insidious than the technical issues. It might be that you can 'eliminate problems' in the sense that there are strategies you can use to avoid the technical pitfalls, but you might end up with a really crummy interaction model for your web site. For example, suppose I go to my 'basket' view in an ecommerce site. Then I follow a link in a new tab. In this new tab I add a bunch more items to my basket. And then, because I hate waiting for web pages to load, I go back to the basket that's still sat there in the first tab and click checkout. What should happen?

The 'logical' thing might be for that first tab to continue from where it left off: to let me purchase everything that was in the basket at the point when I showed the page. I'm sure there are some developers who can convince themselves that this is The Right Thing - after all, this way we'll be buying everything that's shown on that web page right now. And maybe that's what'll happen - the web server will retrieve the relevant continuation, and in its scope there will be the old basket contents, and we go from there.

But as a user, I'd call that the wrong thing to do. I want side-effects. Consider a real shopping basket. I'm in the checkout queue, and then realise I want one more thing. I fetch it, and then rejoin the queue. My basket's contents don't revert just because I took an eccentric route through the store. So at a bare minimum I want it to proceed with the new contents. Arguably it should point out to me that my basket contents have changed somewhere near the top of the checkout page, giving me the option to review them if I wasn't expecting that. Reverting to whatever happened to be in some particular scope at the time the page was served up is the wrong thing to do.
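
The two behaviours can be put side by side in a small Python sketch. All names here are hypothetical: `basket` stands in for server-side basket state shared by every tab, and `render_basket_page` returns two alternative checkout continuations so the difference is visible.

```python
basket = ["book"]  # hypothetical server-side basket shared by all tabs


def render_basket_page(current_basket):
    """Serve the basket page and return two possible checkout continuations."""
    snapshot = list(current_basket)  # what the page showed when it was served

    def checkout_from_snapshot():
        # Continuation style: proceed with what was in scope at render time.
        return list(snapshot)

    def checkout_from_current():
        # What the user probably expects: proceed with the basket as it is now,
        # and report whether it changed since the page was served.
        changed = current_basket != snapshot
        return list(current_basket), changed

    return checkout_from_snapshot, checkout_from_current


# Tab 1 shows the basket.
from_snapshot, from_current = render_basket_page(basket)

# Tab 2 adds more items.
basket.extend(["lamp", "pen"])

# Back in tab 1, the user clicks checkout.
stale_items = from_snapshot()
current_items, changed = from_current()
```

The snapshot version buys only the book. The second version proceeds with all three items and sets `changed`, which is the cue for showing a "your basket has changed" notice at the top of the checkout page.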

But...but...

Yes yes. I know that it's possible to avoid all of these problems. I am not trying to claim that the approach is unworkable. So if you're about to email me or blog about workarounds to the issues I've raised here, don't waste your effort - I'm not arguing that any of these errors are inevitable. I'm just saying that this approach is a minefield. And in my opinion, good tools shouldn't explode when you put a foot wrong.

So, just to clarify, my two main issues with this technique are: 1) it makes certain classes of bug much easier to introduce and much harder to spot; 2) execution of code is a very bad abstraction for user journeys because these two things have completely different shapes.

