IanG on Tap

Ian Griffiths in Weblog Form (RSS 2.0)


Recursion Schemes and Functors

Thursday 20 March, 2014, 08:47 AM

Erik Meijer recently tweeted a link to Patrick Thomson’s post on recursion schemes. (Erik was the driving force behind two of my favourite technologies of the last decade: LINQ and Rx. If you’re interested in programming languages, he’s well worth following on Twitter: @headinthebox.) It was an interesting article, but I felt it had a small issue: it conflates a very useful idea (recursion schemes) with some related but not strictly necessary Haskell-isms: the Functor type class, and fixed points of recursively defined parameterised types.

I only dabble in Haskell, so it’s possible I’ve got this wrong. I was going to contact the article’s author (Patrick Thomson) directly for clarification, but couldn’t find his email details, so I decided to post this publicly, with the hope of drawing his attention to it via Twitter. I may yet be embarrassed if he points out a fatal flaw in my argument. Ah well.

Recursion Schemes

The idea at the heart of Patrick’s post seems to be this: we should separate the mechanisms for traversing data structures from the way in which we would like to process the data. His example is a data structure representing a simple syntax tree, and a ‘flatten’ method which walks through an expression tree removing parentheses. Initially, this flattening logic is intermingled with the code that walks the tree:

flatten :: Expr -> Expr
flatten (Literal i)     = Literal i
flatten (Paren e)       = flatten e
flatten (Index e i)     = Index (flatten e) (flatten i)
flatten (Call e args)   = Call (flatten e) (map flatten args)
flatten (Unary op arg)  = Unary op (flatten arg)
flatten (Binary l op r) = Binary (flatten l) op (flatten r)

There’s only one line of interest here, and it’s the version of the function that takes a Paren expression. The rest is essentially boilerplate that describes how to walk through his syntax tree data type.
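To see that function run, here’s a self-contained sketch. Patrick doesn’t show his full type definition in the excerpt above, so the constructor argument types here are my guesses, inferred from the pattern matches in flatten:

```haskell
-- A guess at the plain (non-parameterised) shape of the syntax-tree
-- type; constructor argument types are inferred from flatten's patterns.
data Expr
  = Literal Int
  | Paren Expr
  | Index Expr Expr
  | Call Expr [Expr]
  | Unary String Expr
  | Binary Expr String Expr
  deriving (Show, Eq)

-- The traversal-plus-logic version of flatten, exactly as above.
flatten :: Expr -> Expr
flatten (Literal i)     = Literal i
flatten (Paren e)       = flatten e
flatten (Index e i)     = Index (flatten e) (flatten i)
flatten (Call e args)   = Call (flatten e) (map flatten args)
flatten (Unary op arg)  = Unary op (flatten arg)
flatten (Binary l op r) = Binary (flatten l) op (flatten r)
```

With this in scope, flatten (Binary (Paren (Literal 1)) "+" (Paren (Literal 2))) produces Binary (Literal 1) "+" (Literal 2): the parentheses are gone, everything else survives.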

By the end, he has two functions, topDown and bottomUp, which encapsulate two styles of tree traversal. This enables him to separate out the actual logic—flattening of parentheses—into a much simpler function:

flattenTerm :: Term Expr -> Term Expr
flattenTerm (In (Paren e)) = e  -- remove all Parens
flattenTerm other = other       -- do nothing otherwise

It’s very much easier to understand what this does (strips out parentheses, leaving everything else in place) than it was with the previous example, because it expresses only the important logic. To use it, we combine it with one of Patrick’s traversal functions:

flatten :: Term Expr -> Term Expr
flatten = bottomUp flattenTerm

This walks the syntax tree in a bottom-up fashion, passing each node through that flattenTerm function.

Clever Stuff

There are a couple of particularly clever aspects to the way Patrick achieves this. First, his topDown and bottomUp functions don’t know anything about his specific data type. They could work for any recursive data type. Second, he takes advantage of the Haskell compiler’s ability to generate code that can traverse a recursive data type for you; not only can you separate all the structural boilerplate I showed in the first example from the important logic, you don’t even have to write any of that code yourself.

He then goes on to do one more very clever thing: he recursively constructs a parameterised data type which is, in a sense, infinite. This is an excellent party trick, but it is also where I part company from Patrick a little. My view is that the only reason he needs this clever trick is because he’s been forced into it by a shortcoming of how the compiler generates traversal code. An ideal solution would, in my view, enable you to avoid this. But he seems to regard it as front and centre, rather than an unfortunate workaround:

“That these definitions emerged naturally out of fixed-points and functors, two concepts central to Haskell and to functional programming in general, is doubly amazing.”

To show why I disagree with Patrick on this point, I’ll show two things. First, I’ll show how he was effectively forced into the use of fixed-points as a result of a Haskell feature that suffers (in this particular application) from being a little too general. Second, I’ll show that it’s entirely possible to write equivalents of his topDown and bottomUp functions without using either fixed-points or Haskell’s Functor type class, and that the result is significantly simpler. (There is a loss of generality, but as I hope to show, that generality only adds unwanted complexity in this particular case, and cannot be exploited in practice.)

The Path to Functors and Fixed Points

Patrick starts with the excellent goal of not wanting to write or maintain boilerplate code that walks through his tree-like data structure. In most cases, writing this code is a mechanical process, driven entirely by the structure of the data type. It’s tedious to write, and easy to get wrong, and it essentially duplicates information that was already inherent to the data type’s definition. This is exactly the sort of thing that should be automated.

GHC (the most widely used Haskell compiler) has an optional feature that can write this code for you. This isn’t part of any official specification as far as I can tell; it’s an extension that you need to enable with either a command line switch, or a pragma in your source code:

{-# LANGUAGE DeriveFunctor #-}

With that in place, you can write this sort of thing:

data Foo a
  = NoFoos
  | OneFoo a
  | TwoFoos a a
  deriving (Show, Eq, Functor)

This defines a parameterized type. (Given my normal subject matter, I’m guessing most of my readers have a C# background, so if you’re familiar with C# but not Haskell, then firstly, thanks for reading this far, and secondly, that’s sort of like defining a generic type Foo<a>.) This is a slightly pointless sort of a container—it’s only for illustrative purposes here—that can contain either zero, one, or two items of whatever type you choose for the type parameter. For example OneFoo '!' is a Foo Char containing a single character; TwoFoos True False is a Foo Bool containing two Boolean values.

The interesting part is the final line: the deriving keyword tells the compiler that I’d like it to produce some code for me that makes Foo an instance of various type classes. Only a strictly limited set of known classes is supported here, because the compiler only knows how to generate code for certain types. In this case I’m asking it to generate code for the Show class (which enables any Foo to be turned into a string for display purposes), the Eq class (supporting value comparison) and Functor.

That last one is what enables traversal. Any type f that is an instance of the Functor class provides an fmap function with this signature:

fmap :: (a -> b) -> f a -> f b

The general idea is that a Functor is a container, and that fmap lets you apply a function over all of the items in a Functor. For example, a list is a Functor, so I could use fmap to square all the numbers in a list:

*Main> fmap (\x -> x * x) [1..5]
[1,4,9,16,25]

The function is allowed to change the type if it wants, so I might transform a list of numbers to a list of strings (using the show function available on all types that are an instance of the Show class; all numeric types are in that class):

*Main> fmap show [1..5]
["1","2","3","4","5"]

So broadly speaking, a Functor is a container for some particular type of data, and it might contain any number of pieces of data of that type, and we can use fmap to apply a transformation across all of those pieces of data, producing a new container with the same structure as the original, but holding that transformed data.

So what does that mean for our Foo type? We asked the compiler to provide Functor Foo for us by using the deriving keyword, but this just causes the compiler to generate code that looks more or less like this:

instance Functor Foo where
  fmap f NoFoos        = NoFoos
  fmap f (OneFoo a)    = OneFoo (f a)
  fmap f (TwoFoos a b) = TwoFoos (f a) (f b)

So just as I was able to use fmap to square all the numbers in a list I can now use it to square all the numbers in a Foo, as long as the Foo contains numbers (or contains nothing):

*Main> fmap (\x -> x * x) NoFoos
NoFoos
*Main> fmap (\x -> x * x) (OneFoo 2)
OneFoo 4
*Main> fmap (\x -> x * x) (TwoFoos 3 4)
TwoFoos 9 16

Likewise, I can apply the show function to all the items contained in a Foo, just like I did earlier with a list:

*Main> fmap show NoFoos
NoFoos
*Main> fmap show (OneFoo 2)
OneFoo "2"
*Main> fmap show (TwoFoos 3 4)
TwoFoos "3" "4"

So by putting Functor in the deriving list, the compiler generates fmap for our data type. And this is what Patrick takes advantage of—it enables him to avoid writing boilerplate for traversing his syntax tree data type.

However, there’s a problem: the generated fmap is all very well when our container doesn’t really care what it contains, but what if we want a recursive data type? Patrick’s example is a tree-like structure representing expressions—an expression may contain child expressions. Although his type is conceptually simple, it’s large enough to be slightly unwieldy, so I’ll be using a substantially simplified type that can still illustrate the same idea:

data Bar
  = Node Bar Bar
  | Leaf Int
  deriving (Show, Eq)

This lets us build very simple binary trees, where the leaves always contain a single Int. It’s a recursive data structure—for non-leaves, a Bar contains two child Bar items. Now you might reasonably want to process all the elements in such a tree like we’ve been doing already. Here’s a function that squares everything in a Bar.

squareBar :: Bar -> Bar
squareBar (Leaf i) = Leaf (i * i)
squareBar (Node l r) = Node (squareBar l) (squareBar r)

And here it is in use:

*Main> squareBar (Node (Node (Leaf 1) (Leaf 2)) (Leaf 3))
Node (Node (Leaf 1) (Leaf 4)) (Leaf 9)

That’s lovely, but we’re now back in the world of mingling our traversal with our functionality—squareBar is a mixture of code that knows how to traverse a Bar, and code that performs an operation (squaring the numbers). That’s no good, so how about we just add Functor to the list of classes in deriving? But that won’t work—the compiler complains:

Cannot derive well-kinded instance of form `Functor (Bar ...)'
  Class `Functor' expects an argument of kind `* -> *'

The basic problem here is that fmap (which any Functor must supply) is able to change the type of the data it works on—as you saw earlier, I can transform a list of numbers to a list of strings. But my Bar type takes no type parameters, so there’s no way to produce a Bar that contains strings. So this is not a type that fmap can work for.

Now we could easily get rid of the error thus:

data Bar a
  = Node (Bar a) (Bar a)
  | Leaf a
  deriving (Show, Eq, Functor)

This works, and I can now write squareBar using the compiler-generated fmap:

squareBar :: (Bar Int) -> (Bar Int)
squareBar = fmap (\x -> x * x)

And this works as expected:

*Main> squareBar (Node (Node (Leaf 1) (Leaf 2)) (Leaf 3))
Node (Node (Leaf 1) (Leaf 4)) (Leaf 9)

However, this doesn’t really solve the problem I want to solve: fmap can only operate on the values stored in the leaves. What if I want to do something to the non-leaf nodes? For example, going back to my original non-parameterized Bar, I could write this:

leftBar :: Bar -> Bar
leftBar (Leaf i) = Leaf i
leftBar (Node l r) = Node (leftBar l) (Leaf 0)

This walks the tree, and, rather arbitrarily, for each non-leaf node it replaces the right-hand child with a leaf with a value of 0. Here it is in action:

*Main> leftBar (Node (Node (Leaf 1) (Leaf 2)) (Leaf 3))
Node (Node (Leaf 1) (Leaf 0)) (Leaf 0)

This is obviously pointless, but the general idea—being able to transform and possibly replace any node rather than just the leaves—is useful. Indeed, that’s exactly what Patrick’s doing in his example: he’s stripping out certain nodes (parentheses) from the syntax tree.

The problem with my parameterized type is that the generated Functor code targets the wrong thing: it lets me transform the numbers in the leaves, not the nodes themselves. In some situations the leaf values might be exactly what I want to change, but not here.

And this is where Patrick starts jumping through hoops. Rather than using the type parameter for the value in the leaves as I did, he uses it for the children. Here’s how that would look with my type:

data Bar a
  = Node a a
  | Leaf Int
  deriving (Show, Eq, Functor)

Now this is kind of weird: this type appears to say that non-leaf nodes no longer have to contain two Bar children. The children are of whatever type is supplied as our argument. But we do actually want those to be of type Bar, because this is, after all, supposed to be a simple tree-like recursive data structure. But that’s no longer clear at a glance.

The compiler may be able to do more work for us, but the price is a loss of clarity.

For our type to work as intended, we want the type argument for Bar to be Bar. But we can’t just say Bar Bar, because that second Bar itself needs a type argument. So how about Bar (Bar Bar)? Well again, we’ve got a problem because that last Bar needs a type argument too. So you want a sort of infinite type: Bar (Bar (Bar (Bar …etc))).

Fortunately, Haskell is a sufficiently clever language that you can create such a type. (The resulting infinite type is the “fixed point” Patrick refers to.) It is possible to define some type X such that X is synonymous with Bar X. This implies that Bar X is synonymous with Bar (Bar X) and also Bar (Bar (Bar X)) and so on. This in turn means that if you have a Bar X, then its children in non-leaf nodes are also of type Bar X: we can have the recursive data structure we want.

Patrick achieves this by writing what is effectively a Y combinator in the type system. Here’s something equivalent to what he does (although he calls it Term):

data Mu f = In (f (Mu f))
out :: Mu f -> f (Mu f)  
out (In t) = t

If we then write Mu Bar, that turns out (for slightly brain-melting reasons) to denote a type which has the characteristics I described for “some type X” a few paragraphs ago.

Where does this get us? Well it enables us to use a parameterized definition of Bar in which the children of non-leaf nodes (rather than the leaves) use the type parameter while still being of the same type as their container. This in turn means that the compiler-supplied Functor implementation now works over the nodes rather than the values in the leaves.

However, it’s all now rather inconvenient. Even if you can take the fixed point of an infinitely recursive type definition in your stride, working with trees becomes a good deal more cumbersome, because we have to build everything via the In constructor in order to use the fixed point type. So instead of:

(Node (Node (Leaf 1) (Leaf 2)) (Leaf 3))

we must now write:

(In (Node (In (Node (In (Leaf 1)) (In (Leaf 2)))) (In (Leaf 3))))

These In functions tend to litter our code, as does the related out function when we want to work with the information in our tree. And it’s all because Functor demands that its type argument is itself a parameterized type. And the only reason it requires that is to allow the function we pass to fmap to change the type as part of its transformation. But if you’re still keeping up, you’ll notice that we can’t actually take advantage of that! The whole point of jumping through this fixed point type hoop was to enable our Bar to contain children of type Bar, so in practice, the first argument to fmap will always be (Mu Bar -> Mu Bar). All this complexity arises because of the generality of fmap, generality that we cannot in fact exploit.
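To see concretely that the fixed-point version compiles and that fmap now operates on nodes, here’s a self-contained sketch. (The names tree, zeroRight and leaves are mine, purely for illustration; leaves exists only so we can inspect results without needing a Show instance for the infinite type.)

```haskell
{-# LANGUAGE DeriveFunctor #-}

-- The parameterised Bar, with the type parameter used for children.
data Bar a
  = Node a a
  | Leaf Int
  deriving (Functor)

-- The fixed-point type and its unwrapper, as above.
data Mu f = In (f (Mu f))

out :: Mu f -> f (Mu f)
out (In t) = t

-- (Node (Node (Leaf 1) (Leaf 2)) (Leaf 3)), with the In clutter.
tree :: Mu Bar
tree = In (Node (In (Node (In (Leaf 1)) (In (Leaf 2)))) (In (Leaf 3)))

-- Any function we can usefully hand to fmap here is necessarily
-- of type Mu Bar -> Mu Bar, as the main text argues.
zeroRight :: Mu Bar -> Mu Bar
zeroRight (In (Node l _)) = In (Node l (In (Leaf 0)))
zeroRight other           = other

-- Collect leaf values left-to-right, so results can be inspected.
leaves :: Mu Bar -> [Int]
leaves (In (Leaf i))   = [i]
leaves (In (Node l r)) = leaves l ++ leaves r
```

Here fmap zeroRight (out tree) applies zeroRight to the root’s two immediate children only, so leaves (In (fmap zeroRight (out tree))) yields [1,0,3]: the left child had its right-hand side zeroed, while the root’s own right child was left alone.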

A Simpler Approach

Instead of fmap’s too-general signature:

fmap :: Functor f => (a -> b) -> f a -> f b

all we really need is this:

efmap :: (a -> a) -> a -> a

Or to put that back into the context of our example, when we’re walking a tree, we’re always going to be mapping nodes onto nodes. In my case, that means a Bar is always mapped to a Bar; in Patrick’s example, that means an Expr is always mapped to an Expr. This is always the case in practice for these examples—when using fmap in this way, its two type arguments (a and b) will always refer to the same type.

I’ve called my degenerate fmap efmap, because I think a functor that maps onto the same type as its input is called an endofunctor. However, I’m not totally sure about that—I’ve read in a couple of places that all Haskell functors are technically endofunctors. I don’t understand the arguments that lead to this conclusion, so I’m evidently not yet enlightened; perhaps this is the wrong term to use here, but I’ll stick with it for now.

We could define our own class for this:

class Endofunctor a where
  efmap :: (a -> a) -> a -> a

This is just a degenerate version of Functor, in which two type arguments have become one. This simplification means that an Endofunctor is not required to be a parameterised type, unlike a Functor. As far as I can tell, Haskell does not define any such class itself. I’m not entirely sure why, which leads me to suspect I’m missing something. But as I’ll show, this does appear to lead to a simpler result.

Of course, Haskell can’t auto-generate an implementation of Endofunctor for me, but it’s easy enough to write one for my Bar type. (I’m reverting to the original non-parameterized version, by the way.)

instance Endofunctor Bar where
  efmap f (Leaf i)   = Leaf i
  efmap f (Node l r) = Node (f l) (f r)

And then, instead of Patrick’s topDown and bottomUp (modified to use my Mu instead of his Term):

topDown, bottomUp :: Functor f => (Mu f -> Mu f) -> Mu f -> Mu f
topDown f  = In <<< fmap (topDown f) <<< out <<< f 
bottomUp f = out >>> fmap (bottomUp f) >>> In >>> f

we have these much simpler versions that don’t need to bother with the wrapping and unwrapping necessitated by the fixed point type:

bottomUp, topDown :: Endofunctor a => (a -> a) -> a -> a
bottomUp fn = efmap (bottomUp fn) >>> fn
topDown fn  = efmap (topDown fn) <<< fn

As with Patrick’s code, this illustrates the duality between bottom-up and top-down by “reversing the arrows” but with more elegance, I believe.

Likewise, the code that performs the transformation I’m actually interested in is free from all this messy Mu/Term, In/out stuff:

leftBarNode :: Bar -> Bar
leftBarNode (Node l r) = Node l (Leaf 0)
leftBarNode other = other

And I no longer need to pepper my tree construction with loads of In functions—I can just construct it directly. Here I’m using that leftBarNode with the bottomUp recursion scheme:

*Main> bottomUp leftBarNode (Node (Node (Leaf 1) (Leaf 2)) (Leaf 3))
Node (Node (Leaf 1) (Leaf 0)) (Leaf 0)
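For anyone who wants to try this, here are the pieces gathered into one self-contained file. The only addition is the import that provides the >>> and <<< composition operators used by the recursion schemes:

```haskell
import Control.Arrow ((>>>), (<<<))

-- The original non-parameterised tree type.
data Bar
  = Node Bar Bar
  | Leaf Int
  deriving (Show, Eq)

-- A degenerate Functor: maps a type onto itself.
class Endofunctor a where
  efmap :: (a -> a) -> a -> a

-- Hand-written (no compiler generation for our own class).
instance Endofunctor Bar where
  efmap f (Leaf i)   = Leaf i
  efmap f (Node l r) = Node (f l) (f r)

-- The two recursion schemes, with the arrows reversed between them.
bottomUp, topDown :: Endofunctor a => (a -> a) -> a -> a
bottomUp fn = efmap (bottomUp fn) >>> fn
topDown fn  = efmap (topDown fn) <<< fn

-- The per-node logic; no recursion here, the scheme supplies it.
leftBarNode :: Bar -> Bar
leftBarNode (Node l r) = Node l (Leaf 0)
leftBarNode other      = other
```

Running bottomUp leftBarNode (Node (Node (Leaf 1) (Leaf 2)) (Leaf 3)) reproduces the result above; for this particular transformation, topDown happens to produce the same tree.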

Let’s step back and see where this has taken us.

I have been able to encapsulate recursion schemes in separate functions (topDown and bottomUp) just like in Patrick’s solution, and these are expressed in terms of a type class (Endofunctor, effectively a degenerate form of Functor), enabling each individual data type to define its own structure for traversal purposes. This enables the logic of interest to be separated out from the mechanisms of recursive traversal. It was not necessary to invoke any infinitely nested type magic (no type system Y combinator, no fixed points). This removed all of the associated wrapping and unwrapping clutter, and avoided the cognitive overhead of a recursive data type not being self-evidently recursive. I was able to get away from all that by avoiding the Functor class, which wreaked havoc by being too general for the task at hand.

And that’s why I take issue with Patrick’s conclusion:

“The fact that we can express the duality between top-down and bottom-up traversals merely by ‘reversing the arrows’ that determine our code’s flow, all the while retaining generality and type safety, is nothing short of amazing. That these definitions emerged naturally out of fixed-points and functors, two concepts central to Haskell and to functional programming in general, is doubly amazing.”

The first sentence is true for both Patrick’s and my approach. But I think my code demonstrates that these things do not depend on fixed-points, and while they may depend on endofunctors, they do not depend on functors in the specific guise of Haskell’s Functor type class. On the contrary, fixed points were necessary to work around a problem introduced by the use of Functor and they introduced considerable accidental complexity, as well as reducing the clarity of the data type definition. By avoiding Functor, you no longer need fixed-points, and you can still have all the benefits mentioned in that first sentence.

Not Necessarily an Improvement

All of which is not to say that Patrick’s approach is worse than mine. There are two serious problems with my alternative.

The first, and most obvious, downside is that we lose the support for compiler code generation. By using my own Endofunctor class and its associated efmap, I now have to write the tedious, error-prone traversal code myself. And remember, this is where we came in: one of the principal benefits of Patrick’s approach was getting the compiler to write this code for us.

There’s also the fact that nothing in the world of Haskell knows about this custom Endofunctor class, unlike the extremely widely supported Functor class. So my solution will not play as nicely with the rest of the Haskell world as Patrick’s.

(Both these problems would go away if something like Endofunctor were a standard part of Haskell, and was supported as a code generation target just like Functor. So it’s tempting to say that the real problem here is Functor, but as a mere dabbler in Haskell, I think that’s a significantly stronger conclusion than I’m qualified to draw.)

So the bottom line is that Patrick’s approach is probably worth the pain. But I believe I have shown that the basic idea of recursion schemes is not in fact dependent on either the Functor class or fixed point types.

So my point is not that Patrick’s way of doing this is wrong. My point is that he oversells Functor and fixed points. Fixed points turn out to be a necessary evil imposed by the (in this case) unwanted generality of Functor. It’s still very clever, though.

When to Stockpile Computer Parts?

Friday 17 January, 2014, 08:17 AM

This is the end of the golden age for those of us who like to buy computers.

It’s all Apple’s fault. A significant force at the start of the personal computing revolution, Apple is now instrumental in its destruction. Thanks to the path beaten by the iPhone and iPad, the computer industry now understands how to make phones and tablets that do most of the jobs for which individuals used to buy a computer. This development of ‘just barely good enough’ tablets and phones has meant that demand for new PCs is falling drastically.

This has some negative consequences. Those of us who need (or merely want) a proper computer are going to find that it looks increasingly less like a commodity, and more like an exotic piece of specialist kit. In other words: PCs will get much more expensive. (Falling demand leading to higher prices may seem to contradict some conventional economic ideas, but this is an industry that relies heavily on economies of scale.) This will come as a nasty shock after several decades of decreasing prices and increasing performance.

And just in case it’s not obvious, ‘PC’ includes Macs here. Around a decade has passed since Apple last created Macs with truly original technical designs. The economic logic of building around commodity PC components was inescapable. In 2006 I bought two new laptops, one Mac and one PC, and out of curiosity, I did a little investigation. I enumerated all of the internal components making up each system. The Mac looked much fancier on the outside, but its innards turned out to be very nearly identical to my Dell’s. Apple instead specialised in areas where they could still charge a premium: physical design, the usability of their software, and things that aren’t PCs. Their profits illustrate the wisdom of this strategy.

The fact that Macs use commodity PC components (albeit mounted on Apple-designed circuit boards in beautiful boxes) means that the collapse of the PC market will raise Mac prices too.

Price hikes for personal computers of all kinds are inevitable because it just won’t be economically viable for parts suppliers to produce the necessary components at the scale that has historically enabled them to become so cheap.

It’s Happening Now

If you’re wondering how soon this will be upon us, be aware that Intel has recently decided not to complete a massive new chip factory despite having done most of the building work. Weak demand means that even though they’ve put a lot of money into this site, it’s just not worth the remaining investment it would take to bring the plant online.

If you were thinking of buying or building a new computer in the next couple of years, that factory may well be where its most important components would have been built. But not now.

It’s Not Happening All at Once

In some ways, the change will seem gradual. For one thing, PC components have a surprisingly long life, because today’s high-end parts often become tomorrow’s cheap bits. Designs remain on the market in various guises for much longer than you might guess from the inflated rhetoric about how fast things move in the computing industry. But if you build your own systems, you will probably see it sooner rather than later.

Some things will remain cheap for a good while. Anything needed in either a phone or a tablet will continue to be worth manufacturing in high volume. For example, graphics processors just capable enough to drive a full-size tablet panel with retina resolution will be a mass market component for a long time yet. And there’s not going to be any shortage of cheap ARM-based system-on-a-chip components either. This suggests that laptops will be affordable for a good while, because there will be a cheap way to make them: bolt a keyboard to a tablet. (Although quite what will happen to Intel’s venerable processor architecture in this new world remains to be seen.)

The downside is that over time, it will become very expensive, or even impossible, to buy a laptop whose performance is much better than a tablet’s. And given the average tablet’s performance, that’s not an encouraging prospect.

Server components will probably be less affected. The increasing move of computing power out of the hands of users and into the cloud means that the anti-PC trend won’t have an adverse effect in the world of the server any time soon.

The first problems will hit those of us who like desktop systems, not least because the unique merits of desktop computers are not all that widely understood. If you’re in the know you can, for now, get a substantially better system if you don’t mind it being tied to a desk and cumbersome to move around. The desktop’s advantages can be particularly acute if you build them yourself.

I built a new desktop about a year ago. It replaced a system I built about 4.5 years earlier, which still outperforms a lot of newer computers I’ve had the misfortune to use. If you build your own system, you can build it to last—by choosing carefully, you can create something which, with the odd minor upgrade, will be serviceable for years. But you do that by picking components that are fairly close to the bleeding edge. And that’s where we’re going to see costs rise first.

To be more accurate, what we will probably see is costs failing to come down. There has always been a hefty premium for the very best components at any time, so I tend to pick CPUs and motherboards that are a notch down from the top of the range. This saves hundreds of dollars, with only a very slight reduction in capability. I’m expecting, sadly, that this early adopter premium will become significantly stickier. Soon, saving a couple of hundred will drop you not just a few months behind the curve, but an entire year or more.

At Least Computers Stopped Getting Faster

About the only positive aspect of all this is that we no longer need to upgrade quite so often. For all that futurologists love to talk about Moore’s law, the fact is that performance improvements have very nearly ground to a halt, compared to 15 years ago. Yes, Moore’s law is still in play, but that doesn’t mean much in practice: CPU transistor counts double every couple of years, but this brings relatively small performance improvements.

Moore’s law was never the main force behind the exponential performance improvements we used to enjoy. The two phenomena were merely correlated for years. In practice the improvements came almost entirely from exponential increases in transistor switching speeds. The correlation was down to a common root cause: shrinking transistors. Each new generation of silicon fabrication technology made transistors smaller. This enabled more to fit on a chip (fulfilling Moore’s law) but it also had the happy side effect of letting them work faster. But round about 2003/2004, for various reasons these size reductions stopped producing speed increases. They still enabled Moore’s law to hold, but this has simply demonstrated that Moore’s law wasn’t the main thing delivering ever faster computers. (To be fair, Moore’s law came in handy back in the good times: it enabled us to build in the caches that helped to exploit the speed improvements. Without Moore’s law it would have been harder to make practical use of the increased processing power. So you need both phenomena to drive up performance; Moore’s law on its own doesn’t do it.)

Strictly speaking, Moore’s law is providing more power, but not in a form that provides much of a speed boost in practice, for most applications. The power comes in multi-core CPUs, offering parallel processing capabilities of the kind that was once the preserve of exotic, specialized systems. But a quick glance at the 8 CPU load meters on my computer’s task manager when it’s busy shows me that the vast majority of tasks have no use for this parallelism. Only certain specialized tasks can do anything useful with this particular kind of power, something we’ve known since 1967. Amdahl’s law predicts this problem, and most computing tasks look more like the line at the bottom of the first graph in that page than the one at the top.
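To put rough numbers on that (the figures below are my own illustrations, not benchmarks): Amdahl’s law says that if a fraction p of a task can be parallelised, then n cores give an overall speedup of 1 / ((1 - p) + p/n). A one-line sketch:

```haskell
-- Amdahl's law: overall speedup from n processors when only a
-- fraction p of the work can run in parallel.
amdahl :: Double -> Double -> Double
amdahl p n = 1 / ((1 - p) + p / n)
```

A task that is half parallelisable gains less than a 1.8x speedup from 8 cores (amdahl 0.5 8 is about 1.78), and even a 95%-parallelisable one gets under 6x (amdahl 0.95 8 is about 5.93), which is why most of those 8 load meters so often sit idle.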

Apparently the press hasn’t noticed that the exponential performance improvements which are supposed to be ushering in a ‘singularity’ petered out about a decade ago. Even within the computing industry, an embarrassing number of people have also failed to spot this fundamental fact.

All of this may not exactly sound like a positive—it means computer upgrades are no longer as exciting as they used to be, which in turn is a significant factor behind the collapse in demand that’s causing the whole problem in the first place. We’re learning to make do with less, and the smartphone and tablet revolution is part of that. But it’s also why it’s now entirely possible to make a desktop PC that will be useful for 5 years. (And of course, I realise that by taking advantage of this, I’m part of the problem.)

The extended useful life of a computer means component failure is now a major factor in the need to replace equipment. Two decades ago, computer components may as well have been immortal, because (except for hard drives) they usually became obsolete years before they stopped working. And this brings me onto the question behind this post: should I start stockpiling computer equipment?

The Last Affordable Motherboard

We have a vicious circle. Slowing performance improvements (and, soon, rising prices) will push down demand. This reduces opportunities for economies of scale, so prices will rise. Manufacturers will have less incentive to spend billions pursuing ever more elusive performance gains, so improvements will not only become more expensive, they will be increasingly slight. Add ‘good enough’ tablets and phones to the mix, and all you’re left with is increasingly infrequent computer purchases by software developers and other geeks like me. By that time, computers will be as expensive as any other highly specialized technical equipment.

You know that fully-loaded Mac Pro (the new one that looks like evil R2D2 or a high-tech dustbin) that seems so absurdly expensive? Get used to it—5-digit price tags are the future (although the industrial design is probably all downhill from here).

This suggests that there’s going to come a point where it makes sense to buy enough motherboards and other basic components to last a lifetime before the prices get too high. If the only thing I had to worry about was age-related failures, I could reasonably afford to buy all the motherboards and assorted components I’ll ever need. (I’m assuming here that failure is connected to hours of use rather than total elapsed age, and that a component that has sat on the shelf for 20 years is as good as new. Most electronic failures tend to be physical in nature, caused either by heat cycles or, in the case of moving parts—fans and non-SSD drives—plain old wear and tear. And the relatively short lifetime of SSDs is also directly related to how much you use them, rather than how much time has passed. So excellent shelf life seems like a reasonable assumption, but I can’t prove it of course.)

The only problem with this is that components are still getting better—progress has slowed but it hasn’t stopped entirely. If I had bought 10 motherboards 5 years ago, I’d be annoyed, because things have moved on enough that I really don’t want a computer built around something of that era. But the useful lifetime does keep going up, so sooner or later I’ll buy a brand new motherboard that’s not very much worse than the very best one I’ll ever be able to buy.

I don’t think we’re there yet, and I’m not quite sure when that time will come. I’d like to believe the time to start stockpiling is a good 10 years away. But then in 2003 I naively thought that we probably had a good 10 years of exponential clock-speed-driven performance increase yet to come, when in fact it was already pretty much game over. So I suspect that 10 years is optimistic.

The hour is later than you think.

Ten Years

Tuesday 7 January, 2014, 04:15 PM

This blog has been running for ten years as of today!

I’m only counting elapsed time, of course. It ground to a halt a couple of years ago—there are no entries for 2012! In my defence, my first child was born around that time, and I also moved house twice. Oh, and I wrote a book—a complete rewrite of Programming C#.

(Apparently I didn’t get around to blogging about the book! Oops. Well better late than never. If you happen never to have looked at it, it’s a very different animal from the previous editions of that title. I wrote the book I would want to read, so it’s a book for people who already know how to program, and are looking for a lot of in-depth information.)

Anyway, even though I can’t claim 10 years of continuous blogging—it’s more like about 8.5 years spread across a 10 year interval—I didn’t want to let this anniversary go completely unmarked. So here are some random thoughts inspired by this decimal milestone.

Homebrew Blog Engine

The code that serves up this blog may have enjoyed greater longevity than anything else I’ve written. (I could be wrong—I worked on various embedded systems in the 1990s and early 2000s which might, for all I know, still be going. But it’s certainly the longest running software for which I’ve been responsible over its entire lifetime. I may even add support for comments any decade now.) I wrote it back at the end of 2003, partly as a way of learning about ASP.NET, and partly because I was deeply unsatisfied with the URL handling that most blog engines of the time offered.

I’ve been using the code I wrote back then with only minor modification ever since. It has moved servers a couple of times—initially it was on a shared ASP.NET host, then on a dedicated Windows 2003 server up until last year (!) when I finally moved it over to Azure. There were a couple of minor modifications to upgrade from .NET 1.1 to 2.0, and then to 4.5, but not much else has changed.

Loose Ends

I’ve just been browsing through the archive. (You can see entire years at a time by cropping URLs, by the way, e.g. http://www.interact-sw.co.uk/iangblog/2004/ http://www.interact-sw.co.uk/iangblog/2005/ etc.) I realise I never answered the question I posed in http://www.interact-sw.co.uk/iangblog/2004/05/03/angryspacebaboon so, better late than never: this was an attempt to render something similar to Adobe Illustrator gradient meshes as a triangle mesh in DirectX (with wireframe mode enabled so I could inspect it more closely). At the time I was frustrated by the relatively small repertoire of gradient fill types in WPF (or Avalon as it was called back then). Mind you, more recent versions of XAML have seen fit to reduce the options further still!

Foolish Predictions

I notice that in 2004 I was rash enough to make a prediction. In http://www.interact-sw.co.uk/iangblog/2004/05/20/endofmooreslaw I referred to Craig Andera’s prediction (a year or so before Herb Sutter’s The Free Lunch Is Over article) that the exponential speed increases we’d hitherto enjoyed in computing would shortly be ending. He got that right. With the exception of highly parallelizable work (e.g. video encoding) that can exploit all your cores (8, on my current machines), the speed increase with each new computer purchase has been marginal since that time. Up to around 2003/2004, each new computer was around twice as fast as its predecessor, and you felt a profound difference every time. But these days, a new machine feels like a very minor upgrade. (Strangely, the popular press still hasn’t noticed this. I still see frequent references to exponential speedups, particularly in the context of predicted ‘singularities’ in which AI becomes intelligent enough to build better AIs, and the machines take over. The fact that computers settled into the S-shaped fate of all technologies around a decade ago has done nothing to dampen these exponential dreams.)

But that was Craig’s successful prediction. Mine was that spinning disks, with their very slow seek times, would be superseded by technology that could respond orders of magnitude more quickly, and that this would be game changing. Well, I was maybe as much as half right. I was a relatively early adopter of SSDs, and have considered them a non-negotiable feature of any system I buy for about 5 years now. And they do indeed have seek performance several orders of magnitude faster than conventional spinning disks. However, they don’t seem to have been the game changer I anticipated. It would be closer to the truth to say that they provided one last speed boost that felt similar to what I used to get every two years, back when we could expect ever-faster clock speeds. It was good, but not the fundamental shift I was expecting. And depressingly, there’s nothing on the horizon promising any similar kind of improvement any time soon.

Moreover, storage latency is at least as big a deal as it ever was, mainly because these days, your important storage is probably somewhere else. For code running in the cloud, you’re likely to be using a storage service that’s a network hop away. And even for client-side code, chances are your data’s home is some service out on the internet. These storage mechanisms often have latency that makes 1990s hard disks look fast.

What SSD has provided, the Internet has taken away.

A Personal Favourite

I thought I’d finish by posting a link to one of my favourite posts. I really enjoyed writing this one: http://www.interact-sw.co.uk/iangblog/2004/12/16/movingpictures which is about some of the subtle aspects of high quality video rendering.

Async, await, and yield return

Friday 29 November, 2013, 06:06 PM

Sooner or later, it seems to occur to a lot of C# developers to try something like this:

// NOTE: doesn't work!
public async Task<IEnumerable<string>> GetItemsAsync()
{
    string item1 = await GetSomethingAsync();
    yield return item1;

    string item2 = await GetSomethingElseAsync();
    yield return item2;
}

This basic idea showed up recently in a question on Stack Overflow. I answered it, but since this is a recurring theme, I thought I’d write a blog post.

That code attempts to combine two C# features: iterators, and asynchronous methods. There are similarities between these: both enable you to write straightforward-looking code, which the compiler then tears apart and rewrites into something that would have been at best difficult, and often horribly contorted to write by hand; both let you write methods which return part way through execution, but are able to resume execution later on. However, you can’t use both features in the same method.

Superficially, this seems like a reasonable thing to want to do—an asynchronous method can perform any number of asynchronous operations before eventually producing a value, so surely it’s just a small leap from there to an asynchronous method that can produce several values, exposed as a sequence?

There are two problems here, though. The first is that the attempt shown above is slightly wrong-headed. It misunderstands the nature of the types involved—the method signature is not the right way to express the intended semantics. The second problem is that even if you fix this first problem, the compiler doesn’t actually know how to do what you want. Fortunately, you don’t really need special compiler support for asynchronous lists—it’s possible to use the existing non-list-oriented asynchronous support in conjunction with a library to get the desired effect.

If you’ve read much of my blog over the past year, or if you’ve read the latest edition of my book, Programming C# (which was a complete rewrite, by the way), it probably won’t surprise you that I’m going to suggest solving this problem with Rx (the Reactive Extensions for .NET).

But first, why is that example wrong-headed?

Representing Asynchronous Lists

Let’s be clear about the intent: the code above wants to provide the caller with a sequence of items, but it can’t necessarily produce all the items immediately. It needs to do some work to determine what values to return. As you can tell from the fact that it calls methods that end in Async, and that we’re having to await their results, it might take some time for the information to become available. So this code wants to be able to produce each item when it’s good and ready.

But that’s not what the method’s return type promises:

public async Task<IEnumerable<string>> GetItemsAsync()

This says something subtly different: the method will produce its result—a sequence of items—asynchronously. It sounds like a rather picky distinction when you describe it: returning an asynchronous sequence vs. asynchronously returning a sequence. But it’s actually a rather big difference in practice.

The Task<T> type (regardless of what T may be) represents an operation that produces a single result. When you launch a task, its result is not available until the task completes. And once the task has produced its result (i.e., completed) that means it has finished—it is no longer executing. That’s actually very different from what we’re trying to do here: we want to be able to produce a value, and then maybe another value some time later, and perhaps another value later on, and so on.

So the problem with Task<IEnumerable<T>> is that the work has to be complete before it can give you anything. This does not represent what we need—we want to be able to supply multiple values, on whatever schedule we like.
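To see that concretely, here’s a hypothetical consumer of the Task<IEnumerable<string>> version (using the GetItemsAsync method as declared in the first example):

```csharp
public async Task ShowItemsAsync()
{
    // The await doesn't complete until GetItemsAsync has finished
    // producing the entire sequence...
    IEnumerable<string> items = await GetItemsAsync();

    // ...so by the time this loop runs, every item already exists.
    // There is no way for the consumer to see the first item early.
    foreach (string item in items)
    {
        Console.WriteLine(item);
    }
}
```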

Of course nothing technically requires an enumeration to be able to return items immediately. When consuming code starts retrieving items (e.g., with a foreach loop) the IEnumerable<T> implementation can just block until it has something to return. So it is possible to write an IEnumerable<T> implementation that produces results on its own schedule. And if we’re going to do that, there’s not really much benefit in returning a Task<IEnumerable<T>>—we may as well do this:

// NOTE: also doesn't work!
public async IEnumerable<string> GetItemsAsync()
... same body as before ...

That’s a slightly more appropriate signature (although as we’ll see shortly, it’s not the best approach). But the compiler doesn’t like this either—it won’t let you use async on a method with an IEnumerable<T> return type. And we wouldn’t really want it to—IEnumerable<T> can only be consumed synchronously, so support for an asynchronous implementation would offer limited benefits.

In any case, we don’t need special compiler support to implement an iterator that blocks until it’s ready to produce something—we can just write a synchronous version:

public IEnumerable<string> GetItemsAsync()
{
    string item1 = GetSomethingAsync().Result;
    yield return item1;

    string item2 = GetSomethingElseAsync().Result;
    yield return item2;
}

However, this is unsatisfactory—presumably the reason for attempting to use async and await in the first place was to take advantage of their potential efficiency improvements—they enable us not to have to block. Here, we’ve just given up, and reverted to synchronous code. (A task’s Result property blocks until the task completes.) And even if the compiler was prepared to help us, we’d just run straight into the problem that IEnumerable<T> doesn’t provide a way to consume items asynchronously. Fortunately, there’s a better way.

Instead of using IEnumerable<T>, we can use its push-oriented dual, IObservable<T>. It represents exactly the same underlying abstraction—a sequence of items—but it enables the source to decide when to produce items, rather than letting the consumer be in charge. This makes it the more appropriate of the two representations to use if you want to be able to support asynchronous item production.
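For reference, the interface (along with IObserver<T>, which consumers implement) is tiny. These are the definitions in the .NET System namespace:

```csharp
public interface IObservable<out T>
{
    // The consumer subscribes; the source then pushes items to it.
    IDisposable Subscribe(IObserver<T> observer);
}

public interface IObserver<in T>
{
    void OnNext(T value);          // here's the next item
    void OnError(Exception error); // the sequence failed
    void OnCompleted();            // the sequence has ended
}
```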

I think it can be helpful to think of the relationship between these two interfaces by way of an analogy:

public string Foo()

is to

public Task<string> FooAsync()

as

public IEnumerable<string> Foos()

is to

public IObservable<string> FooSource()

The first method returns a string when you ask for it. Its asynchronous equivalent (the second method) returns a Task<string> instead, enabling it to provide a string when it’s ready. And then we have the sequence-based versions. The third method returns IEnumerable<string>, which provides strings when the caller asks. Finally, we have its asynchronous equivalent, returning IObservable<string>, meaning that it can provide each string when it’s ready.

(Unlike with Task<T>, there’s no standard idiom for naming methods that return IObservable<T>. I’ve arbitrarily appended Source to give it a different name from the one that returns an IEnumerable<T>. Don’t be tempted to put Async on the end by the way—that’s the naming convention for the Task-based Async Pattern, or TAP. It’s likely to confuse people if you return IObservable<T> when they’re expecting a Task.)

This relationship between synchronous and asynchronous method return types tells us that when someone writes Task<IEnumerable<T>>, chances are that the concept they’re trying to express is really IObservable<T>.

So, you might think we could do this:

// NOTE: doesn't work either!
public async IObservable<string> GetItemsSource()
... same body as before ...

This is a reasonable thing to want to be able to do. As it happens, C# doesn’t support it—it only supports void, Task, or Task<T> as return types of async methods. However, it doesn’t really matter—Rx makes it pretty easy to do what we want.

Implementing with Rx

Here’s how to do it:

public IObservable<string> GetItemsSource()
{
    return Observable.Create<string>(
        async obs =>
        {
            string item1 = await GetSomethingAsync();
            obs.OnNext(item1);

            string item2 = await GetSomethingElseAsync();
            obs.OnNext(item2);
        });
}

Rather than relying on compiler support, I’ve simply had to use a library function supplied by Rx, Observable.Create. This takes a delegate as its argument, and you can write that as an asynchronous method if you want. That’s what I’ve done here—it has enabled me to use almost exactly the same code as I wrote in my original (non-compiling) example.

In particular, I am free to use await expressions in the code that produces the sequence’s items. And remember, that was the original goal.

Instead of using yield return, I just call OnNext on the object Rx supplies as an argument. (It provides an IObserver<T>, by the way, which is the counterpart of the synchronous IEnumerator<T>.) Strictly speaking, I should then tell it I’ve reached the end of the sequence by calling OnCompleted before finishing, but Rx detects such sloppiness, and completes the sequence for you.

Would it be slightly cleaner if the compiler generated the extra code here for me? Yes—I’d be able to write this:

// NOTE: doesn't work!
public async IObservable<string> GetItemsAsync()
{
    string item1 = await GetSomethingAsync();
    yield return item1;

    string item2 = await GetSomethingElseAsync();
    yield return item2;
}

Would that be so much more convenient that it’s worth adding a new language feature? Probably not—this is absolutely trivial compared to the amount of heavy lifting the compiler does to enable async and await.

So the reason we don’t have specialised compiler support for asynchronous methods that produce sequences of items is that we don’t really need it. The single-result asynchronous method support combines well with library support (from Rx) to enable a pretty good solution.

I Want my IEnumerable<T>

But what if you really wanted an IEnumerable<T>? Perhaps you want to use async and await, and you want to enable clients that can consume an IObservable<T> to work efficiently with your code, but you also have some existing code that requires an IEnumerable<T>?

Well it turns out you can provide both. Rx provides a very straightforward way to transform an IObservable<T> into an IEnumerable<T>:

public IEnumerable<string> GetItemsAsEnumerable()
{
    return GetItemsSource().ToEnumerable();
}

This may work slightly better than the simple synchronous approach (in which I just used the Result property of the tasks returned by the asynchronous methods), because Rx enables the source to run ahead of the consumer. As soon as code starts to enumerate the contents of the IEnumerable<string> returned by this method (e.g., by starting a foreach loop), the code that generates the items (the asynchronous anonymous method in the last example of the preceding section) will start to run, and it will be free to generate items as fast as it likes, regardless of how quickly the calling code retrieves items. (The IEnumerable<T> implementation Rx supplies here has an internal queue to support this.) So you can potentially get a higher degree of concurrency with this approach than the straightforward synchronous technique. If you’re really lucky, by the time the consuming code finishes processing an item, the next one will already be available.

That said, you’ll get better results if your consuming code understands IObservable<T>, because if the consumer gets ahead of the producer, it would end up blocking a thread if it’s using IEnumerable<T>, whereas with IObservable<T>, you’ll only hang onto a thread for as long as you have productive work to do.
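For example, a consumer that understands IObservable<T> can use Rx’s Subscribe extension method (this sketch assumes the GetItemsSource method shown earlier); no thread sits blocked waiting for the next item:

```csharp
IDisposable subscription = GetItemsSource().Subscribe(
    item => Console.WriteLine("Received: " + item), // runs as each item arrives
    () => Console.WriteLine("Sequence complete"));  // runs when the source completes
```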

Azure Startup Script Admin Detection

Thursday 7 November, 2013, 11:39 AM

Since version 2.1 of the Azure SDK shipped, it has been possible to develop Azure applications without needing to run Visual Studio as an administrator. I used to run as a non-admin user back when Windows XP was still the current version of Windows. If you’ve ever tried that, you’ll know it required some determination—I really hate running anything elevated for everyday work. So the ability to debug Azure code without elevation was a very welcome change.

However, it caused a slight snag. When running the Azure Emulator Express (which is what enables local Azure web and worker role debugging without needing to run as an administrator), it will run all your roles’ startup scripts, including all the ones marked as executionContext="elevated". Of course, since the emulator isn’t running elevated, it can’t run the script elevated. So the script runs, but it doesn’t get the elevation it requests. This can cause these scripts to fail, preventing the role from starting.

For example, I have a web role with a script that installs a particular certificate in the system’s trusted root CA store. Originally, it contained just this one line:

%windir%\system32\certutil.exe -addstore root %~dp0my-root-ca.cer

However, this caused a problem when I switched to the Express emulator—if you attempt to run that command non-elevated, it will fail, returning a non-zero return code. The Azure emulator detects that, so your role fails to start. (Rather unhelpfully, Visual Studio doesn’t tell you that—it just sits there claiming that it’s starting the roles until you give up and click Cancel.)

This particular work—adding the relevant certificate to my computer’s root CA list—only needs to be done once on any particular computer, so I don’t actually need this script to run when I’m debugging. I need to elevate when installing the cert, but since I only do that once, I don’t need to elevate every time I debug.

Ideally, there would be some sort of environment variable passed in to the script that would enable you to detect that you’re running in the emulator. Unfortunately, it doesn’t look like there’s any such thing. But we can instead fix this by detecting whether we’re elevated, and only attempting to install the certificate if we are.

Unfortunately, my first attempt at this didn’t work:

whoami /groups | find "S-1-16-12288" > nul
if "%errorlevel%"=="0" (
    %windir%\system32\certutil.exe -addstore root %~dp0my-root-ca.cer
)

That first line is the commonly recommended way to detect elevation in a .cmd file—you’ll find plenty of examples using this basic technique on the web. It relies on the Integrity Level mechanism that was introduced in Windows Vista. A token will always contain one of four SIDs indicating the integrity level at which it is running. That S-1-16-12288 SID is present when an admin user is running elevated. (The display name for that SID is “Mandatory Label\High Mandatory Level”.)

However, it turns out that Azure gives you more than you asked for. If your script requires elevation, apparently it runs at an even higher level: System. That’s the integrity level used by privileged system services, e.g. a Windows Service running as LocalService.

The script will be able to do anything that it would had it merely been running at the High integrity level, but the common test will mistakenly conclude that we don’t have administrative privileges. The upshot was that my script decided not to do anything, and so the certificate didn’t get installed.

This fixed it:

whoami /groups | find "S-1-16-12288" > nul
if not "%errorlevel%"=="0" (
    whoami /groups | find "S-1-16-16384" > nul
)
if "%errorlevel%"=="0" (
    %windir%\system32\certutil.exe -addstore root %~dp0my-root-ca.cer
)

If the High integrity level label is not present, this then goes looking for the even higher System label, which has the SID S-1-16-16384 (or “Mandatory Label\System Mandatory Level” if you prefer display names). So this script will perform the work whether running with ordinary elevation or as system. This successfully installs the certificate when deploying to Azure, but skips that work when running non-elevated in Azure Emulator Express.

Now you might look at that and think that there’s a problem: when running non-elevated, the %errorlevel% variable will be non-zero by the time we reach the end of the script. And if you were calling this script from another script, then it would indeed appear to have failed. However, there’s a difference between a calling script inspecting the %errorlevel%, and the exit code eventually produced when cmd.exe exits. If a startup script terminates simply by reaching the end of the script, it doesn’t appear to matter what the %errorlevel% was at that point—from Azure’s perspective, that’s success. However, if a script just executes a single command that fails, cmd.exe appears to reflect that in its return code, and Azure sees it as failure.

That said, the documentation for Azure startup tasks implies that this shouldn’t work:

“Startup tasks must end with an errorlevel (or exit code) of zero for the startup process to complete. If a startup task ends with a non-zero errorlevel, the role will not start.”

So to be on the safe side we should explicitly exit with a code of 0 when we decide not to do anything:

whoami /groups | find "S-1-16-12288" > nul
if not "%errorlevel%"=="0" (
    whoami /groups | find "S-1-16-16384" > nul
)
if not "%errorlevel%"=="0" (
    EXIT /B 0
)
%windir%\system32\certutil.exe -addstore root %~dp0my-root-ca.cer

Arguably this has another benefit: it makes it slightly clearer that we are deliberately exiting without producing an error if we don’t seem to be running elevated.

Fixing APPX2102 Build Warnings

Monday 29 July, 2013, 08:57 AM

I’m working on a Windows Store application, and I’ve set up an automated build process using TeamCity. Although builds were completing successfully (in the sense that I was able to install and run the results) I was seeing these warnings in my build log:

[GenerateAppxManifest] C:\Program Files (x86)\MSBuild\Microsoft\VisualStudio\v11.0\AppxPackage\Microsoft.AppXPackage.Targets(808, 9): warning APPX2102: File 'VisualStudioEdition' is not found or is not an executable file.

This happened inside the _GenerateCurrentProjectAppxManifest step.

It seems that this is not unique to TeamCity. As this MSDN forum post shows, you can get the same warning on a TFS build agent.

In fact, this seems to have nothing much to do with build agents per se. You can reproduce the problem simply by running the build from the command line, e.g.:

msbuild /t:Rebuild MySolution.sln

I find this reproduces the same warning on my desktop machine, which is able to build the project without warnings from within Visual Studio. (And likewise, if I log into my build server with Remote Desktop, using the same login as the build agent, I can build the project without warnings from within Visual Studio. I only see the problem either on automated builds, or if I launch MSBuild from the command line.)

The Problem

As far as I can tell, when you build from within Visual Studio, it sets the VisualStudioEdition property to the name of the edition you have installed, e.g. Microsoft Visual Studio Ultimate 2012. But when you build from the command line, this does not get set. And it would appear that even if you use TeamCity’s “Visual Studio (sln)” build runner, it also does not set this parameter. (TeamCity offers both “MSBuild” and “Visual Studio (sln)” build runners. You typically use the latter if you’re building something that requires Visual Studio to be installed, e.g. a Windows Store app. But apparently, it doesn’t quite recreate the exact same build environment you’d get if you load the project into Visual Studio and build it from there.)

The build step for creating the application manifest seems to generate a warning if this particular property is not present. It doesn’t really seem to need it, because the build works despite the warning. But it’s unsettling, and also annoying if you like a clean build. And perhaps something subtle can go wrong when this build step doesn’t have this information.

The Fix

You can supply the build with the information it wants by setting the property in the MSBuild command line. You can do that with the /p: argument, but since I’m using TeamCity, I’ve gone down the recommended path of adding an entry under “System Properties” in the Build Parameters. I’ve added a system.VisualStudioEdition property, which I have set to the name of the edition of Visual Studio installed on my build server. This causes TeamCity’s build runner to pass that via a /p: switch when it runs MSBuild. As a result, my automated builds now run without warnings.
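For a plain command-line build, the equivalent is to pass the property directly via the /p: switch—something along these lines, with the edition name adjusted to match whatever is installed on the machine doing the build:

```
msbuild /t:Rebuild /p:VisualStudioEdition="Microsoft Visual Studio Ultimate 2012" MySolution.sln
```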

Returning to a XAML Windows Store App

Wednesday 24 July, 2013, 07:35 AM

Last time, I described the various events and methods that come into play when the user leaves your Windows Store application. In this second part, I’m going to describe what happens when the user returns. Not all of the notifications are guaranteed to occur—it depends on various factors. If the user merely switched focus to a different app without removing yours from the screen, you’ll see considerably less activity when the user returns than if your application had left the screen. And much also depends on whether Windows got as far as suspending and then terminating your application while it was off the screen.

In that last case, where your application was suspended and then shut down, you will see the following sequence when the user switches back:

  1. Launch with ApplicationExecutionState.Terminated

  2. OnNavigatedTo

  3. LoadState

  4. VisibilityChanged (becoming visible)

  5. Activated

The second and third of these only happen as a result of how the Visual Studio templates for Windows Store apps work. The LoadState method isn’t part of WinRT—it’s a member of the LayoutAwarePage base class that the template defines. And although OnNavigatedTo is part of WinRT, it doesn’t run by default on a restart. It’s only called as a side effect of some work done by the templates.

If Windows suspended your application but kept it in memory, you will see the following events when the user switches back:

  1. Resuming

  2. VisibilityChanged (becoming visible)

  3. Activated

If Windows did not get around to suspending your application before the user switched back to it, you’ll just see the following.

  1. VisibilityChanged (becoming visible)

  2. Activated

And if your application remained on screen while the user switched to a different application (e.g., because your app is sharing the screen with another, snapped app) you’ll just get the Activated event.

Activated is therefore the only event you get in all cases. And you get VisibilityChanged in all cases where your application had been off the screen.

I’ll now describe the various events and methods in more detail.

Launch with ApplicationExecutionState.Terminated

If Windows suspended and then terminated your application, it will need to re-launch it if the user tries to switch back. Remember, when Windows terminates an app to make more memory available for the foreground application, it hides this from the user. The application still appears in the Alt+Tab list, and in the list that appears when you swipe down from the top left (which you can also show by typing WindowsKey+Tab). Windows wants to maintain the illusion that terminated applications continue to run in the background, so it has no choice but to start a new instance when the user tries to return to an app that isn’t really still there. Windows lets you know that this is happening through the PreviousExecutionState property of the LaunchActivatedEventArgs passed to your OnLaunched method. This will have a value of ApplicationExecutionState.Terminated if your application is being launched because the user wants to return to it.

When your application is re-launched in this way, you should put things back exactly how they were when the application was suspended. This may entail loading application data, and will typically also involve restoring superficial state such as which page the user is on, and where lists are scrolled to.

The Visual Studio templates provide code that restores some of the superficial state for you. It generates code in OnLaunched that calls the RestoreAsync method of the SuspensionManager class (also provided as part of the template). This in turn calls the SetNavigationState method on each Frame your application registers. (The templates register just a single frame by default—it acts as the root of your whole UI. But it’s possible to have multiple frames each with its own navigation stack.)
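In outline, the template-generated handling looks something like this (a simplified sketch: SuspensionManager and the frame registration come from the template, not WinRT, and MainPage stands in for whatever your start page is called):

```csharp
protected override async void OnLaunched(LaunchActivatedEventArgs args)
{
    var rootFrame = new Frame();
    SuspensionManager.RegisterFrame(rootFrame, "AppFrame");

    if (args.PreviousExecutionState == ApplicationExecutionState.Terminated)
    {
        // The user thinks we were running all along, so restore the
        // navigation stack (and page state) saved during suspension.
        await SuspensionManager.RestoreAsync();
    }

    if (rootFrame.Content == null)
    {
        rootFrame.Navigate(typeof(MainPage));
    }

    Window.Current.Content = rootFrame;
    Window.Current.Activate();
}
```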

Recall from last time that the templates arrange to call GetNavigationState during the Suspending event. That method returns a string representing the state of the navigation stack, and the template saves this to disk. When the templates pass that string back to SetNavigationState, it restores the stack. So this will ensure that whichever page the user was on last time, they’ll still be on that page now even though the application restarted. Moreover, this preserves the back/forward stack.

As we also saw last time, GetNavigationState has the side effect of calling the current page’s OnNavigatedFrom method. Likewise, SetNavigationState will call OnNavigatedTo on the object it instantiates to represent the current page.


OnNavigatedTo

WinRT invokes a page object’s OnNavigatedTo when the user navigates into it, whether using Back or Forward buttons, or just tapping or clicking on something that navigates to a new page. Left to its own devices, this is the only situation in which it’ll call the method. But if you’re using the Visual Studio templates, it will also be called during a restart following silent termination of an application, for the reasons described in the preceding section.

There isn’t any easy way for the OnNavigatedTo method to know whether it is being called due to intra-app navigation, or an application restart—the event argument reports a NavigationMode of Back for restarts, the same as for a normal intra-app backwards navigation.

In general, if the navigation mode indicates some sort of return (either Back or Forward, not New or Refresh) your application would be expected to restore the superficial user interface state to whatever it was the last time the user was on this page. You need to do this for either intra-app navigation or a relaunch, so the distinction between those two situations doesn’t matter; if you care about the distinction you should use a different notification.
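For example, a page that only cares that it is being returned to (by either route) might check the navigation mode like this:

```csharp
protected override void OnNavigatedTo(NavigationEventArgs e)
{
    base.OnNavigatedTo(e);

    if (e.NavigationMode == NavigationMode.Back ||
        e.NavigationMode == NavigationMode.Forward)
    {
        // Returning to this page, whether by intra-app navigation or by
        // a relaunch after termination (which also reports Back), so put
        // the superficial UI state back the way it was.
    }
}
```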

Be aware that you will only see this notification either for intra-app navigation, or in cases where the application was terminated after suspension. This method does not get invoked when the application resumes after suspension.

If you’re using the Visual Studio templates, in practice you’ll typically rely on their support for persisting and retrieving data in the back, forward, suspend, and relaunch scenarios, and will override the LoadState method instead.


LoadState

The LayoutAwarePage base class supplied by the Visual Studio templates defines a virtual method called LoadState. It calls this from its OnNavigatedTo implementation, passing back data that you provided in the most recent call to your page’s SaveState method, which I described in the previous blog post. (It will only call LoadState for Back or Forward navigation by the way.)

The advantage of using SaveState and LoadState over the underlying OnNavigatedFrom and OnNavigatedTo methods is that the templates handle the work of saving data out to disk during suspension, and loading it back in during a restart.

For intra-app navigation, you will get SaveState and LoadState notifications immediately and consistently as you navigate away and back, but for suspension, resumption, and restarts, it is slightly less consistent. First, there’s a delay—SaveState will only be called if the user switches away from your app for long enough to trigger suspension. Even when that does happen, you won’t necessarily get a corresponding call to LoadState. That happens on a relaunch, so it requires Windows to terminate your process after suspending it (and it turns out to do this fairly infrequently—it’s very common for it just to keep suspended applications around in memory for a long time). Although I can now see the logic of this, it surprised me at first—I had initially assumed that I’d get a LoadState whenever the user returned to the app.
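In outline, a page using these template methods might look like this (itemList is a hypothetical ListView on the page):

```csharp
protected override void SaveState(Dictionary<string, object> pageState)
{
    // Called on intra-app navigation away from the page, and during
    // suspension (via the template's call to Frame.GetNavigationState).
    pageState["SelectedIndex"] = itemList.SelectedIndex;
}

protected override void LoadState(object navigationParameter,
                                  Dictionary<string, object> pageState)
{
    // Called for Back/Forward navigation and on a relaunch after
    // termination, but not when resuming from in-memory suspension.
    if (pageState != null && pageState.ContainsKey("SelectedIndex"))
    {
        itemList.SelectedIndex = (int) pageState["SelectedIndex"];
    }
}
```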

The three notifications discussed so far only happen when the app is re-launched after Windows has silently terminated it. The next is for a different scenario.


Resuming

If the user switches away from your app for long enough that Windows decides to suspend the app (meaning you’ll have seen the Suspending event described in the previous post), and if the user later returns to your application without Windows having terminated the process in the meantime, your process will be allowed to continue running, and the Resuming event will be raised to let you know what happened.

Your process will have been in suspended animation, so from the perspective of your code, it’ll look like the Resuming event was raised more or less immediately after the Suspending event. Time will have elapsed in between these of course, and your code would be aware of that if it looked at the system time for both events. But since your application will have been suspended, it won’t get to run any code in that gap, so as far as your application’s activity is concerned, these two events are adjacent.

You might not need to do anything in the Resuming event. One of the implications of receiving this event is that all the work you did to preserve your application’s state during suspension was unnecessary: your process remained in memory, so the state is all still there. That said, there is still sometimes important work to do on resumption. If your application shows information that needs to be kept up to date, you should check how long you were in suspension for, because you might need to refresh your display. (E.g., if your application shows upcoming train departure times, you might need to remove some or all of the trains from the screen, because some of them may already have left while you were suspended; if you were suspended for hours, they may well all have left.) But note that when your application does need to do something during Resuming, it will not be the mirror image of what happened during Suspending. (If you do work that mirrors the state saving done during suspension, it’ll normally be done during launch, possibly indirectly via the LoadState method.)
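As a sketch of that sort of check, a hypothetical departure-board app might record the time when it is suspended, and decide on resumption whether its display is stale (RefreshDepartures is a made-up method standing in for whatever refresh logic you need):

```csharp
private DateTimeOffset suspendedAt;

private void OnSuspending(object sender, SuspendingEventArgs e)
{
    suspendedAt = DateTimeOffset.UtcNow;
}

private void OnResuming(object sender, object e)
{
    // No state to reload, because memory survived suspension. But time
    // has passed, so anything time-sensitive on screen may now be wrong.
    if (DateTimeOffset.UtcNow - suspendedAt > TimeSpan.FromMinutes(1))
    {
        RefreshDepartures();
    }
}
```

These handlers would be attached to the application object's Suspending and Resuming events (e.g., in the App constructor).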

Just to be clear, if your application was suspended and then terminated, you will not see a Resuming event. Your application will be restarted, and OnLaunched will receive a PreviousExecutionState of ApplicationExecutionState.Terminated, as described earlier.

Core Window VisibilityChanged Event (Becoming Visible)

When a user returns to your application after it has been off screen, you will always get a VisibilityChanged event from the core window. That’s true whether your application was suspended and then resumed, was suspended, terminated, and then restarted, or even if the user flipped away and back again so quickly that Windows didn’t even have time to suspend your application.

Remember, you get the same event when becoming visible or invisible, with the difference being indicated through the event argument’s Visible property (which is why I qualify this event with either “becoming visible” or “becoming invisible” each time I refer to it in these two blog posts).

If you stopped any activity (e.g. video playback) when this event indicated that your application had become invisible, this is likely to be the right time to restart that work.

Core Window Activated Event (Activation)

No matter how the user returns to your application, the last notification you will see is the core window’s Activated event. If your application stayed on screen while the user moved their focus elsewhere (perhaps because they have multiple applications on screen simultaneously) then this will be the only notification you receive when the user returns.

If you need to change your application’s appearance in any way to reflect where the input focus is, you would need to handle this. And as Mike Taulty pointed out to me after my previous post, there may be situations in which deactivation is actually a better time to stop certain kinds of work than the moment of becoming invisible. For some games it might make sense to pause automatically as soon as the user moves the focus away, even if your application is still visible. If you’ve chosen to do that, then it might be appropriate to resume when reactivated.
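A sketch of that sort of auto-pausing behaviour (gameEngine being a hypothetical class of your own):

```csharp
Window.Current.CoreWindow.Activated += (sender, args) =>
{
    if (args.WindowActivationState == CoreWindowActivationState.Deactivated)
    {
        gameEngine.Pause();    // focus left us, even if we're still visible
    }
    else
    {
        gameEngine.Resume();   // reactivated: pick up where we paused
    }
};
```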


I’ve put quite a lot of detail in the last two posts, so the most important point might be buried in the volume. It’s worth highlighting:

It’s not all about suspension and restarts.

Proper suspension handling for Windows Store apps gets a disproportionate amount of attention in the documentation, and also in forums, blogs, and other community efforts. This is presumably because it’s a feature of apps on constrained devices that traditional Windows client developers won’t be used to. But it’s not the right place to hang certain kinds of logic, because suspension is delayed, and doesn’t always occur, and even when your app is suspended, there are two completely different paths by which the user can return to it. This means by extension that LoadState and SaveState are also not good places to put work that must be done when the user switches into or out of your application. The less widely discussed VisibilityChanged and Activated events of the CoreWindow class are usually better for this kind of work.

50 Ways to Leave Your XAML Windows Store App

Thursday 18 July, 2013, 07:32 PM

A quick quiz for you: in a C# XAML-based Windows Store application, what’s the distinction between a page’s OnNavigatedFrom and SaveState methods, the application object’s Suspending event, and the core window’s Activated and VisibilityChanged events? All of these are typically invoked or raised round about the time a user switches away from your app, but why are there so many?

More importantly, which one should you use? In my experience, the answer is: not always the one the docs or web searches might imply.

There’s a similar and related plethora for when the user returns to an application. You might see any of the following being invoked or raised: the page’s OnNavigatedTo and LoadState methods, the application object’s OnLaunched method and Resuming event, and the core window’s Activated and VisibilityChanged events (those last two being the same events that the core window raises when switching away, but with different arguments this time).

Sometimes you’ll get all of these events, but there are several variations on the themes of switching away from and returning to the app, and in some cases you only get a subset of the notifications.

OK, so there probably aren’t strictly 50 variations, and only about half of the mechanisms I’ve just described are concerned with returning to the app, but I don’t think there is a song called 11 Ways to Leave or Return to Your Lover. My point is that there are lots of different bits of the API all doing quite similar things.

So What?

You might be wondering why anyone would care. Well, I happen to have a deep interest in how things work, but there are also good practical reasons to understand this stuff. For example, you might need to do something around the time that these events and methods come into play so that when the user switches away from your application, and then later returns to it, your application doesn’t present any unpleasant surprises.

In the resource-constrained world of tablet computers, unpleasant surprises are, unfortunately, the default. (Or at least, they would be if it weren’t for the fact that to get your app listed in the store, you’re required to do the work to handle most of the nasty cases correctly.) There’s quite a high chance that after the user switches away from your application, Windows will decide to terminate your process. This lets whatever application the user is actually using have the memory your process was occupying. The basic principle here is that the machine’s resources should be dedicated to whatever the user is doing now, and if that means summarily terminating processes the user’s not currently interacting with, then so be it.

Windows doesn’t tell the user, though. It acts as though your program is still there—it’ll be available from the list of tasks the user can swipe in from the side (or WindowsKey+Tab if you’re more of a keyboard user). If the user switches back to a program that Windows has already destroyed, it just launches a new copy, passing in launch parameters that tell the application that it was previously terminated and is now expected to carry on as if nothing had happened.

If you don’t take steps to handle this, you will likely upset the user. Your program must detect on startup that the user was unaware that you’d ever been shut down, and you must attempt to put everything back how it was before Windows pulled the rug out from under your feet. If you don’t write code to handle all this, the user will wonder why your application has inexplicably decided to return to its start page. (And you might not even get as far as having users—your app might be rejected from the store certification process.)

Consequently, it’s important for your application to be aware of when the user switches away and back, so that you can preserve the illusion of availability. And that’s why you might care about some or all of the many methods and events I described at the start of this blog post.

End of Days To Do List

Here are some things you might want to do in your application around the time the user switches to a different application:

There are some related processes when the user switches back:

There’s already a catch. You might not need to do the first and last items in this second list. Windows has the option to terminate your application, but it doesn’t always do it. In fact, I’ve been surprised by how long suspended applications remain in memory on my Surface RT. It’s rather frustrating if you’re trying to verify that your application works like it’s supposed to when it gets suspended and then evicted in a real-world scenario. (It’s easy to test this stuff from Visual Studio, but at some point you’ll want to manually verify that your app does the right thing outside of a test environment, and prompting a real-life post-suspend termination takes some effort.)

There’s a subtle variation on this—some of the work I’ve just described might be necessary even when your app remains in the foreground. If the user navigates to a different page within your application, you might want to save all your UI state so that you can restore things how they were if the user should return to the page—this involves the same work as the third step in each of the two lists above.

Consequently it’s common to have shared code that comes into play both when switching in and out of the app, and also when navigating between pages within the app.

Again, there’s a variation that can catch you out. In some situations, page objects will remain in memory as you navigate around, in which case it may not be necessary to do anything when the user returns to a page.

WinRT vs Templates

Before I talk about what each of the various events and methods does, and how they relate to the various tasks just described, it’s worth being aware that not all of the mechanisms are part of WinRT. Some come from the templates Visual Studio uses for new Windows Store projects. The features built into WinRT are: the page’s OnNavigatedFrom and OnNavigatedTo methods, the application object’s Suspending and Resuming events, and the core window’s Activated and VisibilityChanged events.

SaveState and LoadState are defined by the LayoutAwarePage base class that Visual Studio adds to most Windows Store projects. (The very simplest project type doesn’t have this class.)

Just to confuse matters, although the page’s OnNavigatedFrom and OnNavigatedTo methods are intrinsic to WinRT, the Visual Studio templates change how they work—left to its own devices, WinRT will only call these for intra-app navigation, but with most of the Visual Studio templates, you’ll also find that OnNavigatedFrom gets called during suspension, and that OnNavigatedTo is called if your application is relaunched as a result of the user trying to return to it after Windows terminated it. Somewhat inconsistently, OnNavigatedTo is not called if the user does exactly the same thing (goes away and then comes back) but Windows happened not to terminate your application in between.

Anyway, enough background…what do they all do, and what are they for?

What’s It All For?

I’ll work through the events and methods starting from a state when your application is running, working through to the point where the user has switched away. In a later post, I’ll describe what happens when the user returns.

Core Window Activated Event (Deactivation)

The first thing you’ll probably see when the user switches away is the core window’s Activated event. If that sounds backwards, it’s because this one event is used for both activation and deactivation. In this case, the event’s WindowActivatedEventArgs will have a WindowActivationState property value of CoreWindowActivationState.Deactivated.

Deactivation is arguably the twitchiest of the notifications, because your application might not be going anywhere. All this tells you is that the focus is no longer within your application. That can happen simply because there are two applications on screen using the ‘snap’ feature that lets you run Windows Store apps side by side. If your application has the focus, but the user then taps in another app that’s also currently on screen, yours will be deactivated. But it doesn’t mean you’re going anywhere—your application is still there, and the user can easily tap it, at which point it’ll be reactivated.

In fact, your app can be deactivated even when it’s the only app on screen! If the user brings up the list of other apps by hitting WindowsKey+Tab, the keyboard focus goes into that list, and your application gets deactivated. (If the user decides not to switch applications, you’ll get reactivated when they dismiss the app list.)

So there’s not much to be done during deactivation. You might conceivably want to change some colours on screen to indicate that you no longer have the focus. If you show temporary popups of the kind that should go away promptly as soon as the user shows an interest in something else, deactivation can be a good moment to dismiss them. But you don’t need to start getting ready for your world to go away.
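For example, dismissing a hypothetical transientPopup (a Popup control on the current page) when the focus goes elsewhere might look like this:

```csharp
Window.Current.CoreWindow.Activated += (sender, args) =>
{
    if (args.WindowActivationState == CoreWindowActivationState.Deactivated)
    {
        transientPopup.IsOpen = false;   // the user's attention went elsewhere
    }
};
```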

Core Window VisibilityChanged Event (Becoming Invisible)

The first real clue that your app is about to leave the screen (typically because the user has just switched to a different app) is that the core window raises its VisibilityChanged event, and the VisibilityChangedEventArgs will return false from its Visible property.

This is very likely to be an event you’ll want to handle, although you wouldn’t necessarily know that from the documentation. Far more attention is given to the Suspending event I’ll be getting to shortly, but in a lot of cases handling visibility changes is a better bet, because you get this event the moment the user switches away from your application.

I’ve been working recently on an application that plays video, and the vast majority of advice I’d found on how to manage playback when the user switches in and out of your application talks about either suspension or activation. However, visibility change events turned out to be a far better choice. (The problem, incidentally, is that although the MediaElement fades down the volume automatically when the user switches away from your app, inexplicably it pretends that playback continues, muted, in the background. So when the user switches back to your app, it will seem as though the video had been playing silently all along, because it will have advanced to the same point it would have reached if you’d stayed in the app and watched it. Not only is this the default, you can’t really turn it off—the only directly supported alternative is to arrange for audio to continue even when your app leaves the foreground. If you actually want the behaviour that most users will expect (which is what the built-in Windows 8 video player app does) you’ll need to write code to pause playback when the user switches away. And the VisibilityChanged event is the only one that comes in at the right moment. The other popular suggestions just don’t work—if you pause on deactivation, you’ll end up stopping playback when your app hasn’t gone anywhere. And if you rely on Suspending, you’ll find that it fires many seconds after the user left your app, so you’ll end up missing some content.)

Not only does this event come exactly when you want it, it might be the only notification that the user has left your app—there’s no guarantee that you’ll even see a Suspending event, because the user might return to your app before Windows had a chance to suspend it. But you will always see visibility change events.
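A sketch of that approach, with mediaElement standing in for whatever MediaElement your page defines:

```csharp
Window.Current.CoreWindow.VisibilityChanged += (sender, args) =>
{
    if (args.Visible)
    {
        mediaElement.Play();    // back on screen: resume playback
    }
    else
    {
        mediaElement.Pause();   // off screen: stop now, rather than
                                // silently 'playing' in the background
    }
};
```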

So in summary, this is a useful event: it is your only timely, dependable notification that the user has switched away from your application.


Application Suspending Event

A few seconds after the user has displaced your app from the screen, Windows will usually decide to suspend it. This prevents applications that are not running from consuming CPU cycles, but because suspended apps are kept in memory, they can be restarted more swiftly than would be possible if applications were always terminated after switching away. The tricky thing about suspension is that it might be the last thing that ever happens to your app—once suspended, an app might never get restarted, and Windows may simply terminate it without further notice.

So the Suspending event is important for one reason: it is your last opportunity to save state persistently. Apps will typically have two goals here: 1) don’t lose the user’s data, and 2) present the illusion that the application was there in the background all along when the user returns to it, even if Windows did in fact terminate it to free up some memory. So as well as saving user data (whatever that means for your application) you’ll typically want to store more superficial stuff like remembering which page the user was on, what was selected, where lists were scrolled to and so on. Obviously, you should prioritise substantial data over the more superficial stuff—save the user’s edits, notes, high score (or whatever the important user-generated information is in your app) first, then move onto the UI state, just in case you run out of time while saving the data. (You’re only allowed 5 seconds.)

All this work to preserve UI state may be wasted—it’s quite common for suspended applications to resume execution, because the user may switch back to them. (That’s the whole point of suspension.) And since, in that case, your application is reawakened with all its memory intact, it typically doesn’t need to load back in any of the state that it carefully wrote out, because it’s all still there in memory. But you don’t know at the moment of suspension what’s going to happen, so you have to assume the worst, and save everything.

So in summary, Suspending is where you should ensure all user information has been saved, and then save whatever you need to be able to re-open the UI in the same state it’s in now. Be aware that you might not see this event, and if you do, it’ll typically be several seconds after the user switched away.
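A typical handler asks the suspending operation for a deferral so that it can save asynchronously, and saves the substantial data before the superficial state (the two SaveXxxAsync methods here are hypothetical stand-ins for your own saving logic):

```csharp
private async void OnSuspending(object sender, SuspendingEventArgs e)
{
    var deferral = e.SuspendingOperation.GetDeferral();

    await SaveUserDataAsync();   // the stuff that must not be lost
    await SaveUiStateAsync();    // current page, selection, scroll positions

    deferral.Complete();         // must finish within the time limit
}
```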

You might not necessarily handle the Suspending event directly, because the templates provide you with this notification indirectly.


OnNavigatedFrom

If you’re using the suspension manager and navigation infrastructure provided by all but the most basic of Visual Studio’s store app templates, the next thing to happen will be that the current page’s OnNavigatedFrom method will be called. Left to its own devices, WinRT only calls this method when a Frame navigates away from one page (just before navigating into the target page). This method is mainly concerned with navigation between pages within your app, so in general, you’d never expect to see this when a user switches away. Your app continues to be on the same page, as the user will see if she ever switches back. So there’s no sense in which your page is really being navigated away from. Nevertheless, you’ll see this method get called.

The reason it happens is that the Visual Studio templates generate a handler for the Suspending event, and in that handler, they automatically persist the current navigation stack—they store a record not just of which page you’re on now, but also everything in the back/forward history for the frame. They do this by calling the Frame class’s GetNavigationState method, and that turns out to have the side effect of calling the current page’s OnNavigatedFrom.

The rationale for this is that you’re going to want to do all the same work that you would have done if the user was navigating from this page to another within your app. For intra-app navigation, you’ll want to store away a record of superficial UI element state such as item selection and scrollbar positions, so that if the user returns to your page (by hitting the back button) it appears to be in the same state as before, even though it may be represented by a whole new object. You will want to do all that same work during suspension too, so that if you do get terminated, you are able to put the UI back into its previous state if the user does ever return to your app.

So the Frame class ‘helpfully’ calls OnNavigatedFrom when asked to persist its state, so that your page can save its state too. But this feels like a bit of a hack, because navigation isn’t really happening. Worse, it makes it hard to tell the difference between suspension and navigation. If you need to know which of those two operations caused the method to run, you’ll either need to modify the suspension code so that you can find out whether suspension is in progress, or you could override the OnNavigatingFrom method. (With intra-app navigation, OnNavigatedFrom is preceded by a call to OnNavigatingFrom, but the call to Frame.GetNavigationState during suspension only causes the former to be called.) Either way, you’re relying on indirect cues.

The Visual Studio templates choose to wrap all this in a layer of abstraction, so in practice, you’ll often override not OnNavigatedFrom, but SaveState.


SaveState

The LayoutAwarePage base class that Visual Studio typically adds to store application projects defines a SaveState virtual method. It calls this from its OnNavigatedFrom, passing in a dictionary to which you can add data. Because OnNavigatedFrom gets called in two scenarios—intra-app navigation and suspension—the same is true for SaveState. And the information you put in this dictionary is also returned to you in two scenarios: intra-app navigation back to the page, and a re-launch after suspension and termination.

However, there’s one significant scenario in which you’ll never see this data again: if the user switches away from your application for long enough that the Suspending event is raised, then you’ll see a call to SaveState (thanks to the call to OnNavigatedFrom triggered by the call to Frame.GetNavigationState made by the suspension manager supplied by the Visual Studio template). If the user then returns to your application while it was still suspended in memory, your page will not be notified, and you don’t get handed this data back.

Arguably, you don’t need the data because the state you saved is all still in memory anyway, so there’s no need to try to reconstitute it. However, the subtle point here, which may cause difficulty, is that your page gets no notification whatsoever that the user has returned. This seems a little inconsistent. You do get notifications in the following scenarios: a) the user navigates to a different page within your application and then returns; b) the user switches away from your application, it gets suspended, then Windows terminates the process to free up some memory, then the user returns to the app, so Windows has to re-launch it, at which point the suspension manager will arrange to pass you back the state you saved. It seems a little weird that both these scenarios get handed back the data, but the one where the process happened to remain in memory, suspended, does not. The rationale is presumably that you only need to restore the state if you no longer have access to the page object, which would be the case either when navigating between pages (assuming the page wasn’t cached) or when the app gets terminated. When resuming from suspension, the current page object is still there, so a call to LoadState would arguably be redundant.

So it does make sense, but the lack of symmetry here can be confusing at first.

Note that everything you put in the dictionary has to be serializable, because when SaveState is called during suspension (as opposed to intra-app navigation) the information will be written to disk. If you use anything other than primitive data types, they’ll need to be marked with the [DataContract] attribute to enable serialization.
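So if you want to store anything richer than strings and numbers, you’ll need something along these lines (with a using directive for System.Runtime.Serialization; the type and its members are hypothetical examples):

```csharp
[DataContract]
public class PageUiState
{
    [DataMember]
    public int SelectedIndex { get; set; }

    [DataMember]
    public double ScrollOffset { get; set; }
}

// Then, in SaveState, you could store an instance in the dictionary:
// pageState["UiState"] = new PageUiState { SelectedIndex = 2, ScrollOffset = 400 };
```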

SaveState is mostly useful for storing superficial user interface state so that it can be reconstituted if the user returns to your application after the original UI objects are no longer available, either because you’re doing intra-app navigation and the page object is no longer there, or because your app got terminated after being suspended. There’s no guarantee that you’ll ever have the data you saved returned to you, and in the suspension case this notification typically comes several seconds after the user switched away from your app. And if the user returns to your app fast enough (i.e., before Windows manages to suspend it) then you’ll never get it at all. So this is not a good place in which to bring ongoing work (e.g., a game engine, or video playback) to a halt. As already discussed, visibility changes are much better for that sort of thing.

Next Time

This has turned into a surprisingly long post, so I’ll save the second half of the story, resumption, until next time.

Blog Migrated to Azure

Friday 12 July, 2013, 02:50 PM

Over the last couple of days, I've migrated my blog from its old home - a hosted server I've been renting for almost 10 years - to an Azure web site. With luck, you won't have noticed any difference. (And since it's been a few months since I was last posting regularly, I doubt many people were checking up here every few minutes...)

The new site has been running with read-only access for a day. This is by way of a test post to verify that I'm now able to post again. I intend to write some real posts over the next couple of weeks, but I just wanted to do this quick smoke test first.

FORTRAN-Compatible Dynamic Objects in C#

Monday 1 April, 2013, 12:46 PM

The 4th version of C# (which shipped with Visual Studio 2010) added the dynamic keyword. This is most useful for dealing with COM scripting APIs such as Microsoft Office’s Automation features, which were a nightmare to work with in older versions of C#. But this language feature also makes it possible to use certain idioms that used to be the preserve of dynamic languages. For example, it lets you write this sort of questionable code:

dynamic o = new ExpandoObject();

o.Count = "1";
o.Count += 4;

Console.WriteLine(o.Count);


As you may or may not be expecting, this prints out 14.

C# lets the target object determine the semantics, i.e., not everything does the same thing when used through dynamic. In the example above, I’ve used ExpandoObject, which behaves in a way that vaguely resembles how objects work in JavaScript. In particular, if you set a property that didn’t previously exist, the object will automatically ‘expand’ by growing a new property of the specified name. (It’s not quite the same as JavaScript, which will even let you read a property that has never been set; you’ll get an exception if you try that with ExpandoObject in C#.)
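Here’s a self-contained sketch of both behaviours (the property names are arbitrary):

```csharp
using System;
using System.Dynamic;
using Microsoft.CSharp.RuntimeBinder;

class Program
{
    static void Main()
    {
        dynamic o = new ExpandoObject();

        // Setting a property that didn't previously exist makes the
        // object grow a new property with that name.
        o.Count = "1";
        o.Count += 4;               // string concatenation, not arithmetic
        Console.WriteLine(o.Count); // prints 14

        // Unlike JavaScript, which hands back undefined, reading a
        // property that was never set throws a RuntimeBinderException.
        try
        {
            Console.WriteLine(o.NeverSet);
        }
        catch (RuntimeBinderException e)
        {
            Console.WriteLine("Threw: " + e.Message);
        }
    }
}
```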

This enables you to make mistakes like this, without any tiresome interference from the compiler:

dynamic o = new ExpandoObject();

o.Count = 1;
o.Cuont = o.Count + 1;


Despite the fun I’ve been having with Clojure lately, I remain more of a static typing guy at heart, so I admit to being a little perplexed about why dynamic language fans prefer things their way. But presumably this example is precisely the sort of thing they’re banging on about when they boast about how much more productive their preferred languages are—it lets you get bugs like this straight into production, unencumbered by a compiler that might point out the mistake.

(If, despite choosing a dynamic language, this is the sort of thing you want to avoid, I gather the usual solution is to try to catch this sort of problem with unit tests instead. It’s easier to write a test to find this sort of bug than trying to test anything deeper about what your code is supposed to do—indeed it’s so easy that outside of the dynamic world, the compiler can do it for you. I presume the increased volume of shallow tests adds to the sensation dynamic language advocates get of feeling highly productive, at least insofar as producing high volumes of code constitutes productivity. And yet they’re so rude about ‘ceremony’ in languages they don’t like!)

When reviewing the chapter on C#’s dynamic keyword in my book, Programming C# 5.0, my father remarked “Shades of FORTRAN IV!!” He’s been working with computers since he left university in the early 1960s, so he can bring some useful context to supposedly new developments in technology. As he explains:

“In FORTRAN you didn’t have to declare your variables before using them. If you used a particular name, it automatically created the variable for you. (Typing was implicit too—if the first letter of the variable’s name was in the range I to N the variable was an integer, anything else was a float. Early FORTRAN was very numeric-oriented—things like characters were really just numbers being treated in a particular way.)”

The parenthetical comment interested me, because it would enable us to avoid what might be a bug in the first example above—if the author of the code had intended the count to be an integer, then the result, 14, will not be suitable. If only we had FORTRAN IV’s ability to state what you mean by simply adding an ugly prefix to a variable name, we could get the right result while still using dynamically created properties. So I wrote a simple class to implement this:

class FortranIVObject : DynamicObject
{
    private readonly Dictionary<string, object> _values =
        new Dictionary<string, object>();

    public override bool TrySetMember(
        SetMemberBinder binder, object value)
    {
        EnsureShouting(binder.Name);

        char first = binder.Name[0];
        if (first >= 'I' && first <= 'N')
        {
            value = Convert.ToInt32(value);
        }

        _values[binder.Name] = value;

        return true;
    }

    public override bool TryGetMember(
        GetMemberBinder binder, out object result)
    {
        EnsureShouting(binder.Name);

        if (!_values.TryGetValue(binder.Name, out result))
        {
            char first = binder.Name[0];
            result = first >= 'I' && first <= 'N' ? (object) 0 : 0.0;
            _values[binder.Name] = result;
        }

        return true;
    }

    private static void EnsureShouting(string name)
    {
        if (name.Any(char.IsLower))
        {
            throw new ArgumentException(
                "FORTRAN identifiers must be UPPERCASE.");
        }
    }
}

I’m deriving from DynamicObject, because that does most of the work required to implement custom dynamic behaviour. We just have to write an override for each feature we’d like to control. In this case, when a member is set, we look for the letter prefix, and if that’s one of the letters indicating that the variable should be an integer, we coerce the incoming value. Notice that the code for getting a value is happy to let you read a variable without ever having initialized it—it defaults to a value of zero. This gets even more into the spirit of dynamic languages than JavaScript—although JavaScript doesn’t report an error when reading undefined variables, it does return a special undefined value, increasing the risk that defects in your code will be detected before you ship. Conversely, my implementation supplies a value that is more likely to allow bugs to go into production unnoticed. But that’s just icing on the cake. The main point here was to get our first example to work as intended. It looks a little different:

dynamic o = new FortranIVObject();

o.ICOUNT = "1";
o.ICOUNT += 4;

Console.WriteLine(o.ICOUNT);


Obviously, I’ve had to replace ExpandoObject with my new type. But you’ll also notice a certain amount of SHOUTING—that’s because my custom dynamic type requires identifiers to be uppercase, to make it feel more like FORTRAN. But with that and the “I” prefix in place, it works: it prints out 5, instead of 14. Rest assured that this only fixes the particular kind of bug encountered by the original snippet—we still get most of the usual danger of dynamic typing, but combined with all the convenience and aesthetic merits of Hungarian notation.
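If you want to experiment with this override mechanism without the FORTRAN baggage, here’s about the smallest useful DynamicObject subclass I can think of. (EchoObject is invented for this sketch; it’s not part of the code above.) A single TryGetMember override is enough to intercept every dynamic property read:

```csharp
using System;
using System.Dynamic;

// EchoObject: every dynamic property read succeeds, and simply returns
// the name of the property being read.
class EchoObject : DynamicObject
{
    public override bool TryGetMember(
        GetMemberBinder binder, out object result)
    {
        result = binder.Name;
        return true; // returning false would cause a RuntimeBinderException
    }
}

class Program
{
    static void Main()
    {
        dynamic o = new EchoObject();
        Console.WriteLine(o.Hello);    // prints Hello
        Console.WriteLine(o.ANYTHING); // prints ANYTHING
    }
}
```

Everything else in FortranIVObject is just elaboration on this pattern: override the members you care about, and let the runtime binder do the rest.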

Hanging Chad Emulation

My father went on to tell me of a problem he once encountered, relating to FORTRAN IV’s support for using variables without having to declare them:

“The automatic variable creation was an interesting source of errors—a colleague and I once spent a week tracking down a bug due to this. In those days all our source files were on punched cards, and in one place an incompletely punched-out chad had got folded back into place so that what had been a letter “U” became a digit “4” (thanks to how EBCDIC encoded characters on punched cards). So in one place, a variable whose name should have been “TU” became referred to as “T4”. The compiler happily created another variable called “T4” and used it in that statement. So when other parts of the program altered the value of “TU”, this particular line of code was oblivious. Unfortunately, the effect on the behaviour of the program was not obvious (it was doing speed/time calculations to simulate the dynamic performance of a numerically controlled machine tool—System 24), so the program appeared to run OK but the machined parts didn’t come out like they should! So I’m not a big fan of automatic variable creation. I thought that one of the reasons that later languages like PL/1 insisted on you declaring variables explicitly was to avoid just this sort of risk (although more likely caused by typing errors than hanging chads—reminiscent of the election recounts in Florida by which George Bush scraped into the presidency).”

Incidentally, the test/debug round trip time was pretty high, because running updated code involved putting a box of punched cards in a car and driving across London to the nearest IBM data centre. In those days, ‘cloud computing’ referred to braving London’s infamous smog to get to one of the very few places that had a computer.

Some of my readers are doubtless too young to remember that Bush election, and thus way too young to remember punched cards. They’re how we used to store data back before magnetic media became reliable and affordable. A ‘chad’ is a bit of card punched out when ‘writing’ a value by making a hole in the card. Occasionally, the machine punching out the hole didn’t quite cut through the card cleanly, leaving the chad partially attached (or ‘hanging’). Sometimes, a hanging chad would fold back and block up the hole, changing the value that the punched card reader would report next time it saw the card. This is a kind of bit rot, although it’s typically reversible: if you inspect the cards, you can see when a hanging chad has closed up. It’s harder to perform similar media integrity checks through visual inspection with today’s storage devices, sadly.

Anyway, this sort of random renaming of variables struck me as exactly the sort of thing that a dynamic language fan would probably enjoy, so I decided to add some occasional random behaviour to my dynamic object. From time to time, it will act as though the variable name supplied was different from the one the developer intended. Here’s the modified version:

class FortranIVObject : DynamicObject
{
    private readonly Dictionary<string, object> _values =
        new Dictionary<string, object>();

    private readonly Random _chadRand = new Random();

    public override bool TrySetMember(
        SetMemberBinder binder, object value)
    {
        string name = EnsureShoutingAndChadify(binder.Name);

        char first = name[0];
        if (first >= 'I' && first <= 'N')
        {
            value = Convert.ToInt32(value);
        }

        _values[name] = value;

        return true;
    }

    public override bool TryGetMember(
        GetMemberBinder binder, out object result)
    {
        string name = EnsureShoutingAndChadify(binder.Name);

        if (!_values.TryGetValue(name, out result))
        {
            char first = name[0];
            result = first >= 'I' && first <= 'N' ? (object) 0 : 0.0;
            _values[name] = result;
        }

        return true;
    }

    private string EnsureShoutingAndChadify(string name)
    {
        if (name.Any(char.IsLower))
        {
            throw new ArgumentException(
                "FORTRAN identifiers must be UPPERCASE.");
        }

        if (name.Length > 1 && _chadRand.Next(50) == 1)
        {
            // Flip the bottom bit of one character - but never the first,
            // so the implicit type prefix stays intact.
            var sb = new StringBuilder(name);
            sb[_chadRand.Next(name.Length - 1) + 1] ^= (char) 1;

            return sb.ToString();
        }

        return name;
    }
}
With this modification, my earlier code continues to print 5 as expected most of the time, but once in a while it will print some other value, such as 4 or 0. You don’t get more dynamic than that.

I hope this code is exactly as useful to the community as it deserves to be, and I hope that you enjoy this first day of the month!

Update 2013-04-01 13:31 (GMT+1): fixed bug in the code that handles the I..N range for get operations. (I was originally only going to handle that for set.) I put that in at the last minute, with inevitably buggy consequences - the original first code example wouldn't actually compile (because it was a mixture of the original code, and an update to the second snippet), and in both cases, it ended up with a value of 0.0f for both float and integer variables, due to a missing cast.

Copyright © 2002-2013, Interact Software Ltd. Content by Ian Griffiths. Please direct all Web site inquiries to webmaster@interact-sw.co.uk