Service API Evolution

Wednesday 31 March, 2004, 08:44 PM

Successful software almost always ends up changing. It's remarkably unusual for requirements to remain static, so if your software is being used at all, it will probably need to evolve. Managing change within an isolated system can be tricky enough, but managing change when your software exposes some kind of remote service can be particularly challenging - how do you go about changing the publicly visible face of your service without breaking any existing clients?

For a while I worked in a market sector which faced a fairly extreme version of this problem. We provide a publicly accessible service which can have tens of millions of users active at any time. It has some reasonably aggressive real-time requirements: two streams of data must be supplied for each service endpoint; one of the streams needs to deliver about half a megabyte of information every 20 milliseconds (16.7ms for the US version of this service, although the volume of information is slightly lower for each unit there, so the data rate per second works out about the same); another lower bandwidth stream has data rates on the order of one megabit per second. (These are the raw data rates by the way - in some circumstances the data can be compressed, so the actual transport bit rates can be much lower.) These two data streams need to be presented to the user with very little jitter in the timing, and their presentation needs to be in sync.

As if that wasn't hard enough, we have no control over end-user systems. Once deployed, an end-user system often remains in place for 20 years or more. The only way of pushing upgrades out to these end users, who are located all over the country, is to send them new hardware!

Despite these challenges, this service ('television' - you may well have heard of it) has been reasonably popular. I read recently that colour television has just passed its 50th anniversary. It reminded me that I've been meaning to write about how the television industry deals with the evolution of services that have a large and somewhat inflexible set of existing users, and to draw some comparisons with web services.

One of the interesting features of television services is that they have managed to evolve quite significantly despite the challenges they face. What started out as a low resolution, monochrome, monophonic, no-frills service has gradually acquired better resolution, colour, stereo and then later surround sound, and digital information services. This is quite impressive when you consider the need to retain backwards compatibility for two decades or more.

If you want to add a new feature to a service, there are two ways you can go. You can either try and find a way of levering it into your existing service in a way that doesn't upset existing client systems, or you can introduce a brand new version of the service. Of course the problem with the latter approach is that you need all the clients to change, so unless you have a way of replacing all the clients in one fell swoop, you're probably going to have to run both services in parallel until you can get the clients moved over to the new service.

Since the 'moving clients over to a new service' approach takes the best part of twenty years in television, the 'modify the existing service in a non-breaking way' approach has tended to be preferred.

The key to introducing new features has been the ability to put information in places that existing clients aren't looking. For example, colour was added by encoding it on a sub carrier frequency that was high enough to be invisible to black and white sets. Several other features have been introduced by transmitting data in certain scan lines of the display which, because of technical limitations in early sets, are never visible on screen.

The fact that it was possible to add new features in this way is not exactly a testament to the original design of broadcast video standards. It has only been possible due to a combination of serendipity (the fact that there were places to put data that old clients would ignore was historical accident) and the ingenuity of broadcast video engineers over the years.

In the UK, as far as I know, there have only been two breaking changes. One was the transition from 405 lines to 625 lines, which was also accompanied by a shift from VHF to UHF. Public 625 line services in the UK started in 1964. (Colour was added 3 years later, but in a backwards compatible way. We were rather late getting colour in the UK - the USA had it over a decade earlier.) The old 405 line services were finally switched off in 1984.

The other breaking change is the transition currently in progress from analogue to digital services. Sky, the satellite TV broadcaster in the UK, moved all of their subscribers over to their digital service a while ago. Since they're a subscription service, they knew who all their users were and were able to complete this transition reasonably quickly. A set top box had always been required to receive their service, so they just replaced all of these with digital boxes. This was expensive, but effective. And for satellite broadcasting, maintaining two services costs an awful lot, so it's easy to see why they decided to move everyone over as quickly as they did - it has saved them money in the long run.

Terrestrial TV is going to take a little longer. The BBC started trial digital broadcasts in 1996, and launched its full terrestrial digital service in autumn of 1998, so the 20 year rule of thumb suggests that we'll be ready to switch off analogue transmissions round about 2018. Yet the government's current plan is to switch off the analogue service in the UK by 2010. That looks optimistic: while uptake of digital TV has been fairly high, it's a long way from being ubiquitous. Even today, most new TV sets don't have integrated digital receivers, so you need a set top box to decode digital broadcasts, and about two thirds of the UK population don't have one.

In fact the Department of Trade and Industry say the government want the switchover to be "between 2006 and 2010". So it looks like it's a race between digital TV and Longhorn... But it's not clear when it will really happen - I'm guessing it won't happen in 2006 given the current state of readiness. As far as I can tell, these dates were published back in 1999 and I'm not sure when they were last revisited.

So what does this have to do with web services?

There are essentially two strategies for dealing with change in a service's API. Either you can try to change the existing service in a non-breaking way, or you can bring a new service online and leave the old one in place for a transitional period.

Broadcast television has always preferred the former approach, but it's not necessarily so appropriate for all kinds of service. The two main reasons for preferring incremental non-breaking changes in television are:

Limited radio spectrum space limits the number of services you can run side by side
Client systems remain in place for decades

Neither of these restrictions usually applies to a web service. The amount of 'spectrum' required to expose a particular service is often nothing more than a URL, and those are not in particularly short supply. And very few computer systems seem to hang around for decades. However, there are still reasons not to add new services every five minutes - each service you have to maintain has its administrative overheads.

So how do non-breaking changes fare in a web service world? Googling for web service schema evolution indicates that a lot of people are thinking quite hard about this. And in theory, we have the benefit of being able to design in extensibility from the start, rather than having to find creative ways of adding it as an afterthought.

But I worry that all the talk of various XML schema-based extension techniques is missing the main problem. Just because the consumer of an XML document agreed to be able to deal with a particular schema doesn't mean that they are actually capable of dealing with any instance of that schema. There's a good chance that if you have several clients for a service, some of them only work with the instances you happen to be handing out right now. E.g., even if your schema says 'I reserve the right to add extra elements here', you may well discover that the first time you try to exploit that flexibility, several of your clients turn out not to cope.

Experience shows that developers tend to rely on undocumented features of APIs. (Just ask Raymond Chen.) Why should we expect developers to behave any differently when using web services? If I advertise that I will generate XML instances with this schema:

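<xs:complexType name="myType">
  <xs:sequence>
    <xs:element name="foo" type="xs:string" />
    <!-- ##other rather than ##any: under XSD 1.0 a ##any wildcard followed by
         the optional bar element is ambiguous and breaks the Unique Particle
         Attribution rule -->
    <xs:any namespace="##other" minOccurs="0" maxOccurs="unbounded"/>
    <xs:element name="bar" type="xs:string" minOccurs="0"/>
  </xs:sequence>
</xs:complexType>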

but the current version of my service happens never to put anything in between the foo and bar elements, i.e. the instances are always something like:

<stuff>
  <foo>Hello</foo>
  <bar>World</bar>
</stuff>

the chances are that some client somewhere was lazy and their code just assumes that bar follows foo just as night follows day. This code will fail if we modify our service to start adding elements in the 'twilight' zone between the foo and bar, despite the fact that we carefully allowed for this in our original schema design. When this happens, you will of course be entitled to feel righteously indignant, and you can point at the schema, whilst shaking your fist at the sky and cursing the relevant client. While this might be a cathartic exercise, it probably won't fix the problem.

In short, just because we can define a service API in a way that specifies where future evolution is allowed doesn't mean all clients will honour that potential for evolution.
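To make the failure concrete, here is a minimal Python sketch of a 'lazy' client next to a more tolerant one. The function names and the extension element (ext:extra in the urn:example:ext namespace) are invented for illustration; nothing here describes a real client.

```python
import xml.etree.ElementTree as ET

# What the service sends today.
OLD = "<stuff><foo>Hello</foo><bar>World</bar></stuff>"

# A hypothetical later version that uses the schema's wildcard slot.
NEW = (
    "<stuff><foo>Hello</foo>"
    '<ext:extra xmlns:ext="urn:example:ext">!</ext:extra>'
    "<bar>World</bar></stuff>"
)


def read_bar_by_position(xml_text):
    """Lazy client: assumes bar is always the second child of the root."""
    root = ET.fromstring(xml_text)
    return root[1].text


def read_bar_by_name(xml_text):
    """Tolerant client: finds bar by name and ignores unknown siblings."""
    root = ET.fromstring(xml_text)
    return root.findtext("bar")
```

Both functions return the same thing for OLD. For NEW, the positional version silently returns the text of the extension element rather than bar, which is arguably worse than an outright crash.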

So it seems like television actually had a distinct advantage here - most of the extensibility points at which new functionality was added took advantage of the fact that old clients couldn't even see the new data. Newer technology was able to move into the gaps that the old technology was unable to exploit. But with XML, there isn't a way of sending data that will be invisible to old clients. (Unless you do something horrible like putting new data in a comment... Mind you, that worked for the early days of client-side scripting in HTML.)

This leads me to think that 'non-breaking changes' to the structure of messages originating from a service may be a pipedream - any change will probably break something. For messages consumed by a service it's different - as the service author you can know the full history of what constituted valid messages as the service evolved. And there you don't need any fancy techniques for designing your schemas to support evolution. If you can design a new schema which accepts both old and new messages, then you're done.
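On the consuming side, this can be as simple as treating anything added later as optional. The sketch below assumes a made-up order request whose first version carried only an item element and whose second version added an optional quantity element; both names are hypothetical.

```python
import xml.etree.ElementTree as ET


def parse_order(xml_text):
    """Accept both the old and the new form of a hypothetical order request."""
    root = ET.fromstring(xml_text)
    if root.tag != "order":
        raise ValueError("unexpected root element: " + root.tag)

    item = root.findtext("item")
    if item is None:
        raise ValueError("missing item element")

    # quantity is assumed to have been added in a later version. Old clients
    # never send it, so fall back to the behaviour they have always had.
    quantity = int(root.findtext("quantity", default="1"))
    return {"item": item, "quantity": quantity}
```

An old client sending just an item and a new client sending item plus quantity are both handled by the same code, because the service author knows the full history of what has ever been valid.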

So perhaps the answer is that for the messages a server accepts, we can use the television approach of gradual changes, each new version extending the previous one, but for messages a server creates, modifying an established message structure is likely to upset existing clients.

(If you also control the clients, then life is much better. You can make sure your clients are robust in the face of change before deploying them. More importantly, you can test any planned changes to the server against all the clients. So if you own both ends, you really do have a good chance of evolving both ends of the connection relatively easily. And if you don't control the clients, but you can compile a comprehensive list of clients for test purposes, then you have some chance of evolving an existing service.)

Why not make the server adapt to the client?

Certain kinds of web services have an advantage over television: they may be able to know things about their client. If a service is generating a message as a response to some incoming request, it might be able to infer something about the client. For example, if the incoming request asks for a feature that was only introduced in a recent version of the service, you know that the client must have been written fairly recently. So you might be able to use a model whereby the response contains whatever the client asked for and nothing more - that allows you to extend the set of features a client can ask for over time. If you only return what you were asked to return, you are unlikely to upset a client by providing it with data it wasn't expecting to see.
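A rough sketch of that 'only return what was asked for' model might look like this. The RECORD contents and the baz field are assumptions made up for the example; a real service would fetch its data from somewhere else.

```python
import xml.etree.ElementTree as ET

# Everything this hypothetical service knows. baz stands in for a field
# that was added in a later version of the service.
RECORD = {"foo": "Hello", "bar": "World", "baz": "added later"}


def build_response(requested):
    """Build a response holding only the fields the client named."""
    root = ET.Element("stuff")
    for name in requested:
        if name not in RECORD:
            raise KeyError("unknown field: " + name)
        ET.SubElement(root, name).text = RECORD[name]
    return ET.tostring(root, encoding="unicode")
```

A client written before baz existed asks for foo and bar and gets exactly the document it always got. Only a client that knows about baz, and so asks for it, ever sees it.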

But even that isn't without its problems. Adapting to particular web service client versions isn't so different from detecting the browser type on a web site and modifying the generated HTML to suit the browser. It's technically possible, and some sites do it, but it's a maintenance headache in practice.

So my current thinking is that it's often best to maintain multiple versions in parallel. And if you've implemented your service endpoints as a facade on top of the underlying service implementation, your 'legacy' support is fairly well isolated from the rest of the system.
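One way to picture the facade arrangement is sketched below, with all names and URL paths invented for illustration. Each version is a thin function over a shared core, and the old one is frozen so that it keeps emitting exactly what its clients have always seen.

```python
import xml.etree.ElementTree as ET


def core_lookup():
    """Stand-in for the real implementation, which is free to grow."""
    return {"foo": "Hello", "bar": "World", "baz": "extra detail"}


def _render(fields, data):
    """Serialise the chosen fields, in the given order, as a stuff document."""
    root = ET.Element("stuff")
    for name in fields:
        ET.SubElement(root, name).text = data[name]
    return ET.tostring(root, encoding="unicode")


def endpoint_v1():
    """Legacy facade: the message structure here never changes."""
    return _render(["foo", "bar"], core_lookup())


def endpoint_v2():
    """Newer facade: free to expose whatever the core has gained since v1."""
    return _render(["foo", "baz", "bar"], core_lookup())


# Hypothetical routing table: one URL per version, both served in parallel.
ENDPOINTS = {"/v1/stuff": endpoint_v1, "/v2/stuff": endpoint_v2}
```

The legacy support amounts to endpoint_v1 and its field list. The core can change freely so long as it can still supply foo and bar.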

IanG on Tap