IanG on Tap

Ian Griffiths in Weblog Form (RSS 2.0)

Locking, Anonymous Delegates, and Interrupts

Tuesday 17 August, 2004, 09:53 AM

Jason Whittington recently posted a code sample illustrating an alternative way of managing access to data from multiple threads. While it is based around standard .NET locking primitives under the covers, it uses a rather different programming style.

Rather than the usual acquire lock...do work...release lock in finally block approach, he uses a delegate-based approach with a callback - something like:

sharedData.DoModify(delegate
{
    sharedData.X++;
});

His code is just a sketch rather than a complete example. (He points out a few of the problems himself in the comments.) But it illustrates the technique, and frankly, it looks a bit odd. However, I found it interesting because this is very reminiscent of the way that Windows kernel mode device drivers access data structures that may also be accessed by a device's interrupt service routine. While most locking in kernel mode is done with the usual acquire, do, release pattern, things get complicated when interrupt service routines enter the picture, especially on multiprocessor systems. Indeed kernel mode did at one stage appear to have an answer to the question someone posted in Jason's comments:

"Whats wrong with the 'using' keyword and a stack based (struct) solution."

The using statement approach is just a variant on the linear acquire-do-release approach, one I have infamously written about before. So what's wrong with it? Well life is more complex in kernel mode...

(Note for those unfamiliar with interrupts. The various bits of hardware in your computer sometimes need to get the OS's attention. Network cards indicate when incoming data arrives and when they have finished sending outgoing data, the keyboard indicates when keys have been pressed, the system timer raises regular notifications etc. These devices all notify the OS by raising an 'interrupt', which is a special kind of electronic signal (some of the pins on a PCI connector are dedicated to this purpose). All of the interrupt signals from the various peripherals on the computer are wired into a device called the 'interrupt controller', and this is directly connected to the CPU. When a device raises an interrupt, then unless the interrupt controller has been told to ignore that device, it raises an electronic signal to the CPU to tell it about the interrupt. This causes the CPU to stop whatever it is doing, and call the operating system's interrupt handler - a special piece of kernel mode code that runs when an interrupt occurs. (Actually, depending on the kind of CPU and the type of interrupt controller, the interrupt controller might be able to get it to run different routines for different devices. However, as far as I'm aware Windows does more or less the same thing for all interrupts.) The OS will then interrogate the interrupt controller to find out which device caused the interrupt. It then calls a special routine supplied by the relevant device driver called an Interrupt Service Routine (ISR). The job of this code is to acknowledge the device, thus causing it to stop raising the interrupt. The ISR might also do any other work required to handle the notification from the device, but usually the driver simply schedules some work to run at a lower priority to complete the handling later - ISRs are supposed to run quickly so they usually do as little work as possible. 
Multi-CPU systems are slightly more complex, in that they have an interrupt controller that decides which of the various CPUs it will send the interrupt to, but otherwise it looks much the same.)

The big challenge with interrupt service routines is that they run asynchronously. And I don't mean the warm gentle fluffy kind of asynchronous that you may be used to with the BeginInvoke/EndInvoke style of programming used by various bits of the .NET framework. No, this is the hardcore kind, as in "this piece of code could run at any moment on any thread." Interrupt service routines have to run really, really quickly - if they don't deal with the hardware that raised the interrupt promptly, everything else in the system may grind to a halt for a bit. We certainly don't have time to wait for the OS scheduler to kick in - we might be waiting for whole milliseconds. In any case, the scheduler typically won't be able to run until the interrupt has been handled, and nor will anything else apart from interrupt code for higher priority interrupts, so it's important for the ISR to run as soon as possible.

Because of this, interrupts don't run on any particular thread - they run on whichever thread happened to be active when the interrupt occurred. (This means that if your process happens to be running when an interrupt is fielded, the CPU time spent handling the interrupt comes out of your thread's quantum.) And unless interrupts have been disabled temporarily, the CPU will handle any interrupt the instant the interrupt controller raises the relevant signal. This means that your program could be interrupted at pretty much any point at all - the only guarantee of synchronization you get is that the CPU won't leave any instructions half-executed - it gets to the end of whatever instruction it is on now. (Actually with modern CPUs it is a bit more complex than that because your average CPU is usually executing several instructions at any given instant. But it will pick a point in the instruction stream, and finish all the instructions before that point. The CPU endeavours to make it look like it was executing your instructions sequentially whenever an interrupt occurs...)

The onus is entirely on the interrupt handling code to run in a way that does not disrupt the active thread. The running thread shouldn't really be able to tell that an interrupt occurred. (It might be able to see the side effects - if it is looking at the CPU timestamp, it'll see that a lot of time elapsed... And of course it will be able to see that IO operations have completed.) The main way of achieving this is for the ISR to keep itself to itself. The OS does the tricky work of preserving the CPU's register values (apart from the floating point ones - you aren't allowed to do floating point arithmetic in an ISR) so that the main thread doesn't find its world changes all of a sudden. So if your ISR makes sure it only uses local (stack-based) variables, then it won't interfere with any other code. But how is the ISR supposed to have any useful effect on anything else if it can't touch state outside of its locals?

In practice, there is usually some data shared between the ISR and the rest of the device driver. How on earth are we to synchronize access to this? We can't just use a mutex or critical section - those are scheduler primitives, and the ISR transcends the scheduler. That's OK - the kernel has special locking primitives called spinlocks which don't get the scheduler involved. But these turn out not to help in this particular scenario - consider what happens if the ISR happens to run on a thread that is modifying the shared data:

// Pseudo-code, obviously - we can't
// actually write drivers in C#... (today)
spinlock(driverData)
{
    int total = driverData.count;
    // What if an interrupt occurs here?
    total += 1;
    driverData.count = total;
}

(And don't think you can avoid the problem by rewriting this as driverData.count++ - it'll compile down to the same thing, and not all processors can execute that as a single instruction.) If an interrupt occurs at the line indicated, then if the ISR attempts to acquire the driver spinlock, we have a problem - this thread already holds the spinlock and won't let it go until it is able to proceed. But of course it cannot proceed until the ISR is finished.
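To make the deadlock concrete, here is a minimal user-mode sketch in C. It is only an analogy: the toy spinlock, the `driver_data` struct, and the `fake_isr` function are all hypothetical names invented for illustration, and the "interrupt" is simulated by an ordinary function call made in the middle of the critical section. The ISR stand-in uses a single try-acquire rather than a real spin, so the example reports the problem instead of hanging.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy, non-reentrant spinlock built on a C11 atomic flag (hypothetical). */
typedef struct {
    atomic_flag held;
} toy_spinlock;

/* Try once to take the lock; true on success.
   A real spinlock would loop on this until it succeeds. */
static bool toy_try_acquire(toy_spinlock *lock) {
    return !atomic_flag_test_and_set_explicit(&lock->held, memory_order_acquire);
}

static void toy_release(toy_spinlock *lock) {
    atomic_flag_clear_explicit(&lock->held, memory_order_release);
}

typedef struct {
    toy_spinlock lock;
    int count;
} driver_data;

static void driver_data_init(driver_data *d) {
    atomic_flag_clear(&d->lock.held);
    d->count = 0;
}

/* Stand-in for the ISR. Returns false where a real ISR would spin forever. */
static bool fake_isr(driver_data *d) {
    if (!toy_try_acquire(&d->lock)) {
        return false; /* lock is held by the code we interrupted */
    }
    d->count += 100;
    toy_release(&d->lock);
    return true;
}

/* The read-modify-write from the text, with a simulated interrupt in the
   middle. Returns true if the ISR would have deadlocked. */
static bool increment_with_interrupt(driver_data *d) {
    bool got = toy_try_acquire(&d->lock);
    assert(got);
    (void)got;

    int total = d->count;
    bool isr_ran = fake_isr(d); /* "interrupt" arrives here */
    total += 1;
    d->count = total;

    toy_release(&d->lock);
    return !isr_ran;
}
```

Calling `increment_with_interrupt` on a freshly initialised `driver_data` reports that the ISR could not get the lock. With a real spinlock the ISR would never return, and the interrupted code would never get the chance to release the lock.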

In the old days, the standard fix for this was to disable interrupts before doing anything to data structures that would be accessible to the ISR. This is pretty straightforward on most CPUs - for example, the x86 has a single instruction for disabling interrupts. (You can only use it in kernel mode of course.) That way, you can be sure that the ISR won't be able to run while you use the shared data. If an interrupt happens to be raised while interrupts are disabled, the CPU will handle the interrupt as soon as you reenable interrupts.

The problem with that is that it doesn't work on multi-CPU systems. The instruction for disabling interrupts only has any effect on the CPU you execute it on - other CPUs in the system are still able to handle interrupts. So if the interrupt controller happens to decide that a different CPU will field the interrupt, the fact that you disabled interrupts on the CPU you're running on doesn't help - the ISR will still be able to run on the other processor.

So what you need is a combination of a spinlock and disabling interrupts. Which is more or less how it works internally. The OS maintains a private spinlock in some undocumented place for each interrupt service routine. During normal handling of interrupts, the OS acquires this spinlock before running the ISR.

How does this help us? Well, if we want to write some code in our device driver that doesn't live in the ISR, but which needs to use data that is also used by the ISR, we can take advantage of the same service. There is a kernel API called KeSynchronizeExecution. This takes a function pointer, and an interrupt object. It says to the OS "Please call this function in the same way you call my ISR". So the OS will disable interrupts, acquire the ISR spinlock, and then call you back. (Strictly speaking, it disables handling of interrupts with the same or lower priority than your device. This is actually slightly more involved than the one-instruction approach of disabling all interrupts, but is necessary if you want to support prioritisation of interrupts.)

In other words it looks a lot like Jason's example.
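The shape of that callback style can be sketched in user-mode C. This is an analogy, not driver code: `sim_interrupt`, `sim_synchronize_execution`, and `bump_count` are hypothetical names, a plain boolean stands in for masking interrupts, and a C11 atomic flag stands in for the private ISR spinlock. The one detail borrowed from the real API is that the callback returns a boolean, which is passed back to the caller.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-mode stand-in for a kernel interrupt object. */
typedef struct {
    atomic_flag isr_lock; /* plays the role of the per-ISR spinlock */
    bool masked;          /* plays the role of "interrupts disabled" */
} sim_interrupt;

/* Callback type: does its work with the lock held, returns a status. */
typedef bool (*sync_routine)(void *context);

static void sim_interrupt_init(sim_interrupt *intr) {
    atomic_flag_clear(&intr->isr_lock);
    intr->masked = false;
}

/* "Please call this function in the same way you call my ISR":
   mask, acquire, call back, release, unmask. */
static bool sim_synchronize_execution(sim_interrupt *intr,
                                      sync_routine routine,
                                      void *context) {
    assert(intr != NULL && routine != NULL);

    intr->masked = true;
    while (atomic_flag_test_and_set_explicit(&intr->isr_lock,
                                             memory_order_acquire)) {
        /* spin until the lock is free */
    }

    bool result = routine(context);

    atomic_flag_clear_explicit(&intr->isr_lock, memory_order_release);
    intr->masked = false;
    return result;
}

/* Example shared state and a callback that modifies it. */
typedef struct {
    int count;
} device_state;

static bool bump_count(void *context) {
    device_state *state = context;
    state->count += 1;
    return true;
}
```

A caller would write `sim_synchronize_execution(&intr, bump_count, &state)`. The caller never touches the lock itself, which is exactly the property that makes the lock impossible to forget to release.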

However, a lot of people in the device driver writing industry asked the same question that was asked in Jason's comments: why can't we just have straightforward acquire, do, release semantics? The main reason you couldn't do this was simply that you couldn't get hold of the spinlock directly. For a long time, Microsoft resisted requests to expose it. I think the response wasn't all that far from "there is no spinlock, ISR synchronization just works by magic". And there may well have been good reasons for keeping it as magic - back when Windows NT first shipped, multi-CPU machines were exotic beasts that all shipped with their own HAL (Hardware Abstraction Layer). It's not out of the question that the synchronization of interrupts might have been implemented entirely differently on some of these machines - there might genuinely not have been an ISR spinlock on all architectures.

However, Microsoft eventually acquiesced to the developers' requests. In Windows XP, they introduced a new pair of APIs: KeAcquireInterruptSpinLock and KeReleaseInterruptSpinLock. These do the relevant masking and unmasking of interrupts and also acquire and release the relevant spinlock for you. This means that even in kernel mode, you can now use the acquire, do, release approach, even for data shared between your ISR and lower priority code.
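For contrast with the callback style, here is a rough user-mode sketch in C of the linear version. Again this is only an analogy with hypothetical names (`toy_interrupt`, `toy_acquire_interrupt_lock`, `toy_release_interrupt_lock`, `increment_shared`). It assumes a simple integer priority level in place of real interrupt masking: acquiring raises the level to the device's level and hands back the previous one, and releasing restores it, which mirrors the way the real pair passes the old level from the acquire call to the release call.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for an interrupt object with a priority level. */
typedef struct {
    atomic_flag isr_lock; /* plays the role of the per-ISR spinlock */
    int device_level;     /* priority of this device's interrupt */
    int current_level;    /* simulated masking level of the CPU */
} toy_interrupt;

static void toy_interrupt_init(toy_interrupt *intr, int device_level) {
    atomic_flag_clear(&intr->isr_lock);
    intr->device_level = device_level;
    intr->current_level = 0;
}

/* Raise the level (masking same and lower priority interrupts), then
   take the lock. Returns the previous level for the matching release. */
static int toy_acquire_interrupt_lock(toy_interrupt *intr) {
    assert(intr != NULL);
    int old_level = intr->current_level;
    intr->current_level = intr->device_level;
    while (atomic_flag_test_and_set_explicit(&intr->isr_lock,
                                             memory_order_acquire)) {
        /* spin until the lock is free */
    }
    return old_level;
}

/* Drop the lock, then restore the level saved by the acquire call. */
static void toy_release_interrupt_lock(toy_interrupt *intr, int old_level) {
    atomic_flag_clear_explicit(&intr->isr_lock, memory_order_release);
    intr->current_level = old_level;
}

/* Acquire, do, release. Returns the level observed inside the lock. */
static int increment_shared(toy_interrupt *intr, int *shared_count) {
    int old_level = toy_acquire_interrupt_lock(intr);
    *shared_count += 1;
    int level_inside = intr->current_level;
    toy_release_interrupt_lock(intr, old_level);
    return level_inside;
}
```

The work now sits inline between the two calls rather than in a separate function, at the cost that the caller is responsible for pairing every acquire with a release.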

So if even kernel mode has abandoned the 'call me back when you have the lock' approach, does this mean that Jason's example has no value? Actually no, there is an important class of problems where this approach is necessary. Sometimes there really is no spinlock. But this entry is quite long enough now, so I'll cover that in the next instalment.

Copyright © 2002-2013, Interact Software Ltd. Content by Ian Griffiths. Please direct all Web site inquiries to webmaster@interact-sw.co.uk