Batch Updates with INotifyCollectionChanged

Friday 22 February, 2013, 06:00 PM

Last time, I promised the conclusion to my series exploring the performance of asynchronous and multithreaded techniques for loading and displaying moderately large volumes of data in WPF applications. However, you’ll have to wait a little longer, because this is a last minute bonus entry to the series.

I received email from a couple of people making the same suggestion. (One came from Samuel Jack. The other guy chose to remain anonymous, and I haven’t heard back yet.) The suggestion was that I might be able to improve performance further with a custom implementation of INotifyCollectionChanged, the interface that ObservableCollection<T> implements to notify WPF data binding when items are added, removed, or otherwise changed.

Recall that in the previous blog entry, I pondered how fast we could possibly go, with a view to working out whether it was worth expending any more effort. I attempted to measure the minimum time required to load 180,000 items into a list box given the following constraints: items must appear as soon as some are available, and the UI must remain responsive throughout. I concluded that on my system the best possible time was 2.2 seconds. Since my implementation with real data was taking 2.3 seconds, it didn’t seem to be worth looking to go any faster.

However, my conclusion relied on a couple of implicit assumptions. First, I was assuming that we’re using a databound ItemsControl to display the data. Obviously, it would be possible to make things faster by building something completely bespoke, but at that point you’re throwing out a large part of the benefit WPF has to offer. Second, and more subtly, I was assuming that we would be using ObservableCollection<T> as our data source. To be honest, I didn’t even give this much thought, because .NET’s collection classes are pretty good, and there’s usually not much to be gained from trying to roll your own.

As it turns out, I was right—ultimately we can’t gain much with a custom list. But I have to admit that I was right by accident, because I hadn’t originally explored this option. And it’s interesting to try this approach, because at first glance it presents a tantalising opportunity for improvement. Unfortunately, there’s a problem: although a custom collection can deliver much faster performance, there are issues with the resulting user experience.

Batching Collection Change Notifications

What might we gain with a custom collection implementation? The most obvious missing feature in ObservableCollection<T> is support for any kind of batch update. It does not offer an equivalent to the List<T> class’s AddRange method, nor does it offer any sort of BeginUpdate/EndUpdate idiom of the kind available with the classic Win32 list controls. If a custom list could provide this while still being able to deliver change notifications, perhaps we could reduce the overhead.

Here’s my attempt at an observable collection that supports batch updates:

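using System;
using System.Collections.ObjectModel;
using System.Collections.Specialized;
using System.ComponentModel;

public class BatchingObservableCollection<T> : ObservableCollection<T>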
{
    private bool _inBatchUpdate;

    public IDisposable BatchUpdate()
    {
        if (_inBatchUpdate)
        {
            throw new InvalidOperationException("Batch update already in progress");
        }
        _inBatchUpdate = true;
        return new UpdateDisposable(this);
    }

    protected override void OnCollectionChanged(NotifyCollectionChangedEventArgs e)
    {
        if (!_inBatchUpdate)
        {
            base.OnCollectionChanged(e);
        }
    }

    protected override void OnPropertyChanged(PropertyChangedEventArgs e)
    {
        if (!_inBatchUpdate)
        {
            base.OnPropertyChanged(e);
        }
    }

    private void EndBatch()
    {
        if (_inBatchUpdate)
        {
            _inBatchUpdate = false;
            OnPropertyChanged(new PropertyChangedEventArgs("Count"));
            OnPropertyChanged(new PropertyChangedEventArgs("Item[]"));
            OnCollectionChanged(new NotifyCollectionChangedEventArgs(
              NotifyCollectionChangedAction.Reset));
        }
    }

    private class UpdateDisposable : IDisposable
    {
        private readonly BatchingObservableCollection<T> _parent;

        public UpdateDisposable(BatchingObservableCollection<T> parent)
        {
            _parent = parent;
        }

        public void Dispose()
        {
            _parent.EndBatch();
        }
    }
}

This provides a BatchUpdate method. It returns an IDisposable, so the idea is that you’d write a using block. (That’s why the EndBatch is private—it gets called when you Dispose the object returned by BatchUpdate.) If code inside the using block modifies the collection, it won’t raise any change notifications. At the end of the block, Dispose will be called, at which point the collection will re-enable the ability to raise events for future changes. It also has to raise some events immediately so that data binding will know that the list has changed. My code performs some property change notifications for correctness, but those aren’t the main point of interest here—the key is the collection change notification of type Reset.
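To make the using-block idea concrete, here is a minimal usage sketch. It assumes the BatchingObservableCollection<T> class shown above is compiled into the same project; the BatchUpdateDemo class and its CountNotifications method are hypothetical names invented purely for illustration. It counts how many CollectionChanged events a subscriber sees while items are added inside a batch.

```csharp
using System;
using System.Collections.Specialized;

// Hypothetical demo class; relies on BatchingObservableCollection<T> from the listing above.
public static class BatchUpdateDemo
{
    // Adds itemCount items inside a single batch and returns the number of
    // CollectionChanged events observed by a subscriber.
    public static int CountNotifications(int itemCount)
    {
        var items = new BatchingObservableCollection<int>();
        int notifications = 0;
        NotifyCollectionChangedAction? lastAction = null;

        items.CollectionChanged += (sender, e) =>
        {
            notifications++;
            lastAction = e.Action;
        };

        // Events are suppressed for the duration of the using block.
        using (items.BatchUpdate())
        {
            for (int i = 0; i < itemCount; i++)
            {
                items.Add(i);
            }
        }
        // Dispose has now run, raising one Reset notification for the whole batch.

        Console.WriteLine("Items: {0}, notifications: {1}, last action: {2}",
            items.Count, notifications, lastAction);
        return notifications;
    }

    public static void Main()
    {
        CountNotifications(1000);
    }
}
```

However many items go in during the batch, the subscriber should see a single Reset rather than one Add per item.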

(Samuel Jack’s implementation was slightly different. He implemented an AddRange method which bypassed the public Add method, adding the items directly to the underlying collection. But it ended up raising exactly the same events at the end to notify data binding of the changes.)
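For comparison, an AddRange along the lines described might look something like the following. This is my own rough sketch of the general idea, not Samuel Jack's actual code, and the class name RangeObservableCollection is made up for illustration. It relies on the protected Items property that ObservableCollection<T> inherits from Collection<T>.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Collections.Specialized;
using System.ComponentModel;

// Hypothetical class name; a sketch of the AddRange idea only.
public class RangeObservableCollection<T> : ObservableCollection<T>
{
    public void AddRange(IEnumerable<T> items)
    {
        if (items == null)
        {
            throw new ArgumentNullException("items");
        }

        // Honour the base class's guard against modifying the collection
        // from inside a CollectionChanged handler.
        CheckReentrancy();

        // Adding to the underlying list directly bypasses InsertItem,
        // so no per-item notifications are raised.
        foreach (T item in items)
        {
            Items.Add(item);
        }

        // Raise the same three events as EndBatch in the batching collection.
        OnPropertyChanged(new PropertyChangedEventArgs("Count"));
        OnPropertyChanged(new PropertyChangedEventArgs("Item[]"));
        OnCollectionChanged(new NotifyCollectionChangedEventArgs(
            NotifyCollectionChangedAction.Reset));
    }
}
```

Calling AddRange with a chunk of items has much the same effect as wrapping a series of Add calls in a batch: data binding sees one Reset per chunk.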

This use of the Reset action is at the heart of why this technique is problematic. The INotifyCollectionChanged interface does have some support for reporting bulk changes. For example, you can provide an Add notification for a range of items. Unfortunately, if you try this from a collection bound to a WPF items control, WPF throws an exception back at you, complaining that it doesn’t support range actions. The only bulk operation it understands is a Reset, which is supposed to signify that the entire collection just changed. That’s definitely overkill here, but WPF seems to force us to use either that or per-item Add events. (And per-item events are precisely what we’re trying to avoid, making Reset our only option.)
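To illustrate what a ranged Add notification looks like, here is a small sketch using the NotifyCollectionChangedEventArgs constructor that takes a list of changed items and a starting index. The RangeAddCollection and RangeAddDemo names are hypothetical. An ordinary event handler copes with this perfectly well; the exception described above only appears once a WPF items control is bound to the collection, which this console sketch deliberately avoids.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Collections.Specialized;
using System.ComponentModel;

// Hypothetical class: reports an entire range with a single Add notification.
public class RangeAddCollection<T> : ObservableCollection<T>
{
    public void AddRange(IEnumerable<T> items)
    {
        CheckReentrancy();

        int startIndex = Items.Count;
        var added = new List<T>(items);
        if (added.Count == 0)
        {
            return;
        }

        foreach (T item in added)
        {
            Items.Add(item);
        }

        OnPropertyChanged(new PropertyChangedEventArgs("Count"));
        OnPropertyChanged(new PropertyChangedEventArgs("Item[]"));

        // One Add event describing every new item and where the range starts.
        OnCollectionChanged(new NotifyCollectionChangedEventArgs(
            NotifyCollectionChangedAction.Add, (IList)added, startIndex));
    }
}

public static class RangeAddDemo
{
    public static void Main()
    {
        var items = new RangeAddCollection<string>();
        items.CollectionChanged += (sender, e) =>
            Console.WriteLine("{0}: {1} items at index {2}",
                e.Action, e.NewItems.Count, e.NewStartingIndex);

        items.AddRange(new[] { "a", "b", "c" });
        // prints "Add: 3 items at index 0"
    }
}
```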

Update Performance

With this custom collection in place, I first tried just adding faked items to get an idea of the best possible raw performance. Remember, when I performed a similar test last time with an ObservableCollection<T>, adding 180,000 items took 1.5 seconds on my system, and yielding the dispatcher thread just often enough to remain responsive pushed that up to 2.2 seconds. With this new collection it took just 0.4 seconds, a considerable improvement. And at first glance, it appears to work: it displays the items correctly, starts showing them immediately, and remains responsive.

So then I tried loading real data. I used the Rx-based chunking approach shown last time with my custom collection class. This took 1 second. Remember that the best case for simply loading the data off disk and processing it (without displaying anything) was 0.8 seconds. And now we’re able to display some data immediately, remaining responsive while the rest of the data loads, and the entire process is only 0.2 seconds slower than the raw work of reading and parsing the data. And this is well over twice as fast as the previous solution.

This would be excellent were it not for the problems it causes.

UI Problems

The downside with this technique is that it makes certain aspects of the UI flaky. It’s not as bad as I had initially thought it would be—I thought the Reset change notification might cause WPF to forget which item was selected, and to lose track of the list’s scroll position. (Repopulating a list from scratch will normally do that.) In fact, it was smart enough to keep track of both things. But there were more subtle problems.

I noticed that if I clicked on elements in the list while it was being updated, sometimes the click would have no effect. My hypothesis was that WPF was destroying and recreating the list elements each time the data source reported a Reset. To test this, I wrote a user control which picks a random background colour when constructed, and used this in the data template for my list box. And sure enough, all the list box items continuously changed colour for as long as my program updated the list, confirming my suspicion that WPF was rebuilding all the visible elements each time the list raised a Reset notification.
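A diagnostic control of this kind can be very small. The following is a sketch of the idea rather than the exact control I used, and the name RandomColourMarker is made up for illustration. It needs a WPF project, and it assumes the control is only ever constructed on the UI thread, which is why sharing a single Random instance is acceptable here.

```csharp
using System;
using System.Windows.Controls;
using System.Windows.Media;

// Hypothetical diagnostic control: each new instance picks its own background,
// so a change of colour on screen means WPF constructed a fresh element.
public class RandomColourMarker : UserControl
{
    // Shared generator; assumes construction happens on the UI thread only.
    private static readonly Random Rng = new Random();

    public RandomColourMarker()
    {
        var rgb = new byte[3];
        Rng.NextBytes(rgb);
        Background = new SolidColorBrush(Color.FromRgb(rgb[0], rgb[1], rgb[2]));
    }
}
```

Put an instance of this in the list box's data template, and any item whose visuals get rebuilt will almost certainly come back in a different colour.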

Without the background colour hack, you wouldn’t normally see that anything was changing—each time the list raised a Reset change notification, WPF would tear down all the list box items and load new ones, but because they were all representing the same underlying data items as before, and because it carefully preserved the selection index and scroll offset, the new items looked identical to the ones they replaced. And because WPF is careful about how and when it updates the screen you don’t see any flickering—the new items seamlessly replace the old ones, and since they will normally look identical, you don’t see it happening. Normally, the only evidence is when they start behaving oddly when you interact with them.

Once we know this is happening, it’s not surprising that clicks are going missing. If you happen to click on an item just as it’s about to be replaced with a new, identical-looking one, that click never gets handled. It was destined for an item that was removed before it had a chance to handle the input, and there’s no mechanism for reassigning that mouse input to the replacement item.

In my tests, the list box items have been just plain text, but in real applications, I often put more complex content in items controls. Clearly, if items had any children that could independently receive the focus, this continuous re-building of the items would mess that up, repeatedly resetting the focus. (And in case you’re wondering, enabling container recycling on the list control does not appear to help. With my background colour hack, I still see the colours changing with recycling enabled. And that’s to be expected—reusing the containing ListBoxItem seems unlikely to fix anything when the actual content hosted by that container is repeatedly being destroyed and recreated.)

So although a custom list class with batch change support looks good at first glance, it disappoints in the long run. It offers great performance, but it causes problems when the user tries to interact with the list items, which rather reduces the value of remaining responsive to user input while the list updates. So in practice, the technique shown last time continues to be our best bet.
