IanG on Tap

Ian Griffiths in Weblog Form (RSS 2.0)

Blog Navigation

July (2014)

(5 items)

April (2014)

(1 item)

March (2014)

(1 item)

January (2014)

(2 items)

November (2013)

(2 items)

July (2013)

(4 items)

April (2013)

(1 item)

February (2013)

(6 items)

September (2011)

(2 items)

November (2010)

(4 items)

September (2010)

(1 item)

August (2010)

(4 items)

July (2010)

(2 items)

September (2009)

(1 item)

June (2009)

(1 item)

April (2009)

(1 item)

November (2008)

(1 item)

October (2008)

(1 item)

September (2008)

(1 item)

July (2008)

(1 item)

June (2008)

(1 item)

May (2008)

(2 items)

April (2008)

(2 items)

March (2008)

(5 items)

January (2008)

(3 items)

December (2007)

(1 item)

November (2007)

(1 item)

October (2007)

(1 item)

September (2007)

(3 items)

August (2007)

(1 item)

July (2007)

(1 item)

June (2007)

(2 items)

May (2007)

(8 items)

April (2007)

(2 items)

March (2007)

(7 items)

February (2007)

(2 items)

January (2007)

(2 items)

November (2006)

(1 item)

October (2006)

(2 items)

September (2006)

(1 item)

June (2006)

(2 items)

May (2006)

(4 items)

April (2006)

(1 item)

March (2006)

(5 items)

January (2006)

(1 item)

December (2005)

(3 items)

November (2005)

(2 items)

October (2005)

(2 items)

September (2005)

(8 items)

August (2005)

(7 items)

June (2005)

(3 items)

May (2005)

(7 items)

April (2005)

(6 items)

March (2005)

(1 item)

February (2005)

(2 items)

January (2005)

(5 items)

December (2004)

(5 items)

November (2004)

(7 items)

October (2004)

(3 items)

September (2004)

(7 items)

August (2004)

(16 items)

July (2004)

(10 items)

June (2004)

(27 items)

May (2004)

(15 items)

April (2004)

(15 items)

March (2004)

(13 items)

February (2004)

(16 items)

January (2004)

(15 items)

Blog Home

RSS 2.0

Writing

Programming C# 5.0

Programming WPF

Other Sites

Interact Software

Tweaking Async IO in WPF

Friday 15 February, 2013, 05:19 PM

In my previous blog, I compared synchronous and asynchronous methods for reading data out of some web server log files and showing them in a WPF user interface. The first synchronous version took 2.6 seconds to process about 180,000 lines, and the naive asynchronous equivalent took 8.5 seconds. With a subtle change of order, I was able to get these figures down to 1 second for the synchronous code, and 2.8 seconds for the asynchronous code. The trick was to prevent WPF’s data binding system from seeing the data until we had finished loading all of it.

I said last time that this is not a fair comparison. This test flatters the synchronous code—it takes 1 second to load the data, but at that point, no data binding work has occurred at all, so there will be more processing required before anything appears on screen. (With the asynchronous version, some of that work is already done by the time the load completes.) Unfortunately, it’s not easy to measure how long that deferred work takes. Even so, measuring manually with a stopwatch, the synchronous version doesn’t seem to be taking significantly longer than a second. (In any case, trying to show 180,000 lines in a ListBox is pretty bad UI design. In practice you’d want to do something to reduce that, so in a better-designed front end, the binding overhead would likely be lower anyway.) So the numbers here are in the right ballpark—it’s a little unfair, but only a little. The asynchronous version is almost three times slower in this example.

Fortunately, we can improve this. We just need to understand why asynchrony is costing us so much here.

Any time you use await with an asynchronous operation that cannot complete immediately, the containing async method will relinquish the thread. (That’s the whole point of using async and await.) When the operation completes, the method needs to regain control of the thread so that it can continue, and this is where we’re paying a heavy price. In my example, the asynchronous method runs on a WPF dispatcher thread, so regaining control will involve posting a message that the WPF message pump will handle. (In other words, every time the await keyword is made to wait, we incur the same overhead as the SynchronizationContext class’s Post method would.)

Remember, I’m processing about 180,000 lines of text here, and the code executes an await expression for each of those. It turns out that roughly 40,000 of those do not complete immediately, so the main reason our fastest asynchronous code takes about 1.8 seconds longer than the fastest synchronous code is that it’s posting 40,000 messages through WPF’s dispatcher loop.

(As I mentioned last time, I’m running these tests on a pretty old system. I built my current desktop in 2008. You would probably see faster results on something newer.)

The proportion of calls to ReadLineAsync that cannot complete immediately is a function of the average line length and the size of the internal buffers used by StreamReader. I can’t change what’s in the files, so the way to reduce this overhead is to use a larger buffer. Here’s a change to the line that creates the StreamReader:

using (var reader = new StreamReader(logfile, Encoding.UTF8, true, 65536))

That gets the number of non-immediate completions from about 40,000 down to roughly 600. This reduces the execution time from 2.8 to 1.2 seconds. And if we apply this tweak to the naive asynchronous version, it comes down from 8.5 to 3.2 seconds. (This is because we’re now only asking WPF to process 600 batches of about 300 lines each instead of 40,000 batches of four or five lines each. So the data binding overhead, while still high, is now a lot better.) This means we can get the benefits offered by the original naive version—seeing useful data immediately and remaining responsive—while only have to wait for 3.2 seconds, just a little longer than our original synchronous attempt.

We could speed things up further still by reading the entire file in one go. This would minimize the IO overheads, but it would introduce two problems. First, it would increase memory usage, and if you’re working with very large log files, that might be a problem. Second, it would prevent us from showing any data until we have all the data, so we wouldn’t be able to provide immediate results, either with synchronous or asynchronous code.

In case you’re wondering, the synchronous version also benefits from this larger buffer—it comes down to 0.8 seconds. But the gap is narrowing—at 1.2 seconds, the asynchronous version is only giving away 0.4 seconds, which is not a massive price to pay for remaining responsive. Nonetheless, it’s still taking 1.5 times as long, which might offend you. (And if you were processing larger volumes of data—400MB of log files rather than 40MB, for example—the extra delay might be more significant.) So you might wonder whether it would be better to attempt to remove the WPF dispatcher from the picture entirely. One way to do this would be to use multithreading instead of an asynchronous method. I’ll show how that works out next time.

Copyright © 2002-2013, Interact Software Ltd. Content by Ian Griffiths. Please direct all Web site inquiries to webmaster@interact-sw.co.uk