IanG on Tap

Ian Griffiths in Weblog Form

Dare Obasanjo on C# Anonymous Types

Friday 4 January, 2008, 10:30 AM

Dare Obasanjo recently wrote a blog entry comparing some language features commonly associated with ‘dynamic’ languages with C# 3.0 equivalents. Towards the end is a section called “Python vs. C# 3.0: Tuples and Dynamic Typing vs. Anonymous Types and Type Inferencing.” In it, he complains that he ended up needing to use nominal types, instead of anonymous types. For example, he wrote:

var vote = new Vote()
{
  Weight = voteFunc(item),
  Item = item,
  FeedTitle = feedTitle
};

This required him to define the Vote class somewhere. He had been hoping to use C#’s anonymous types, which would have removed the need to define the Vote class explicitly, letting him write:

var vote = new
{
  Weight = voteFunc(item),
  Item = item,
  FeedTitle = feedTitle
};

He couldn’t do this because his code structure prevented the anonymous type flowing to where he needed it. The first part of his code was a loop which added a bunch of these votes to a list which he later wanted to iterate through. The anonymous type was buried in the nested scope of the first loop, and was therefore inaccessible in the second loop. The fundamental problem here is that C# type inference for ‘var’ variables occurs at the point of declaration. The compiler isn’t prepared to wait around and see what you do with the variable – if it can’t infer the type at the point of declaration it gives up with an error.
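Here is a minimal sketch of that scoping problem. The names are hypothetical: `ScopeDemo`, `DescribeVotes` and `Weigh` are invented for illustration, and plain strings with a length-based weight stand in for Dare's feed items and `voteFunc`. Declaring the list ahead of the loop would mean naming its element type, which an anonymous type doesn't have. Letting a single expression produce the whole list gives `var` something to infer from at the point of declaration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ScopeDemo
{
    // Hypothetical stand-in for voteFunc: the weight is just the string length.
    static int Weigh(string item) { return item.Length; }

    public static List<string> DescribeVotes(IEnumerable<string> items)
    {
        // Does not compile: 'var' needs an initializer to infer from.
        //   var votes;
        // Nor can we write List<...> here, because the anonymous type
        // has no name to put between the angle brackets.

        // Works: one expression produces the whole list, so the compiler
        // infers a list of the anonymous type right at the declaration.
        var votes = items
            .Select(item => new { Weight = Weigh(item), Item = item })
            .ToList();

        // A later loop in the same method can still use the properties.
        var result = new List<string>();
        foreach (var vote in votes)
        {
            result.Add(vote.Item + ":" + vote.Weight);
        }
        return result;
    }

    public static void Main()
    {
        foreach (var line in DescribeVotes(new[] { "a", "bcd" }))
        {
            Console.WriteLine(line);
        }
    }
}
```

The anonymous type still can't leave the method under its own name, which is why this sketch returns plain strings.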

I thought I’d take a crack at re-organizing his code so that it can use anonymous types. Note that this is not meant to be a criticism of Dare’s code. I just thought it would be interesting to see what it takes to use anonymous types here.

The Code

I’m not completely sure I’ve understood Dare’s intent with the code – in my tests my code does the same as his, but what it does doesn’t make much sense to me. The code generates a list of URLs ordered by a score. It’s the score I don’t quite understand – it seems to be calculated by finding all the feed items that have an outgoing link to the URL, finding the lowest score amongst those items per distinct feedTitle, and then adding those feedTitle-specific lowest scores together. I’m not sure why that’s useful, and in Dare’s example code, the feedTitle appears never to change. So in practice, the score for a URL is that of the lowest-scoring FeedItem that has that URL as an outgoing link.

But my goal was to replicate Dare’s logic, whether or not I understand it. So even though the sum over per-feedTitle lowest scores is redundant (it always sums over just a single score) I’ve replicated the logic nonetheless – I’m assuming that his code is a simplified version of something in which that logic matters. I’ve also kept the names of the collections the same, and I’ve preserved a couple of the comments, in the hope of making the two examples easier to compare. Here it is, rewritten so that we get to use anonymous types:

// calculate vote for each outgoing url
var all_links = from item in items
                from url in item.OutgoingLinks.Keys 
                group item by url into itemUrlGroup
                select new
                {
                  Url=itemUrlGroup.Key,
                  Votes=from item in itemUrlGroup
                        select new
                        {
                          Weight=voteFunc(item),
                          Item=item,
                          FeedTitle=feedTitle
                        }
                };

// tally the votes
var weighted_links = from link_n_votes in all_links
                     select new
                     {
                       Url=link_n_votes.Url,
                       Score=(from vote in link_n_votes.Votes
                              group vote by vote.FeedTitle into feed
                              select feed.Min(vote => vote.Weight)
                             ).Sum()
                     } into weighted_link
                     orderby weighted_link.Score descending
                     select weighted_link;

This allows us to get rid of the two nominal types in Dare’s example: Vote and RankedLink. (Types whose definitions he doesn’t show in the example, incidentally, just in case you were trying to find them. If you’re playing along at home, you need to write your own version of these classes if you want to run Dare’s code.)

[Update: Dare pointed out to me that the full code in context is at http://www.25hoursaday.com/weblog/2008/01/02/AMemetrackerInC30.aspx]

This code uses anonymous types instead, just as Dare wanted.

So C#’s static typing has not been an obstacle to using anonymous types here, but we’ve had to structure things pretty differently to get there.
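If you want to run the two queries yourself, something like the following harness should do. Everything around the queries is an assumption of mine rather than Dare's actual code: the `FeedItem` class models only the members the queries touch (and guesses that `OutgoingLinks` is a dictionary keyed by URL), while `MemeTrackerSketch`, `Rank`, `SampleItems` and `Links` are hypothetical names, and the sample data is made up.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for Dare's feed item type.
public class FeedItem
{
    public string Title { get; set; }
    public double Score { get; set; }
    public Dictionary<string, string> OutgoingLinks { get; set; }
}

public static class MemeTrackerSketch
{
    public static List<KeyValuePair<string, double>> Rank(
        IEnumerable<FeedItem> items, string feedTitle, Func<FeedItem, double> voteFunc)
    {
        // calculate vote for each outgoing url
        var all_links = from item in items
                        from url in item.OutgoingLinks.Keys
                        group item by url into itemUrlGroup
                        select new
                        {
                            Url = itemUrlGroup.Key,
                            Votes = from item in itemUrlGroup
                                    select new
                                    {
                                        Weight = voteFunc(item),
                                        Item = item,
                                        FeedTitle = feedTitle
                                    }
                        };

        // tally the votes
        var weighted_links = from link_n_votes in all_links
                             select new
                             {
                                 Url = link_n_votes.Url,
                                 Score = (from vote in link_n_votes.Votes
                                          group vote by vote.FeedTitle into feed
                                          select feed.Min(vote => vote.Weight)
                                         ).Sum()
                             } into weighted_link
                             orderby weighted_link.Score descending
                             select weighted_link;

        // An anonymous type can't be named in a method signature, so
        // convert to a nameable type at the method boundary.
        return weighted_links
            .Select(link => new KeyValuePair<string, double>(link.Url, link.Score))
            .ToList();
    }

    // Builds a dictionary whose keys are the given URLs (values are unused).
    static Dictionary<string, string> Links(params string[] urls)
    {
        return urls.ToDictionary(u => u, u => u);
    }

    // Made-up sample data: three items sharing three URLs between them.
    public static List<FeedItem> SampleItems()
    {
        return new List<FeedItem>
        {
            new FeedItem { Title = "A", Score = 3,
                OutgoingLinks = Links("http://example.com/1", "http://example.com/2") },
            new FeedItem { Title = "B", Score = 1,
                OutgoingLinks = Links("http://example.com/1") },
            new FeedItem { Title = "C", Score = 5,
                OutgoingLinks = Links("http://example.com/3") }
        };
    }

    public static void Main()
    {
        foreach (var pair in Rank(SampleItems(), "Example Feed", item => item.Score))
        {
            Console.WriteLine(pair.Key + " " + pair.Value);
        }
    }
}
```

With this sample data the URLs come out in the order /3, /2, /1 with scores 5, 3 and 1. The first URL is linked from two items scoring 3 and 1, and only the lower score counts.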

(In case you’re not familiar with LINQ, don’t be misled by the slight resemblance to SQL. This code doesn’t go anywhere near a database. I’m using the C# 3.0 Query Expression Syntax with LINQ to Objects in this particular example. It’s all just code and objects. Oh, and since you’re not familiar with LINQ, I’m guessing you’re fairly new to my blog, so welcome!)

The difference in approach reminds me of how it felt when I first approached SQL as a software developer: you can try to bend SQL to the iterative and imperative techniques that may feel natural as a developer, but you’ll almost certainly be fighting the system.

Just as SQL will work much better for you if you embrace its set-oriented nature, so LINQ seems to work better if you embrace its functional nature. To me, that seems to be the biggest philosophical difference between Dare’s approach and this modified version. Dare’s code (in both C# and Python) is classic procedural, imperative code – he creates mutable collections, building them up by adding one element at a time. By contrast, I’m not using mutability at all in my code – the only assignments are in initializations, and none of the function calls have side effects. (And while Dare wasn’t strictly attempting to use LINQ itself, anonymous types are part of LINQ’s world. And I think if you try to use them, you’re stepping into LINQ whether you mean to or not.)
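The contrast is easier to see on a smaller problem than Dare's. The following is an illustration only: counting words by their first letter is a made-up task, and `StyleComparison`, `CountImperative` and `CountFunctional` are hypothetical names. Both methods assume every word is non-empty.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StyleComparison
{
    // Imperative: create a mutable dictionary and grow it one entry at a time.
    public static Dictionary<char, int> CountImperative(IEnumerable<string> words)
    {
        var counts = new Dictionary<char, int>();
        foreach (string word in words)
        {
            char initial = word[0];
            int current;
            // TryGetValue leaves current at 0 when the key is absent.
            counts.TryGetValue(initial, out current);
            counts[initial] = current + 1;
        }
        return counts;
    }

    // Functional: describe the result as a query. The only assignments
    // are initializations, and nothing is mutated after it is created.
    public static Dictionary<char, int> CountFunctional(IEnumerable<string> words)
    {
        return (from word in words
                group word by word[0] into sameInitial
                select new { Initial = sameInitial.Key, Count = sameInitial.Count() })
               .ToDictionary(entry => entry.Initial, entry => entry.Count);
    }

    public static void Main()
    {
        var words = new[] { "apple", "avocado", "banana" };
        foreach (var pair in CountFunctional(words))
        {
            Console.WriteLine(pair.Key + " " + pair.Value);
        }
    }
}
```

The two methods return the same counts. The difference is that the first spells out how to build the dictionary, while the second states what should be in it.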

But is it better?

I may have solved the specific problem Dare outlined, but is the result an improvement? I find it hard to say. I think the code I’ve written here is pretty hard to understand, but I also think that about the original code. That’s probably a consequence of the original being an isolated bit of experimental code. (If I’ve understood correctly, Dare has been using Python to prototype ideas prior to implementing them.) Moreover, I’m not very familiar with the domain, so the C# versions (both Dare’s and mine) and the Python version are all equally hard for me to read. So I can’t offer a meaningful opinion.

However, I’ve been using LINQ for real in a couple of projects over the last few months, so I can talk about how I feel about the general approach I’ve applied here. And so far my feelings are mixed, but leaning towards the positive. I sometimes worry that I’m twisting my code to make it fit LINQ. However, I often find that having recast my approach, I end up with code that mostly expresses what I want done, rather than expressing how I’d like to do it.

This focus on what, not how, is the essence of a declarative approach. (Despite popular opinion, declarative style does not mean using markup... But that’s a theme for another blog entry.) So to put it another way, adopting a functional style for this sort of problem in C# often seems to have the happy side effect of producing a more declarative style.

In summary, although I’m still finding my feet, I’m rather coming to like LINQ.

Copyright © 2002-2013, Interact Software Ltd. Content by Ian Griffiths. Please direct all Web site inquiries to webmaster@interact-sw.co.uk