IanG on Tap

Ian Griffiths in Weblog Form (RSS 2.0)

Blog Navigation

April (2018)

(1 item)

August (2014)

(1 item)

July (2014)

(5 items)

April (2014)

(1 item)

March (2014)

(1 item)

January (2014)

(2 items)

November (2013)

(2 items)

July (2013)

(4 items)

April (2013)

(1 item)

February (2013)

(6 items)

September (2011)

(2 items)

November (2010)

(4 items)

September (2010)

(1 item)

August (2010)

(4 items)

July (2010)

(2 items)

September (2009)

(1 item)

June (2009)

(1 item)

April (2009)

(1 item)

November (2008)

(1 item)

October (2008)

(1 item)

September (2008)

(1 item)

July (2008)

(1 item)

June (2008)

(1 item)

May (2008)

(2 items)

April (2008)

(2 items)

March (2008)

(5 items)

January (2008)

(3 items)

December (2007)

(1 item)

November (2007)

(1 item)

October (2007)

(1 item)

September (2007)

(3 items)

August (2007)

(1 item)

July (2007)

(1 item)

June (2007)

(2 items)

May (2007)

(8 items)

April (2007)

(2 items)

March (2007)

(7 items)

February (2007)

(2 items)

January (2007)

(2 items)

November (2006)

(1 item)

October (2006)

(2 items)

September (2006)

(1 item)

June (2006)

(2 items)

May (2006)

(4 items)

April (2006)

(1 item)

March (2006)

(5 items)

January (2006)

(1 item)

December (2005)

(3 items)

November (2005)

(2 items)

October (2005)

(2 items)

September (2005)

(8 items)

August (2005)

(7 items)

June (2005)

(3 items)

May (2005)

(7 items)

April (2005)

(6 items)

March (2005)

(1 item)

February (2005)

(2 items)

January (2005)

(5 items)

December (2004)

(5 items)

November (2004)

(7 items)

October (2004)

(3 items)

September (2004)

(7 items)

August (2004)

(16 items)

July (2004)

(10 items)

June (2004)

(27 items)

May (2004)

(15 items)

April (2004)

(15 items)

March (2004)

(13 items)

February (2004)

(16 items)

January (2004)

(15 items)

Blog Home

RSS 2.0

Writing

Programming C# 5.0

Programming WPF

Other Sites

Interact Software

F#, C# 3.0, and Generic Functions

Monday 3 October, 2005, 03:14 AM

After my previous post on this topic, and reading the comments in Jon's generics post, I decided I had to bite the bullet and just install F#.

F# comes from Microsoft Research. It's essentially a .NET variant of ML. I think it's particularly interesting to anyone trying to grok the new C# 3.0 features. Even though ML and C# are radically different languages, they share some goals. In particular, they are both statically typed, and yet can offer many of the advantages that some people think are unique to scripting languages.

In the comments on Jon's blog, Joe Duffy points out that even though the CLI doesn't offer the feature Jon wants directly, it does enable it. A language can offer what appears to be a generic variable with uncommited type parameters even though the CLI doesn't do that. He points out that you simply have to make sure the variable is enclosed in some context that does let you define something without committing to the type parameters - either a generic method or a generic type.

But having shown some possible strategies for implementing this, he raises a point which had been on my mind:

"In fact, I'm not sure what F# does"

So I installed F#. I had been putting it off, because one of my many projects involves writing a .NET compiler for a language whose native behaviour doesn't look like a great fit for the CLI. (I'm particularly interested in what happens at the component boundary where you might have to meet other languages.) I was hoping to avoid looking too closely at anyone else's solution before I came up with my own ideas. But this C# 3.0 discussion seems to have got to the point where looking at what F# does would be really useful.

I had a few false starts installing F#, mainly because of the ridiculous amount of cruft I've got installed on my laptop. I have several different versions of Visual Studio installed, all of which have a corresponding VSIP SDK, so the F# installer had at least 6 different places it could have installed the VS extensions, but I eventually got it to go where I wanted it. (Hint: if you're in a similar situation, don't wait as long as I did before reading the README...D'Oh!)

So, let's look at the example that's been running through this discussion:

let doItTwice x y = x (x (y))

Unsurprisingly, F# infers the same type as the ML compiler did for the equivalent example in my earlier post:

val doItTwice: ('a -> 'a) -> 'a -> 'a

In other words, we can pass doItTwice some function that takes a parameter of type 'a, and that returns a value of the same type (this function is represented by the ('a -> 'a) part), and we can then pass in a parameter of type 'a, and doItTwice will return a value of type 'a. ML and F# use the convention of putting a single quote in front of a type name to indicate that this is a generic type parameter. (In college, I was told that in ML you pronounce 'a as 'alpha' and pretend that these type parameters are all Greek letters. And that fn is prounounced 'lambda'... Ah, the joy of parochial text handling.)

Let's see what F# compiles this as - remember F# targets .NET, so it's going to have to spit out a normal .NET assembly. Here's what we find:

.method public static !!A doItTwice<A>(class [fslib]Microsoft.FSharp.FastFunc`2<!!A,!!A> x,
!!A y) cil managed
{
// Code size 16 (0x10)
.maxstack 5
IL_0000: ldarg.0
IL_0001: ldarg.0
IL_0002: ldarg.1
IL_0003: callvirt instance !1 class [fslib]Microsoft.FSharp.FastFunc`2<!!A,!!A>::Invoke(!0)
IL_0008: tail.
IL_000a: callvirt instance !1 class [fslib]Microsoft.FSharp.FastFunc`2<!!A,!!A>::Invoke(!0)
IL_000f: ret
}

There's quite a lot going on here. I'll start with the method body because it's probably the easiest bit to get to grips with. Remember that doItTwice uses the function passed as the first parameter, calling it once on the second parameter, and then again on the returned value. And from a rough and ready glance at the IL, it looks like a fairly straightforward mapping. The only mysterious point is the exact representation of the function and the precise manner in which it is being invoked. (Although as an an aside, it's interesting to note the use of a tail call here - something you tend not to see if you work with more 'ordinary' languages.)

I think the representation of the function that gets passed in as a parameter is particularly interesting. This area is the part of my own compiler efforts that I've had the hardest time deciding on. I ran into a particular problem that I couldn't find a satisfactory way of solving. I suspect you run into this problem in any language where functions are routinely passed around as arguments and variables just like numbers and objects, especially where it's common practice to define a local function, in the way that you might define an anonymous method in C#. There's a problem with making this marry up with the CTS. (Common Type System, the CLI's type system.) You need to have delegate types able to refer to your functions, and that turns out to be rather tedious. To illustrate the problem, let's start with a simple example:

var func = delegate(int x, int y) { return x + y; };

The C# compiler will choke on this because it is unable to work out what delegate type it should use - we're using the new var keyword, which means the compiler has to infer the type, but it just doesn't know what it should do. We have to supply a specific delegate type. This works:

public delegate int BinaryIntFunction(int x, int y);
...
BinaryIntFunction func = delegate(int x, int y) { return x + y; };

But that seems pretty smelly. We defined a whole new delegate type just to be able to refer to this function. Not only is this inconvenient, it's also problematic if someone else comes along with their own equivalent delegate type:

public delegate T BinaryFunction<T>(T x, T y);
...
BinaryFunction<int> func = delegate(int x, int y) { return x + y; };

When we instantiate this generic BinaryFunction with int as a type parameter, it's sort of equivalent to the BinaryIntFunction in that we can assign the exact same anonymous method into either delegate type. And yet the two types are considered incompatible - you can't convert from one to the other despite the fact that both delegate types refer to functions of exactly the same signature!

So it's rather disturbing when you see the solutions the C# 3.0 and F# independently use for this problem. The System.Query.dll component that ships as part of LINQ contains the following definitions:

// Pseudocode - I don't have the LINQ source!
namespace System.Query
{
  public delegate T Func<T>(T a);
  public delegate T Func<T,A0>(A0 a);
  public delegate T Func<T,A0,A1>(A0 a, A1 b);
  public delegate T Func<T,A0,A1,A2>(A0 a, A1 b, A2 c);
  public delegate T Func<T,A0,A1,A2,A3>(A0 a, A1 b, A2 c, A3 d);
}

F# has a slightly different solution. They've introduced the FastFunc class you can see in the IL there. Its public face looks like this:

public abstract class FastFunc<T, U>
{
      // Methods
      protected FastFunc();
      public abstract U Invoke(T t);
      public static V InvokeFast2<V>(FastFunc<T, FastFunc<U, V>> f, T t, U u);
      public static W InvokeFast3<V, W>(FastFunc<T, FastFunc<U, FastFunc<V, W>>> f, T t, U u, V v);
      public static X InvokeFast4<V, W, X>(FastFunc<T, FastFunc<U, FastFunc<V, FastFunc<W, X>>>> f, T t, U u, V v, W w);
      public static implicit operator Converter<T, U>(FastFunc<T, U> f);
      public static implicit operator FastFunc<T, U>(Converter<T, U> f);
}

(Note that one of the benefits of using a class is that they've been able to supply a couple of implicit conversions. This FastFunc type is implicitly convertable to and from the standard System.Converter<T> delegate. While that's handy, it seems like a bit of a hack to work around the fact that the CTS doesn't have a built-in way to convert between structurally identical delegate types. Then again, it's consistent with the across-the-board lack of support for structural typing.)

Although this is a different strategy from the one used by C#, those InvokeFast[234] methods certainly give the impression of being a response to the same underlying problem. And with both of them I can't help wonder what happens when you need to use more than 4 parameters... (In C# this doesn't arise directly, because the compiler will only infer that you want method to be represented as one of these Func types if something asks explicitly to see it in that form. It doesn't presume you want to use Func if you do, say, (int i) => i + 1 as a lambda. And in F#, I have to admit my ML is rusty. All functions are curryable by default - so while doItTwice might look like a function that takes two parameters, it's not - it's actually a function that takes one parameter, and which evaluates to another function which takes the second parameter... I'm not sure how you go about writing something that really does take more than one parameter, so I don't know what happens when you go over 4.)

So although it is possible to implement this kind of thing on top of the CLI, it doesn't seem to be the most natural fit in the world. And different languages seem to be going merrily down their own path, with a different representation of a generic function reference for each language. Seems like it might be something where a CLI-level solution would be preferable, or at least a shared idiom we can all agree on. However, I suspect nobody is yet in a position to say what the best representation would be. This is why we need research projects like F#. :)

Copyright © 2002-2024, Interact Software Ltd. Content by Ian Griffiths. Please direct all Web site inquiries to webmaster@interact-sw.co.uk