Why is this F# code so slow?

c# performance f# inline

22,713

The problem is that the min3 function is compiled as a generic function that uses generic comparison (I thought this uses just IComparable, but it is actually more complicated - it would use structural comparison for F# types and it's fairly complex logic).

> let min3(a, b, c) = min a (min b c);;
val min3 : 'a * 'a * 'a -> 'a when 'a : comparison

In the C# version, the function is not generic (it just takes int). You can improve the F# version by adding type annotations (to get the same thing as in C#):

let min3(a:int, b, c) = min a (min b c)

...or by making min3 as inline (in which case, it will be specialized to int when used):

let inline min3(a, b, c) = min a (min b c);;

For a random string str of length 300, I get the following numbers:

> levenshtein str ("foo" + str);;
Real: 00:00:03.938, CPU: 00:00:03.900, GC gen0: 275, gen1: 1, gen2: 0
val it : int = 3

> levenshtein_inlined str ("foo" + str);;
Real: 00:00:00.068, CPU: 00:00:00.078, GC gen0: 0, gen1: 0, gen2: 0
val it : int = 3

22,713

Author by

Robert Jeppesen

Updated on April 13, 2020

Comments

Robert Jeppesen about 4 years

A Levenshtein implementation in C# and F#. The C# version is 10 times faster for two strings of about 1500 chars. C#: 69 ms, F# 867 ms. Why? As far as I can tell, they do the exact same thing? Doesn't matter if it is a Release or a Debug build.

EDIT: If anyone comes here looking specifically for the Edit Distance implementation, it is broken. Working code is here.

C#:

private static int min3(int a, int b, int c)
{
   return Math.Min(Math.Min(a, b), c);
}

public static int EditDistance(string m, string n)
{
   var d1 = new int[n.Length];
   for (int x = 0; x < d1.Length; x++) d1[x] = x;
   var d0 = new int[n.Length];
   for(int i = 1; i < m.Length; i++)
   {
      d0[0] = i;
      var ui = m[i];
      for (int j = 1; j < n.Length; j++ )
      {
         d0[j] = 1 + min3(d1[j], d0[j - 1], d1[j - 1] + (ui == n[j] ? -1 : 0));
      }
      Array.Copy(d0, d1, d1.Length);
   }
   return d0[n.Length - 1];
}

F#:

let min3(a, b, c) = min a (min b c)

let levenshtein (m:string) (n:string) =
   let d1 = Array.init n.Length id
   let d0 = Array.create n.Length 0
   for i=1 to m.Length-1 do
      d0.[0] <- i
      let ui = m.[i]
      for j=1 to n.Length-1 do
         d0.[j] <- 1 + min3(d1.[j], d0.[j-1], d1.[j-1] + if ui = n.[j] then -1 else 0)
      Array.blit d0 0 d1 0 n.Length
   d0.[n.Length-1]

sashang almost 13 years

Why doesn't F# compile min3 as a function that takes int? It already knows enough type information at compile time to do this. This is how it would work if min3 was a C++ template function, so I'm a bit puzzled as to why F# doesn't do this.
Prakash almost 13 years

F# infers it to be as generic as possible, e.g. "for all types X that support comparison". inline works like a C++ template, which would specialize to int based on the call site.
Tomas Petricek almost 13 years

C++ templates behave essentially as F#'s inline. The reason why the default behavior is different is because it builds on .Net generics that are handled by the runtime (and, arguably, aren't so great for writing generic numeric code). Using the C++ behavior in F# would, however, lead to code bloat, because F# uses generics a lot more.
Admin about 12 years

C++ template semantics can lead to code bloat even in C++, and the lack of a convenient way to switch to using a run-time mechanism to avoid that is a hassle at times. However, the fear of code bloat is normally irrational - generally, C++ templates work well.
ildjarn about 12 years

@Steve314 : It's also generally easy to avoid by refactoring out all code that doesn't use a dependent type, so that code isn't duplicated for different instantiations.
Admin about 12 years

@ildjarn - That's not so "generally" as you claim in my experience, though of course that's almost entirely pre-C++11 and the new style makes a difference. I've written plenty of template containers where the guts were type-unsafe code using placement new, explicit destructors, offsetof and pointer arithmetic. There are workarounds that could avoid this, but they would defeat the whole purpose of those containers. That said, I'm not so sure that those containers were ever really that useful now - architecture astronaut tendencies, should really use vectors more and sort on need etc etc.
MWB over 3 years

On JVM, I'd expect min3 to get inlined by the JIT, assuming levenstein gets called many times. I wonder if .NET would do the same here. It seems unhelpful to benchmark one function invocation in a JIT'ed language.