Performance of expression trees

11,520

Solution 1

Compilation

The call to Expression.Compile goes through exactly the same process as any other .NET code your application contains in the sense that:

  • IL code is generated
  • IL code is JIT-ted to machine code

(the parsing step is skipped because an Expression Tree is already created and does not have to be generated from the input code)

You can look at the source code of the expression compiler to verify that indeed, IL code is generated.

Optimization

Please be aware that almost all of the optimization done by the CLR is done in the JIT step, not from compiling C# source code. This optimization will also be done when compiling the IL code from your lambda delegate to machine code.

Your example

In your example you are comparing apples & oranges. The first example is a method definition, the second example is runtime code that creates a method, compiles and executes it. The time it takes to create/compile the method is much longer than actually executing it. However you can keep an instance of the compiled method after creation. When you have done that, the performance of your generated method should be identical to that of the original C# method.

Consider this case:

private static int AddMethod(int a, int b)
{
    return a + b;
}

Func<int, int, int> add1 = (a, b) => a + b;
Func<int, int, int> add2 = AddMethod;

var x = Expression.Parameter(typeof (int));
var y = Expression.Parameter(typeof (int));
var additionExpr = Expression.Add(x, y);
Func<int, int, int> add3 = 
              Expression.Lambda<Func<int, int, int>>(
                  additionExpr, x, y).Compile();
//the above steps cost a lot of time, relatively.

//performance of these three should be identical
add1(1, 2);
add2(1, 2);
add3(1, 2);

So the conclusion one might draw is: IL code is IL code, no matter how it is generated, and Linq Expressions generate IL code.

Solution 2

Your Add function probably compiles to some function overhead (if not inlined) and a single add instruction. Doesn't get any faster than that.

Even constructing this expression tree is going to be orders of magnitude slower. Compiling a new function for each invocation is going to be incredibly expensive compared to the direct C# implementation.

Try compiling the function just once and storing it somewhere.

Solution 3

Trying to understand why my build and compiled lambda runs slightly slower than "just delegate" (I think I will need create new SO question for it) I've found this thread and have decided to check the performance using BenchmarkDotNet. Surprise for me: there build manually and compiled lambda is quickest. And yes -there is stable difference between methods.

Results:

BenchmarkDotNet=v0.10.5, OS=Windows 10.0.14393
Processor=Intel Core i5-2500K CPU 3.30GHz (Sandy Bridge), ProcessorCount=4
Frequency=3233539 Hz, Resolution=309.2587 ns, Timer=TSC
  [Host] : Clr 4.0.30319.42000, 64bit RyuJIT-v4.6.1648.0
  Clr    : Clr 4.0.30319.42000, 64bit RyuJIT-v4.6.1648.0
  Core   : .NET Core 4.6.25009.03, 64bit RyuJIT


         Method |  Job | Runtime |      Mean |     Error |    StdDev |    Median |       Min |        Max | Rank | Allocated |
--------------- |----- |-------- |----------:|----------:|----------:|----------:|----------:|-----------:|-----:|----------:|
     AddBuilded |  Clr |     Clr | 0.8826 ns | 0.0278 ns | 0.0232 ns | 0.8913 ns | 0.8429 ns |  0.9195 ns |    1 |       0 B |
      AddLambda |  Clr |     Clr | 1.5077 ns | 0.0226 ns | 0.0212 ns | 1.4986 ns | 1.4769 ns |  1.5395 ns |    2 |       0 B |
 AddLambdaConst |  Clr |     Clr | 6.4535 ns | 0.0454 ns | 0.0425 ns | 6.4439 ns | 6.4030 ns |  6.5323 ns |    3 |       0 B |
     AddBuilded | Core |    Core | 0.8993 ns | 0.0249 ns | 0.0233 ns | 0.8908 ns | 0.8777 ns |  0.9506 ns |    1 |       0 B |
      AddLambda | Core |    Core | 1.5105 ns | 0.0241 ns | 0.0201 ns | 1.5094 ns | 1.4731 ns |  1.5396 ns |    2 |       0 B |
 AddLambdaConst | Core |    Core | 9.3849 ns | 0.2237 ns | 0.5693 ns | 9.6577 ns | 8.3455 ns | 10.0590 ns |    4 |       0 B |

I can't make any conclusions from this, it can be difference in IL code or JIT compiler impact.

Code:

    static BenchmarkLambdaSimple()
    {
        addLambda = (a, b) => a + b;
        addLambdaConst = AddMethod;

        var x = Expression.Parameter(typeof(int));
        var y = Expression.Parameter(typeof(int));
        var additionExpr = Expression.Add(x, y);
        addBuilded =
                      Expression.Lambda<Func<int, int, int>>(
                          additionExpr, x, y).Compile();
    }
    static Func<int, int, int> addLambda;
    static Func<int, int, int> addLambdaConst;
    static Func<int, int, int> addBuilded;
    private static int AddMethod(int a, int b)
    {
        return a + b;
    }

    [Benchmark]
    public int AddBuilded()
    {
        return addBuilded(1, 2);
    }

    [Benchmark]
    public int AddLambda()
    {
        return addLambda(1, 2);
    }

    [Benchmark]
    public int AddLambdaConst()
    {
        return addLambdaConst(1, 2);
    }

Solution 4

OK, I have written a little test (probably needs scrutinisation by you experts) but its seems as if expression trees are the fastest (add3), followed by add2 and then add1!

using System;
using System.Diagnostics;
using System.Linq.Expressions;

namespace ExpressionTreeTest
{
    class Program
    {
        static void Main()
        {
            Func<int, int, int> add1 = (a, b) => a + b;
            Func<int, int, int> add2 = AddMethod;

            var x = Expression.Parameter(typeof(int));
            var y = Expression.Parameter(typeof(int));
            var additionExpr = Expression.Add(x, y);
            var add3 = Expression.Lambda<Func<int, int, int>>(
                              additionExpr, x, y).Compile();


            TimingTest(add1, "add1", 1000000);
            TimingTest(add2, "add2", 1000000);
            TimingTest(add3, "add3", 1000000);
        }

        private static void TimingTest(Func<int, int, int> addMethod, string testMethod, int numberOfAdditions)
        {
            var sw = new Stopwatch();
            sw.Start();
            for (var c = 0; c < numberOfAdditions; c++)
            {
               addMethod(1, 2);              
            }
            sw.Stop();
           Console.WriteLine("Total calculation time {1}: {0}", sw.Elapsed, testMethod);
        }

        private static int AddMethod(int a, int b)
        {
            return a + b;
        }
    }
}

My results debug mode:

Total calculation time add1: 00:00:00.0134612
Total calculation time add2: 00:00:00.0133916
Total calculation time add3: 00:00:00.0053629

My results release mode:

Total calculation time add1: 00:00:00.0026172
Total calculation time add2: 00:00:00.0027046
Total calculation time add3: 00:00:00.0014334
Share:
11,520

Related videos on Youtube

cs0815
Author by

cs0815

Updated on September 15, 2022

Comments

  • cs0815
    cs0815 about 1 year

    My current understanding is that 'hard coded' code like this:

    public int Add(int x, int y) {return x + y;}
    

    will always perform better than expression tree code like this:

    Expression<Func<int, int, int>> add = (x, y) => x + y; 
    var result = add.Compile()(2, 3);
    
    var x = Expression.Parameter(typeof(int)); 
    var y = Expression.Parameter(typeof(int)); 
    return (Expression.Lambda(Expression.Add(x, y), x, y).
        Compile() as Func<int, int, int>)(2, 3);
    

    as the compiler has more information and can spend more effort on optimizing the code if you compile it at compile time. Is this generally true?

    • Thanasis Ioannidis
      Thanasis Ioannidis almost 9 years
      At first thought it sounds reasonable for 'hard coded' code to execute faster than 'runtime compiled' code. But actually that's not always the case. I have coded a test where my runtime generated code actually runs twice as fast as the 'hard coded' code (half the time to complete). The code was very simple. Assign a value to a class' instance property 10 million times (like=> ci.prop = a). Turns out that the expression compiled version is faster. I really don't know why, BUT I had to decorate my assembly with some attributes for that to work.
    • Thanasis Ioannidis
      Thanasis Ioannidis almost 9 years
      The attributes were [assembly: AllowPartiallyTrustedCallers] [assembly: SecurityTransparent] [assembly: SecurityRules(SecurityRuleSet.Level2, SkipVerificationInFullTrust = true)]. Without them, the whole expression compiled code was much (but not THAT much) slower than hard coded one.
    • Rob
      Rob over 8 years
      @Saysmaster How did you come up with this list of attributes? In a project where we use compiled ExpressionTrees, we get a speed up in many cases, but in other cases we get a performance regression, with the profiler indicating that the additional time is spent in JIT_MethodAccessCheck. Does this fit with your experience?
    • Thanasis Ioannidis
      Thanasis Ioannidis over 8 years
      @Rob yes, it has something to do with MethodAccessCheck indeed. I don't remember the details though, but it was security checks for sure. by decorating the assembly with the attributes above, the whole security check was eliminated.
  • cs0815
    cs0815 over 9 years
    Ignoring the simplicity of the add function, which is just an example, does this mean you agree with me? (-:
  • cs0815
    cs0815 over 9 years
    If I understand you correctly, do you say that both approaches should be equally efficient?
  • Bas
    Bas over 9 years
    @csetzkorn read my updated answer, they are not equally efficient until your refactor your code
  • usr
    usr over 9 years
    Yes. Expression trees are not optimized at all, basically. The IL they generate is being optimized by the JIT normally, however.
  • cs0815
    cs0815 over 9 years
    Considering that you have more votes, I presume that you are right (-: I might run a little test using both methods (repeating each 1000 times and keeping the compiled expression stuff in memory).
  • cs0815
    cs0815 over 9 years
    maybe you are wrong sorry - see my answer ... I guess all this requires a more thorough experiment (random numbers more complicated calculation ...). Any ideas?
  • Scott Chamberlain
    Scott Chamberlain over 9 years
    There are problems with your test, your numbers are within experimental error. Check if(sw.ElapsedTicks < numberOfAdditions) and you will see there where instances where you called sw.Start() then sw.Stop() and the timer never advanced. If you move sw outside of the for of the loop you will find that the run time is really about 4ms (for my machine) instead of 35ms (for my machine). Also if you run the program multiple times I was getting a different winner each time, so the time difference is under the noise you get from just running a C# program.
  • cs0815
    cs0815 over 9 years
    Ok I have updated the code and rerun it. Is this what you meant? I am afraid the results remains the same, I ran it 50 times and the ranking is the same each time! If you do not agree please provide some code and your results in terms of execution time or better ranking.
  • Scott Chamberlain
    Scott Chamberlain over 9 years
    Are you running it in release mode without the debugger attached? You should be getting about 0.005 for all 3
  • cs0815
    cs0815 over 9 years
    Good point. I have updated the 'answer'. I am sorry to say that the ranking results remain the same!
  • Aron
    Aron over 9 years
    It maybe true that the method itself should be just as efficient, however the invocation of the method has a tiny tiny effect. The static method requires fewer look ups to invoke than the Func<int, int>. The effect is marginal and only noticeable in extreme benchmarking.
  • user2864740
    user2864740 over 7 years
    While neat (as C# makes another tiny stab at emulating some more functional syntax), that's just sugar over a normal method. Nothing really to do with expression trees. The correct term is Expression Bodied Functions .. where this use of "expression" should not be confused with Expression<..>.
  • Roman Pokrovskij
    Roman Pokrovskij over 6 years
    the similar test for slightly different situation: stackoverflow.com/questions/44239127/…
  • RJ Thompson
    RJ Thompson over 5 years
    Keep in mind, If the direct call can be inlined, it will make a difference.