Intel Compiler versus GCC

11,420

Solution 1

I have no experience with the intel compiler so I can't answer whether you are missing some flags or not.

However from what I recall recent versions of gcc are generally as good at optimizing code as icc (sometimes better, sometimes worse (although most sources seem to indicate to generally better)), so you might have run into a situation where icc is particulary bad. Examples for what optimizations each compiler can do can be found here and here. Even if gcc is not generally better you could simply have a case which gcc recognizes for optimization and icc doesn't. Compilers can be very picky about what they optimize and what not, especially regarding things like autovectorization.

If your loop is small enough it might be worth it to compare the generated assembly code between gcc and icc. Also if you show some code or at least tell us what you are doing in your loop we might be able to give you better speculations what leads to this behaviour. For example in some situations. If it's a relatively small loop it is likely a case of icc missing one (or some, but probably not many) optimization which either have inherently good potential (prefetching, autovectorization, unrolling, loop invariant motion,...) or which enable other optimizations (primarily inlining).

Note that I'm only talking about optimization potential when I compare gcc to icc. In the end icc might typically generate faster code then gcc, but not so much because it does more optimizations, but because it has a faster standard library implementation and because it is smarter about where to optimize (on high optimization levels gcc gets a little bit overeager (or at least it used to) about trading code size for (theoretical) runtime improvements. This can actually hurt performance, e.g. when the carefully unrolled and vectorized loop is only ever executed with 3 iterations.

Solution 2

I normally use -inline-level=1 -inline-forceinline to make sure that functions which I have explicitly declared inline actually do get inlined. Other than that I would expect ICC performance to be at least as good as with gcc. You will need to profile your code to see where the performance difference is coming from. If this is Linux then I recommend using Zoom, which you can get on a free 30 day evaluation.

Share:
11,420
Ricky
Author by

Ricky

BY DAY: Embedded software designer. BY NIGHT: Open Source OS developer and fanatic 3D Print Guru.

Updated on June 04, 2022

Comments

  • Ricky
    Ricky almost 2 years

    When I compile an application with Intel's compiler it is slower than when I compile it with GCC. The Intel compiler's output is more than 2x slower. The application contains several nested loops. Are there any differences between GCC and the Intel compiler that I am missing? Do I need to turn on some other flags to improve the Intel compiler's performance? I expected the Intel compiler to be at least as fast as GCC.

    Compiler Versions:

     Intel version  12.0.0 20101006 
     GCC   version  4.4.4  20100630
    

    The compiler flags are the same with both compilers:

    -O3 -openmp -parallel -mSSE4.2 -Wall -pthread
    
  • Ricky
    Ricky over 12 years
    I tried the inlining flags but those don't make a difference. I guess i'll have to profile the code then, thanks for your answer!