What are the proper architecture-specific options (-m) for a Sandy Bridge based Pentium?

Solution 1

You can ask GCC itself what -march=native resolves to. For me (gcc-5.3.0 on an i5-2450M CPU in a Lenovo e520), the following command prints the cc1 invocation shown below it:

gcc -march=native -E -v - </dev/null 2>&1 | grep cc1


/usr/libexec/gcc/x86_64-pc-linux-gnu/5.3.0/cc1 -E -quiet -v - -march=sandybridge 
-mmmx -mno-3dnow -msse -msse2 -msse3 -mssse3 -mno-sse4a -mcx16 
-msahf -mno-movbe -maes -mno-sha -mpclmul -mpopcnt -mno-abm -mno-lwp 
-mno-fma -mno-fma4 -mno-xop -mno-bmi -mno-bmi2 -mno-tbm -mavx 
-mno-avx2 -msse4.2 -msse4.1 -mno-lzcnt -mno-rtm -mno-hle -mno-rdrnd 
-mno-f16c -mno-fsgsbase -mno-rdseed -mno-prfchw -mno-adx -mfxsr 
-mxsave -mxsaveopt -mno-avx512f -mno-avx512er -mno-avx512cd 
-mno-avx512pf -mno-prefetchwt1 -mno-clflushopt -mno-xsavec -mno-xsaves 
-mno-avx512dq -mno-avx512bw -mno-avx512vl -mno-avx512ifma 
-mno-avx512vbmi -mno-clwb -mno-pcommit -mno-mwaitx --param 
l1-cache-size=32 --param l1-cache-line-size=64 --param 
l2-cache-size=3072 -mtune=sandybridge -fstack-protector-strong
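
The cc1 line above is long, so it can help to split it into what is switched on and what is switched off. Below is a minimal Python sketch that does this for any such line. The function name split_cc1_flags and the shortened sample string are made up for illustration, and it assumes that every token starting with -mno- is a disabled feature and every other -m token without an "=" is an enabled one.

```python
def split_cc1_flags(cc1_line):
    """Split a cc1 command line into enabled and disabled -m feature names."""
    enabled, disabled = [], []
    for token in cc1_line.split():
        if token.startswith("-mno-"):
            # e.g. -mno-avx2 -> "avx2" is disabled
            disabled.append(token[len("-mno-"):])
        elif token.startswith("-m") and "=" not in token:
            # e.g. -mavx -> "avx" is enabled; -march=... and -mtune=... are skipped
            enabled.append(token[len("-m"):])
    return enabled, disabled


# Shortened, hand-written sample in the same shape as the output above.
sample = "cc1 -E -quiet -v - -march=sandybridge -mmmx -mno-3dnow -mavx -mno-avx2 -mtune=sandybridge"
enabled, disabled = split_cc1_flags(sample)
print("enabled: ", " ".join(enabled))
print("disabled:", " ".join(disabled))
```

Feeding it the real output of the gcc command on your own machine shows at a glance whether, say, avx or aes ended up in the disabled list.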

Solution 2

I would suggest using -march=corei7-avx -mtune=corei7-avx -mno-avx -mno-aes. Specifying -mtune matters because this option tells gcc which CPU model to schedule instructions for in the generated code.
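
Whether -mno-avx and -mno-aes are the right exclusions for a particular machine can be cross-checked against the CPU flags the kernel reports. The following is only a sketch under some assumptions: it expects text in the format of Linux's /proc/cpuinfo, it covers just the three features discussed on this page, and the function name missing_feature_options and the sample flags line are invented for illustration.

```python
# Map /proc/cpuinfo flag names to the gcc option that disables the feature.
# Only the features discussed on this page are listed (an assumption of this sketch).
CPUINFO_TO_GCC = {
    "avx": "-mno-avx",
    "aes": "-mno-aes",
    "pclmulqdq": "-mno-pclmul",
}


def missing_feature_options(cpuinfo_text):
    """Return -mno-* options for listed features absent from the first 'flags' line."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            flags = set(line.split(":", 1)[1].split())
            return [opt for name, opt in CPUINFO_TO_GCC.items() if name not in flags]
    raise ValueError("no 'flags' line found in the given text")


# Made-up flags line; on Linux, pass open("/proc/cpuinfo").read() instead.
sample = "flags\t\t: fpu sse sse2 ssse3 sse4_1 sse4_2 popcnt"
print(" ".join(missing_feature_options(sample)))
```

If the result is empty, the CPU reports all three features and plain -march=corei7-avx should be fine for them.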

Solution 3

I have a Sandy Bridge based Intel(R) Celeron(R) CPU G530.

When I used -march=native in Gentoo's CFLAGS and then compiled media-video/ffmpeg-1.2.6 (the current stable version in Gentoo), playing video with mplayer failed with an illegal instruction error. Just as you said, -march=native sometimes misdetects CPU features.

I then switched to -march=corei7-avx -mtune=corei7-avx -mno-avx -mno-aes and recompiled ffmpeg-1.2.6 and mplayer, and everything has worked fine since.
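
Before rebuilding packages, the two option sets can be compared by running gcc with -Q --help=target once per option set (for example once with -march=native and once with -march=corei7-avx -mno-avx -mno-aes) and diffing what each reports as enabled. Here is a rough Python sketch of that diff. It assumes the output contains lines of the form "-mavx [enabled]" or "-mavx [disabled]", which may differ between gcc versions. The helper names and the sample texts are hypothetical.

```python
import re

# Matches lines such as "  -mavx    [enabled]" (assumed output shape).
OPTION_LINE = re.compile(r"^\s*(-m[\w.\-]+)\s+\[(enabled|disabled)\]\s*$")


def enabled_options(help_text):
    """Return the set of -m options reported as [enabled]."""
    found = set()
    for line in help_text.splitlines():
        match = OPTION_LINE.match(line)
        if match and match.group(2) == "enabled":
            found.add(match.group(1))
    return found


def only_in_first(first_text, second_text):
    """Options enabled by the first option set but not by the second."""
    return sorted(enabled_options(first_text) - enabled_options(second_text))


# Hand-written samples standing in for two captured gcc outputs.
native = "  -mavx    [enabled]\n  -maes    [enabled]\n  -msse4.2    [enabled]\n"
manual = "  -mavx    [disabled]\n  -maes    [disabled]\n  -msse4.2    [enabled]\n"
print(only_in_first(native, manual))
```

Anything this prints for the native output that the CPU does not actually support is a candidate for the illegal instruction error.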

Author by

k2_8191

Updated on June 04, 2022

Comments

  • k2_8191
    k2_8191 almost 2 years

    I'm trying to figure out how to set the -march option properly, so that I can measure how much performance differs on my PC with and without it, using gcc 4.7.2.

    Before compiling anything, I tried to find the best -march option for my PC. It has a Pentium G850, whose architecture is Sandy Bridge, so I consulted the gcc 4.7.2 manual and found that -march=corei7-avx seems the best match.
    However, I remembered that Sandy Bridge based Pentiums lack AVX and AES-NI instruction set support, which is true of the Pentium G850. So -march=corei7-avx on its own is not a proper option.

    I came up with some potential options:

    1. -march=corei7-avx -mno-avx -mno-aes
    2. -march=corei7 -mtune=corei7-avx
    3. -march=native

    The first option looks reasonable given what I know, but I'm worried that features other than AVX and AES-NI may also be missing. The second option looks safe, but -march=corei7 could leave some minor Sandy Bridge features unused. The third option would address all of my concerns, but I've heard that it sometimes misdetects CPU features, so I would like to know how to set the options manually.
    I've googled and searched StackOverflow and SuperUser, but I can't find a clear answer.
    Which options should I set?

  • k2_8191
    k2_8191 about 11 years
    Your answer looks the same as my first option, because -march=cpu-type implies -mtune=cpu-type (see the -march=cpu-type section of the gcc manual). Is there any difference between my first option and yours? Also, are there no missing features to worry about other than AVX and AES-NI?
  • Marat Dukhan
    Marat Dukhan about 11 years
    Low-end Sandy Bridges probably also lack the PCLMULQDQ instruction, but the compiler will not generate it automatically anyway.
  • k2_8191
    k2_8191 about 11 years
    Really? I thought the compiler would emit such an instruction whenever it detects code that would run faster with it. Am I wrong?
  • Marat Dukhan
    Marat Dukhan about 11 years
    This is the case for most instructions, but PCLMULQDQ is too specialized, and compilers do not recognize where it could be beneficial.
  • k2_8191
    k2_8191 about 11 years
    Yes, of course, but it is not completely safe, is it? PCLMULQDQ is intended for programs that use cryptography, which is common in modern programs.
  • Marat Dukhan
    Marat Dukhan about 11 years
    PCLMULQDQ is carry-less multiplication, a very specialized operation. gcc does not recognize that a piece of code implements carry-less multiplication, and thus it cannot replace that code with PCLMULQDQ.
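
For readers unfamiliar with the term, here is a small Python sketch of what carry-less multiplication computes. It is ordinary shift-and-add multiplication with the additions replaced by XOR, so no carries propagate between bit positions. The function name clmul is chosen for illustration, and the code only shows the arithmetic rather than how the instruction is used in practice.

```python
def clmul(a, b):
    """Carry-less product of two non-negative integers (polynomials over GF(2))."""
    result = 0
    while b:
        if b & 1:
            result ^= a  # XOR instead of +, so carries never propagate
        a <<= 1
        b >>= 1
    return result


# 0b11 "times" 0b11: 0b011 XOR 0b110 = 0b101, whereas ordinary 3 * 3 is 9.
print(clmul(0b11, 0b11), 3 * 3)
```

A loop like this is what a program would typically contain in source form, and as the comment above says, gcc does not turn it into a single PCLMULQDQ on its own.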