What does vectorization mean?
Solution 1
Vectorization means that the compiler detects that your independent instructions can be executed as one SIMD instruction. Usual example is that if you do something like
for(i=0; i<N; i++){
a[i] = a[i] + b[i];
}
It will be vectorized as (using vector notation)
for (i=0; i<(N-N%VF); i+=VF){
a[i:i+VF] = a[i:i+VF] + b[i:i+VF];
}
Basically the compiler picks one operation that can be done on VF elements of the array at the same time and does this N/VF times instead of doing the single operation N times.
It increases performance, but puts more requirement on the architecture.
Solution 2
As mentioned above, vectorization is used to make use of SIMD instructions, which can perform identical operations of different data packed into large registers.
A generic guideline to enable a compiler to autovectorize a loop is to ensure that there are no flow- and anti-dependencies b/w data elements in different iterations of a loop.
http://en.wikipedia.org/wiki/Data_dependency
Some compilers like the Intel C++/Fortran compilers are capable of autovectorizing code. In case it was not able to vectorize a loop, the Intel compiler is capable of reporting why it could not do that. There reports can be used to modify the code such that it becomes vectorizable (assuming it's possible)
Dependencies are covered in depth in the book 'Optimizing Compilers for Modern Architectures: A Dependence-based Approach'
Solution 3
Vectorization need not be limited to single register which can hold large data. Like using '128' bit register to hold '4 x 32' bit data. It depends on architectural limitations. Some architecture have different execution units which have registers of their own. In that case, a part of the data can be fed to that execution unit and the result can be taken from a register corresponding to that execution unit.
For example, consider the below case.
for(i=0; i < N; i++)
{
a[i] = a[i] + b[i];
}
If I am working on an architecture which has two execution units, then my vector size is defined as two. The loop mentioned above will be reframed as
for(i=0; i<(N/2); i+=2)
{
a[i] = a[i] + b[i] ;
a[i+1] = a[i+1] + b[i+1];
}NOTE: The 2 inside the for statement is derived from the vector size.
As I am having two execution units the two statements inside the loop will be fed into the two execution units. The sum will be accumulated in the execution units separately. Finally the sum of accumulated values (from two execution units) will be carried out.
The good practices are
1. The constraints like dependency (between different iterations of the loop) needs to be checked before vectorizing the loop.
2. Function calls needs to be prevented.
3. Pointer access can create aliasing and it needs to be prevented.
Solution 4
It's SSE code Generation.
You have a loop with float matrix code in it matrix1[i][j] + matrix2[i][j] and the compiler generates SSE code.
Solution 5
Maybe also have a look at libSIMDx86 (source code).
A nice example well explained is:
Related videos on Youtube
vehomzzz
He adapts a lazy approach to a complex learning. His eclectic personality coupled with eccentric character caused much harm to masses, and to himself.
Updated on May 02, 2020Comments
-
vehomzzz about 4 years
Is it a good idea to vectorize the code? What are good practices in terms of when to do it? What happens underneath?
-
Stephen Canon over 14 years
-
-
Harsh Pathak over 14 yearsSo is there anything a programmer can do to ensure vectorization (apart from turning optimization on)?
-
Zed over 14 yearsAs far as I know compilers are limited in auto-vectorization, so your best bet is to keep your code as trivial as it can be. Also you can check the assembly code generated to see whether the compiler vectorized or not.
-
michelson over 14 years@jacob: you can't really "ensure" it, you might want to take a look at openmp.org for method of explicitly telling the compiler to vectorize.
-
Amok over 14 yearsSEE is not the only vector instruction set. PPC has Altivec and other architectures have their own vector instructions as well.
-
josesuero over 14 yearsVectorization doesn't mean that the compiler does it, just that SIMD instructions are used. When the compiler generates SIMD code it's generally called auto-vectorization.