Hello,
I have this simple piece of code that computes the element-wise maximum of two arrays and stores the result in a third one:
Code:
template<class T>
inline void Maximum(const T *array1, const T *array2, const size_t size, T *result)
{
    for (size_t i = 0; i < size; i++)
        result[i] = array1[i] > array2[i] ? array1[i] : array2[i];
}
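For anyone who wants to reproduce this, here is a minimal sketch of how the function might be exercised. The helper `MaximumExample` and its input values are hypothetical and only serve as an illustration; I am assuming a build along the lines of `g++ -O3 -fopt-info-vec`, which may differ from the actual setup.

```cpp
#include <cstddef>
#include <vector>

// Same function as in the post.
template<class T>
inline void Maximum(const T *array1, const T *array2, const size_t size, T *result)
{
    for (size_t i = 0; i < size; i++)
        result[i] = array1[i] > array2[i] ? array1[i] : array2[i];
}

// Hypothetical helper (not part of the original code): builds two small
// int arrays and returns their element-wise maximum.
std::vector<int> MaximumExample()
{
    const std::vector<int> a = {1, 7, 3, 9, 5, 2, 8, 4};
    const std::vector<int> b = {6, 2, 8, 4, 5, 9, 1, 3};
    std::vector<int> out(a.size());
    Maximum(a.data(), b.data(), a.size(), out.data());
    return out;  // {6, 7, 8, 9, 5, 9, 8, 4}
}
```

The arrays here are `int`, matching the `vector(4) int` type that shows up in the compiler notes.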
This code works fine, but I wanted it to be auto-vectorized, and it is not.
How can I achieve that?
I admit this is beyond me, and I would like to understand, please :-)
[EDIT]
When I use the "-fopt-info-vec-missed" flag, I get the messages in the first quote below, which say that the loop is not auto-vectorized; "ArrayMM.h:19:2:" refers to the two lines in question. With the "-fopt-info-vec" flag, on the other hand, I get the second, much shorter quote at the end of this post.
Quote:
ArrayMM.h:19:2: note: versioning for alias required: can't determine dependence between *_31 and *_30
ArrayMM.h:19:2: note: versioning for alias required: can't determine dependence between *_33 and *_30
ArrayMM.h:19:2: note: Unknown misalignment, is_packed = 0
ArrayMM.h:19:2: note: Unknown misalignment, is_packed = 0
ArrayMM.h:19:2: note: Unknown misalignment, is_packed = 0
ArrayMM.h:19:2: note: virtual phi. skip.
ArrayMM.h:19:2: note: not ssa-name.
ArrayMM.h:19:2: note: use not simple.
ArrayMM.h:19:2: note: not ssa-name.
ArrayMM.h:19:2: note: use not simple.
ArrayMM.h:19:2: note: not ssa-name.
ArrayMM.h:19:2: note: use not simple.
ArrayMM.h:19:2: note: not ssa-name.
ArrayMM.h:19:2: note: use not simple.
ArrayMM.h:19:2: note: virtual phi. skip.
ArrayMM.h:19:2: note: virtual phi. skip.
...
ArrayMM.h:20:9: note: not consecutive access _160 = *_159;
ArrayMM.h:20:9: note: not consecutive access _162 = *_161;
ArrayMM.h:20:9: note: not consecutive access *_158 = iftmp.2_163;
ArrayMM.h:20:9: note: Failed to SLP the basic block.
ArrayMM.h:20:9: note: not vectorized: failed to find SLP opportunities in basic block.
ArrayMM.h:20:9: note: not consecutive access _173 = *_172;
ArrayMM.h:20:9: note: not consecutive access _175 = *_174;
ArrayMM.h:20:9: note: not consecutive access *_171 = iftmp.2_176;
ArrayMM.h:20:9: note: Failed to SLP the basic block.
ArrayMM.h:20:9: note: not vectorized: failed to find SLP opportunities in basic block.
ArrayMM.h:20:9: note: not consecutive access _88 = *_87;
ArrayMM.h:20:9: note: not consecutive access _90 = *_89;
ArrayMM.h:20:9: note: not consecutive access *_86 = iftmp.2_91;
ArrayMM.h:20:9: note: Failed to SLP the basic block.
ArrayMM.h:20:9: note: not vectorized: failed to find SLP opportunities in basic block.
...
ArrayMM.h:20:21: note: not vectorized: no vectype for stmt: vect__32.38_132 = MEM[(int *)vectp.36_130];
scalar_type: vector(4) int
ArrayMM.h:20:21: note: not vectorized: not enough data-refs in basic block.
ArrayMM.h:20:9: note: not consecutive access _112 = *_111;
ArrayMM.h:20:9: note: not consecutive access _114 = *_113;
ArrayMM.h:20:9: note: not consecutive access *_110 = iftmp.2_115;
ArrayMM.h:20:9: note: Failed to SLP the basic block.
ArrayMM.h:20:9: note: not vectorized: failed to find SLP opportunities in basic block.
arrayTiTi_ArrayMM.cpp:20:24: note: not vectorized: not enough data-refs in basic block.
ArrayMM.h:20:9: note: not consecutive access _74 = *_99;
ArrayMM.h:20:9: note: not consecutive access _30 = *_29;
ArrayMM.h:20:9: note: not consecutive access *_51 = iftmp.2_31;
ArrayMM.h:20:9: note: Failed to SLP the basic block.
ArrayMM.h:20:9: note: not vectorized: failed to find SLP opportunities in basic block.
ArrayMM.h:20:9: note: not consecutive access _148 = *_147;
ArrayMM.h:20:9: note: not consecutive access _150 = *_149;
ArrayMM.h:20:9: note: not consecutive access *_146 = iftmp.2_151;
ArrayMM.h:20:9: note: Failed to SLP the basic block.
ArrayMM.h:20:9: note: not vectorized: failed to find SLP opportunities in basic block.
...
ArrayMM.h:20:9: note: SLP: step doesn't divide the vector-size.
ArrayMM.h:20:9: note: Unknown alignment for access: *_67
ArrayMM.h:20:9: note: SLP: step doesn't divide the vector-size.
ArrayMM.h:20:9: note: Unknown alignment for access: *_69
ArrayMM.h:20:9: note: SLP: step doesn't divide the vector-size.
ArrayMM.h:20:9: note: Unknown alignment for access: *_66
ArrayMM.h:20:9: note: Failed to SLP the basic block.
ArrayMM.h:20:9: note: not vectorized: failed to find SLP opportunities in basic block.
The "-fopt-info-vec" output, quoted next, would seem to say that the loop was vectorized...
Quote:
ArrayMM.h:19:2: note: loop vectorized
ArrayMM.h:19:2: note: loop versioned for vectorization because of possible aliasing
ArrayMM.h:19:2: note: loop peeled for vectorization to enhance alignment
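The "versioning for alias required" and "loop versioned for vectorization because of possible aliasing" notes suggest that GCC could not prove that `result` does not overlap `array1` or `array2`, so it kept a runtime check next to the vectorized loop. If the caller can guarantee that the three arrays never overlap, one thing that might be worth trying is the `__restrict__` qualifier. This is only a sketch: the name `MaximumNoAlias` is hypothetical, `__restrict__` is a GCC/Clang extension rather than standard C++, and whether it changes the generated code depends on the compiler version and flags.

```cpp
#include <cstddef>

// Hypothetical variant of Maximum (name is an assumption, not from the post).
// __restrict__ promises the compiler that the three arrays do not overlap;
// calling this with overlapping arrays is undefined behavior.
template<class T>
inline void MaximumNoAlias(const T *__restrict__ array1,
                           const T *__restrict__ array2,
                           const std::size_t size,
                           T *__restrict__ result)
{
    for (std::size_t i = 0; i < size; i++)
        result[i] = array1[i] > array2[i] ? array1[i] : array2[i];
}
```

Rebuilding with "-fopt-info-vec" would show whether the aliasing note is still reported for this variant.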