[OpenCL] Kernel compilation errors (code similar to C99)

Guest · 23/06/2015, 17:52

Hi, I am trying to compile a few kernels with OpenCL. Unfortunately, I am new to this technology and do not know the syntax at all; all I know is that it is the equivalent of C99. I end up with compilation errors and warnings whose meaning I do not understand at all.

Anyway, here is the code:

C++ code:

            std::string prog = "#pragma OPENCL EXTENSION cl_khr_byte_addressable_store : enable\n"
                               "const int stepXSize = 4;\n"
                               "const int stepYSize = 4;\n"
                               "float16 multMat (float16 matA, float16 matB) {\n"
                                   "float16 matC;\n"
                                   "for (int x = 0; x < 4; ++x) {\n"
                                        "for (int y = 0; y < 4; ++y) {\n"
                                            "float value = 0;\n"
                                            "for (int k = 0; k < 4; ++k) {\n"
                                                "float elementA = matA[y * 4 + k];\n"
                                                "float elementB = matB[k * 4 + x];\n"
                                                "value += elementA * elementB;\n"
                                            "}\n"
                                            "matC[y * 4 + x] = value;\n"
                                        "}\n"
                                    "}\n"
                                    "return matC;\n"
                                "}\n"
                                "float4 addVec (float4 vecA, float4 vecB) {\n"
                                    "float4 result;\n"
                                    "for (int i = 0; i < 4; i++) {\n"
                                        "result[i] = vecA[i] + vecB[i];\n"
                                    "}\n"
                                    "return result;\n"
                                "}\n"
                                "float4 multVec (float vecA, float vecB) {\n"
                                    "float4 result;\n"
                                    "for (int i = 0; i < 4; i++) {\n"
                                        "result[i] = vecA[i] * vecB[i];\n"
                                    "}\n"
                                    "return result;\n"
                                "}\n"
                                "float4 multMatVec (float16 matA, float4 vecB) {\n"
                                   "float4 vecC;\n"
                                   "for (int i = 0; i < 4; ++i) {\n"
                                        "float value = 0;\n"
                                        "for (int j = 0; j < 4; ++j) {\n"
                                            "value += vecB[j] * matA[i][j];\n"
                                        "}\n"
                                        "vecC[i] = value;\n"
                                   "}\n"
                                   "return vecC;\n"
                                "}\n"
                                "float16 transpose(float16 matA) {\n"
                                    "float16 matT\n"
                                    "for (int i = 0; i < 4; ++i) {\n"
                                        "for (int j = 0; j < 4; ++j) {\n"
                                            "matT[i][j] = matA[j][i];\n"
                                        "}\n"
                                    "}\n"
                                    "return matT;\n"
                                "}\n"
                                "float  det (float4 mat) {\n"
                                    "return mat[0] * mat[2] - mat[1] * mat[3];\n"
                                "}\n"
                                "float min (float3 vec) {\n"
                                    "float cmin = vec.x;\n"
                                    "for (int i = 1; i < 3; ++i) {\n"
                                        "if (vec[i] < cmin)\n"
                                            "cmin = vec[i];\n"
                                    "}\n"
                                    "return cmin;\n"
                                "}\n"
                                "float max (float3 vec) {\n"
                                    "float cmax = vec.x;\n"
                                    "for (int i = 1; i < 3; ++i) {\n"
                                        "if (vec[i] > cmax)\n"
                                            "cmax = vec[i];\n"
                                    "}\n"
                                    "return cmax;\n"
                                "}\n"
                                "int equal(float4 v1, float4 v2) {\n"
                                    "return v1.x == v2.x && v1.y == v2.y && v1.z == v2.z;\n"
                                "}\n"
                                "float4 initEdge (const float2 v0, const float2 v1, const float2 origin, float4 oneStepX, float4 oneStepY) {\n"
 
                                       "int a = v0.y - v1.y, b = v1.x - v0.x;\n"
                                       "int c = v0.x*v1.y - v0.y*v1.x;\n"
 
                                       "oneStepX = (float4) (a * stepXSize, a * stepXSize, a * stepXSize, a * stepXSize);\n"
                                       "oneStepY = (float4) (b * stepYSize, b * stepYSize, b * stepYSize, b * stepYSize);\n"
 
                                       "float4 x = addVec(float4 (origin.x, origin.x, origin.x, origin.x), float4(0,1,2,3));\n"
                                       "float4 y = addVec(float4 (origin.y, origin.y, origin.y, origin.y), float4(0,1,2,3));\n"
 
                                       "return addVec(addVec(multVec(float4(a, a, a, a), x), multVec(float4(b, b, b, b)*y)), float4(c, c, c, c));\n"
                                "}"
                                "__kernel void vertexShader(__global float* vPosX, __global float* vPosY, __global float* vPosZ, __global float* vPosW,\n"
                                                           "__global unsigned int* indices,  __global unsigned int numIndices, __global unsigned int* baseIndices,\n"
                                                           "__global unsigned int* baseVertices,  __global unsigned int* nbVerticesPerFace,\n"
                                                           "__global float* transfMatrices, __global float16 projMatrix, __global float16 viewMatrix,\n"
                                                           "__global float16 viewportMatrix) {\n"
                                    "size_t tid = get_global_id(0);\n"
                                    "int instanceID = tid / nbVerticesPerFace[0];\n"
                                    "int offset = tid % nbVerticesPerFace[0];\n"
                                    "float16 transfMatrix;\n"
                                    "float4 position = (float4) (vPosX[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]], vPosY[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]],\n"
                                    "vPosZ[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]], vPosW[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]]);\n"
                                    "for (int i = 0; i < 16; i++) {\n"
                                        "transfMatrix[i] = transfMatrices[instanceID*16+i];\n"
                                    "}\n"
                                    "float4 worldcoords = multMatVec(transfMatrix, position);\n"
                                    "float4 viewcoords = multMatVec(viewMatrix, worldcoords);\n"
                                    "float4 clipcoords = multMatVec(projMatrix, viewcoords);\n"
                                    "float4 ndcCoords = (float4) (clipcoords.x / clipcoords.w, clipcoords.y / clipcoords.w, clipcoords.z / clipcoords.w, 1 / clipcoords.w);\n"
                                    "position = multMatVec(viewportMatrix, ndcCoords);\n"
                                    "vPosX[tid] = position.x;\n"
                                    "vPosY[tid] = position.y;\n"
                                    "vPosZ[tid] = position.z;\n"
                                    "vPosW[tid] = position.w;\n"
                                "}\n"
                                "__kernel void geometryShader (__global float* vPosX, __global float* vPosY, __global float* vPosZ, __global float* vPosW,\n"
                                "                              __global float* outvPosX, __global float* outvPosY, __global float* outvPosZ, __global float* outvPosW,\n"
                                "                              __global unsigned int* indices,  __global unsigned int numIndices, __global unsigned int* baseIndices,\n"
                                "                              __global unsigned int* baseVertices, __global unsigned int* nbVerticesPerFace) {\n"
                                "   size_t tid = get_global_id(0);\n"
                                "   int instanceID = tid / nbVerticesPerFace[0];\n"
                                "   int offset = tid % nbVerticesPerFace[0];\n"
                                "   if (get_global_id(0) == numIndices-1) {\n"
                                "       if (nbVerticesPerFace[1] == 1) {\n"
                                "           float centerLineX=0, centerLineY=0, centerLineZ=0, centerLineW=0;\n"
                                "           float4 v1(vPosX[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]], vPosY[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]],\n"
                                "           vPosZ[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]],vPosW[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]]);\n"
                                "           float4 v2(vPosX[baseVertices[0]+indices[baseIndices[0]], vPosY[baseVertices[0]+indices[baseIndices[0]],\n"
                                "           vPosZ[baseVertices[0]	+ indices[baseIndices[0]],vPosW[baseVertices[0]+indices[baseIndices[0]]);\n"
                                "           centerLineX = (v1.x + v2.x) * 0.5;\n"
                                "           centerLineY = (v1.y + v2.y) * 0.5;\n"
                                "           centerLineZ = (v1.z + v2.z) * 0.5;\n"
                                "           centerLineW = (v1.w + v2.w) * 0.5;\n"
                                "           outvPosX[tid*2+1] = centerLineX;\n"
                                "           outvPosY[tid*2+1] = centerLineY;\n"
                                "           outvPosZ[tid*2+1] = centerLineZ;\n"
                                "           outvPosW[tid*2+1] = centerLineW;\n"
                                "       } else {\n"
                                "           float4 v(vPosX[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]], vPosY[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]],\n"
                                "           vPosZ[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]],vPosW[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]]);\n"
                                "           outvPosX[tid*2] = v.x;\n"
                                "           outvPosY[tid*2] = v.y;\n"
                                "           outvPosZ[tid*2] = v.z;\n"
                                "           outvPosW[tid*2] = v.w;\n"
                                "       }\n"
                                "    } else {\n"
                                "       tid2 = tid + 1;\n"
                                "       int instanceID2 = tid2 / nbVerticesPerFace[0];\n"
                                "       int offset2 = tid2 % nbVerticesPerFace[0];\n"
                                "       float centerLineX=0, centerLineY=0, centerLineZ=0, centerLineW=0;\n"
                                "       float4 v1(vPosX[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]], vPosY[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]],\n"
                                "       vPosZ[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]],vPosW[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]]);\n"
                                "       float4 v2(vPosX[baseVertices[instanceID2]+indices[baseIndices[instanceID2]+offset2]], vPosY[baseVertices[instanceID2]+indices[baseIndices[instanceID2]+offset2]],\n"
                                "       vPosZ[baseVertices[instanceID2]	+ indices[baseIndices[instanceID2]+offset2],vPosW[baseVertices[instanceID2]+indices[baseIndices[instanceID2]+offset2]]);\n"
                                "       centerLineX = (v1.x + v2.x) * 0.5;\n"
                                "       centerLineY = (v1.y + v2.y) * 0.5;\n"
                                "       centerLineZ = (v1.z + v2.z) * 0.5;\n"
                                "       centerLineW = (v1.w + v2.w) * 0.5;\n"
                                "       outvPosX[tid*2] = v1.x;\n"
                                "       outvPosY[tid*2] = v1.y;\n"
                                "       outvPosZ[tid*2] = v1.z;\n"
                                "       outvPosW[tid*2] = v1.w;\n"
                                "       outvPosX[tid*2+1] = centerLineX;\n"
                                "       outvPosY[tid*2+1] = centerLineY;\n"
                                "       outvPosZ[tid*2+1] = centerLineZ;\n"
                                "       outvPosW[tid*2+1] = centerLineW;\n"
                                "   }\n"
                                "}\n"
                                "__kernel void tesslationShader(__global float* vPosX, __global float* vPosY, __global float* vPosZ, __global float* vPosW,\n"
                                "                               __global unsigned int* nbVerticesPerFaces, __global unsigned float* centersX, __global unsigned float* centersY,\n"
                                "                               __global unsigned float* centersZ, __global unsigned float* centersW) {\n"
                                "   size_t tid = get_global_id(0);\n"
                                "   float centerX=0, centerY=0, centerZ=0, centerW = 0;\n"
                                "   for (unsigned int i = 0; i < nbVerticesPerFace[2]; i++) {\n"
                                "        centerX += vPosX[tid*nbVerticesPerFace[2]+i];\n"
                                "        centerY += vPosY[tid*nbVerticesPerFace[2]+i];\n"
                                "        centerZ += vPosZ[tid*nbVerticesPerFace[2]+i];\n"
                                "        centerW += vPosW[tid*nbVerticesPerFace[2]+i];\n"
                                "   }\n"
                                "   centerX = centerX / nbVerticesPerFace[2];\n"
                                "   centerY = centerY / nbVerticesPerFace[2];\n"
                                "   centerZ = centerZ / nbVerticesPerFace[2];\n"
                                "   centerW = centerW / nbVerticesPerFace[2];\n"
                                "   centersX[tid] = centerX;\n"
                                "   centersY[tid] = centerY;\n"
                                "   centersZ[tid] = centerZ;\n"
                                "   centersW[tid] = centerW;\n"
                                "}";

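A note for reading the log below: the line numbers reported by the OpenCL compiler refer to lines of the kernel string (one per `\n`), not to lines of the C++ file. Here is a minimal sketch for looking up the kernel line a diagnostic points at. It assumes a hypothetical helper named `kernelLine`, which is not part of OpenCL:

```cpp
#include <sstream>
#include <string>

// Hypothetical helper (not part of the OpenCL API): returns the 1-based
// line `n` of a kernel source string, or an empty string if `n` is out
// of range.
std::string kernelLine(const std::string& source, int n) {
    std::istringstream stream(source);
    std::string line;
    // std::getline splits on '\n', which is how the kernel string is
    // broken into lines by the OpenCL compiler.
    for (int current = 1; std::getline(stream, line); ++current) {
        if (current == n) {
            return line;
        }
    }
    return "";
}
```

For example, `kernelLine(prog, 2)` on the string above would return the `const int stepXSize = 4;` line, which matches the first warning in the log.
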
Here is what I get in my .log file:

Code:

"/tmp/OCL9227T1.cl", line 2: warning: global variable declaration is corrected
          by the compiler to have addrSpace constant
  const int stepXSize = 4;
            ^
 
"/tmp/OCL9227T1.cl", line 3: warning: global variable declaration is corrected
          by the compiler to have addrSpace constant
  const int stepYSize = 4;
            ^
 
"/tmp/OCL9227T1.cl", line 10: error: vector subscript not support
  float elementA = matA[y * 4 + k];
                        ^
 
"/tmp/OCL9227T1.cl", line 11: error: vector subscript not support
  float elementB = matB[k * 4 + x];
                        ^
 
"/tmp/OCL9227T1.cl", line 14: error: vector subscript not support
  matC[y * 4 + x] = value;
       ^
 
"/tmp/OCL9227T1.cl", line 17: warning: variable "matC" is used before its
          value is set
  return matC;
         ^
 
"/tmp/OCL9227T1.cl", line 22: error: vector subscript not support
  result[i] = vecA[i] + vecB[i];
         ^
 
"/tmp/OCL9227T1.cl", line 22: error: vector subscript not support
  result[i] = vecA[i] + vecB[i];
                   ^
 
"/tmp/OCL9227T1.cl", line 22: error: vector subscript not support
  result[i] = vecA[i] + vecB[i];
                             ^
 
"/tmp/OCL9227T1.cl", line 24: warning: variable "result" is used before its
          value is set
  return result;
         ^
 
"/tmp/OCL9227T1.cl", line 29: error: vector subscript not support
  result[i] = vecA[i] * vecB[i];
         ^
 
"/tmp/OCL9227T1.cl", line 29: error: expression must have pointer-to-object
          type
  result[i] = vecA[i] * vecB[i];
              ^
 
"/tmp/OCL9227T1.cl", line 29: error: expression must have pointer-to-object
          type
  result[i] = vecA[i] * vecB[i];
                        ^
 
"/tmp/OCL9227T1.cl", line 31: warning: variable "result" is used before its
          value is set
  return result;
         ^
 
"/tmp/OCL9227T1.cl", line 38: error: vector subscript not support
  value += vecB[j] * matA[i][j];
                ^
 
"/tmp/OCL9227T1.cl", line 38: error: vector subscript not support
  value += vecB[j] * matA[i][j];
                          ^
 
"/tmp/OCL9227T1.cl", line 40: error: vector subscript not support
  vecC[i] = value;
       ^
 
"/tmp/OCL9227T1.cl", line 42: warning: variable "vecC" is used before its
          value is set
  return vecC;
         ^
 
"/tmp/OCL9227T1.cl", line 46: error: expected a ";"
  for (int i = 0; i < 4; ++i) {
  ^
 
"/tmp/OCL9227T1.cl", line 51: warning: parsing restarts here after previous
          syntax error
  return matT;
             ^
 
"/tmp/OCL9227T1.cl", line 52: warning: missing return statement at end of
          non-void function "transpose"
  }
  ^
 
"/tmp/OCL9227T1.cl", line 45: warning: variable "matT" was declared but never
          referenced
  float16 matT
          ^
 
"/tmp/OCL9227T1.cl", line 54: error: vector subscript not support
  return mat[0] * mat[2] - mat[1] * mat[3];
             ^
 
"/tmp/OCL9227T1.cl", line 54: error: vector subscript not support
  return mat[0] * mat[2] - mat[1] * mat[3];
                      ^
 
"/tmp/OCL9227T1.cl", line 54: error: vector subscript not support
  return mat[0] * mat[2] - mat[1] * mat[3];
                               ^
 
"/tmp/OCL9227T1.cl", line 54: error: vector subscript not support
  return mat[0] * mat[2] - mat[1] * mat[3];
                                        ^
 
"/tmp/OCL9227T1.cl", line 56: error: declaration is incompatible with
          overloaded function "min"
  float min (float3 vec) {
        ^
 
"/tmp/OCL9227T1.cl", line 59: error: vector subscript not support
  if (vec[i] < cmin)
          ^
 
"/tmp/OCL9227T1.cl", line 60: error: vector subscript not support
  cmin = vec[i];
             ^
 
"/tmp/OCL9227T1.cl", line 64: error: declaration is incompatible with
          overloaded function "max"
  float max (float3 vec) {
        ^
 
"/tmp/OCL9227T1.cl", line 67: error: vector subscript not support
  if (vec[i] > cmax)
          ^
 
"/tmp/OCL9227T1.cl", line 68: error: vector subscript not support
  cmax = vec[i];
             ^
 
"/tmp/OCL9227T1.cl", line 80: error: type name is not allowed
  float4 x = addVec(float4 (origin.x, origin.x, origin.x, origin.x), float4(0,1,2,3));
                    ^
 
"/tmp/OCL9227T1.cl", line 80: error: type name is not allowed
  float4 x = addVec(float4 (origin.x, origin.x, origin.x, origin.x), float4(0,1,2,3));
                                                                     ^
 
"/tmp/OCL9227T1.cl", line 81: error: type name is not allowed
  float4 y = addVec(float4 (origin.y, origin.y, origin.y, origin.y), float4(0,1,2,3));
                    ^
 
"/tmp/OCL9227T1.cl", line 81: error: type name is not allowed
  float4 y = addVec(float4 (origin.y, origin.y, origin.y, origin.y), float4(0,1,2,3));
                                                                     ^
 
"/tmp/OCL9227T1.cl", line 82: error: type name is not allowed
  return addVec(addVec(multVec(float4(a, a, a, a), x), multVec(float4(b, b, b, b)*y)), float4(c, c, c, c));
                               ^
 
"/tmp/OCL9227T1.cl", line 82: error: argument of type "float4" is incompatible
          with parameter of type "float"
  return addVec(addVec(multVec(float4(a, a, a, a), x), multVec(float4(b, b, b, b)*y)), float4(c, c, c, c));
                                                   ^
 
"/tmp/OCL9227T1.cl", line 82: error: type name is not allowed
  return addVec(addVec(multVec(float4(a, a, a, a), x), multVec(float4(b, b, b, b)*y)), float4(c, c, c, c));
                                                               ^
 
"/tmp/OCL9227T1.cl", line 82: error: mixed vector-scalar operation not allowed
          unless up-convertable(scalar-type=>vector-element-type)
  return addVec(addVec(multVec(float4(a, a, a, a), x), multVec(float4(b, b, b, b)*y)), float4(c, c, c, c));
                                                                                  ^
 
"/tmp/OCL9227T1.cl", line 82: error: too few arguments in function call
  return addVec(addVec(multVec(float4(a, a, a, a), x), multVec(float4(b, b, b, b)*y)), float4(c, c, c, c));
                                                                                   ^
 
"/tmp/OCL9227T1.cl", line 82: error: type name is not allowed
  return addVec(addVec(multVec(float4(a, a, a, a), x), multVec(float4(b, b, b, b)*y)), float4(c, c, c, c));
                                                                                       ^
 
"/tmp/OCL9227T1.cl", line 75: warning: parameter "oneStepX" was set but never
          used
  float4 initEdge (const float2 v0, const float2 v1, const float2 origin, float4 oneStepX, float4 oneStepY) {
                                                                                 ^
 
"/tmp/OCL9227T1.cl", line 75: warning: parameter "oneStepY" was set but never
          used
  float4 initEdge (const float2 v0, const float2 v1, const float2 origin, float4 oneStepX, float4 oneStepY) {
                                                                                                  ^
 
"/tmp/OCL9227T1.cl", line 84: error: a parameter cannot be allocated in a
          named address space
  __global unsigned int* indices,  __global unsigned int numIndices, __global unsigned int* baseIndices,
                                   ^
 
"/tmp/OCL9227T1.cl", line 86: error: a parameter cannot be allocated in a
          named address space
  __global float* transfMatrices, __global float16 projMatrix, __global float16 viewMatrix,
                                  ^
 
"/tmp/OCL9227T1.cl", line 86: error: a parameter cannot be allocated in a
          named address space
  __global float* transfMatrices, __global float16 projMatrix, __global float16 viewMatrix,
                                                               ^
 
"/tmp/OCL9227T1.cl", line 87: error: a parameter cannot be allocated in a
          named address space
  __global float16 viewportMatrix) {
  ^
 
"/tmp/OCL9227T1.cl", line 95: error: vector subscript not support
  transfMatrix[i] = transfMatrices[instanceID*16+i];
               ^
 
"/tmp/OCL9227T1.cl", line 97: warning: variable "transfMatrix" is used before
          its value is set
  float4 worldcoords = multMatVec(transfMatrix, position);
                                  ^
 
"/tmp/OCL9227T1.cl", line 109: error: a parameter cannot be allocated in a
          named address space
                                __global unsigned int* indices,  __global unsigned int numIndices, __global unsigned int* baseIndices,
                                                                 ^
 
"/tmp/OCL9227T1.cl", line 117: error: parameter "vPosX" is not a type name
             float4 v1(vPosX[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]], vPosY[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]],
                       ^
 
"/tmp/OCL9227T1.cl", line 117: error: expression must have a constant value
             float4 v1(vPosX[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]], vPosY[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]],
                             ^
 
"/tmp/OCL9227T1.cl", line 117: error: expression must have a constant value
             float4 v1(vPosX[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]], vPosY[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]],
                                          ^
 
"/tmp/OCL9227T1.cl", line 117: error: expression must have a constant value
             float4 v1(vPosX[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]], vPosY[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]],
                                                              ^
 
"/tmp/OCL9227T1.cl", line 117: error: expression must have a constant value
             float4 v1(vPosX[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]], vPosY[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]],
                                                                          ^
 
"/tmp/OCL9227T1.cl", line 117: error: expression must have a constant value
             float4 v1(vPosX[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]], vPosY[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]],
                                                                                      ^
 
"/tmp/OCL9227T1.cl", line 117: error: expression must have a constant value
             float4 v1(vPosX[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]], vPosY[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]],
                                                      ^
 
"/tmp/OCL9227T1.cl", line 117: error: parameter "vPosY" is not a type name
             float4 v1(vPosX[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]], vPosY[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]],
                                                                                                ^
 
"/tmp/OCL9227T1.cl", line 117: error: expression must have a constant value
             float4 v1(vPosX[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]], vPosY[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]],
                                                                                                      ^
 
"/tmp/OCL9227T1.cl", line 117: error: expression must have a constant value
             float4 v1(vPosX[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]], vPosY[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]],
                                                                                                                   ^
 
"/tmp/OCL9227T1.cl", line 117: error: expression must have a constant value
             float4 v1(vPosX[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]], vPosY[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]],
                                                                                                                                       ^
 
"/tmp/OCL9227T1.cl", line 117: error: expression must have a constant value
             float4 v1(vPosX[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]], vPosY[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]],
                                                                                                                                                   ^
 
"/tmp/OCL9227T1.cl", line 117: error: expression must have a constant value
             float4 v1(vPosX[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]], vPosY[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]],
                                                                                                                                                               ^
 
"/tmp/OCL9227T1.cl", line 117: error: expression must have a constant value
             float4 v1(vPosX[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]], vPosY[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]],
                                                                                                                               ^
 
"/tmp/OCL9227T1.cl", line 118: error: parameter "vPosZ" is not a type name
             vPosZ[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]],vPosW[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]]);
             ^
 
"/tmp/OCL9227T1.cl", line 118: error: expression must have a constant value
             vPosZ[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]],vPosW[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]]);
                   ^
 
"/tmp/OCL9227T1.cl", line 118: error: expression must have a constant value
             vPosZ[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]],vPosW[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]]);
                                ^
 
"/tmp/OCL9227T1.cl", line 118: error: expression must have a constant value
             vPosZ[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]],vPosW[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]]);
                                                    ^
 
"/tmp/OCL9227T1.cl", line 118: error: expression must have a constant value
             vPosZ[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]],vPosW[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]]);
                                                                ^
 
"/tmp/OCL9227T1.cl", line 118: error: expression must have a constant value
             vPosZ[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]],vPosW[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]]);
                                                                            ^
 
"/tmp/OCL9227T1.cl", line 118: error: expression must have a constant value
             vPosZ[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]],vPosW[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]]);
                                            ^
 
"/tmp/OCL9227T1.cl", line 118: error: parameter "vPosW" is not a type name
             vPosZ[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]],vPosW[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]]);
                                                                                     ^
 
"/tmp/OCL9227T1.cl", line 118: error: expression must have a constant value
             vPosZ[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]],vPosW[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]]);
                                                                                           ^
 
"/tmp/OCL9227T1.cl", line 118: error: expression must have a constant value
             vPosZ[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]],vPosW[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]]);
                                                                                                        ^
 
"/tmp/OCL9227T1.cl", line 118: error: expression must have a constant value
             vPosZ[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]],vPosW[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]]);
                                                                                                                            ^
 
"/tmp/OCL9227T1.cl", line 118: error: expression must have a constant value
             vPosZ[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]],vPosW[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]]);
                                                                                                                                        ^
 
"/tmp/OCL9227T1.cl", line 118: error: expression must have a constant value
             vPosZ[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]],vPosW[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]]);
                                                                                                                                                    ^
 
"/tmp/OCL9227T1.cl", line 118: error: expression must have a constant value
             vPosZ[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]],vPosW[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]]);
                                                                                                                    ^
 
"/tmp/OCL9227T1.cl", line 119: error: parameter "vPosX" is not a type name
             float4 v2(vPosX[baseVertices[0]+indices[baseIndices[0]], vPosY[baseVertices[0]+indices[baseIndices[0]],
                       ^
 
"/tmp/OCL9227T1.cl", line 119: error: expression must have a constant value
             float4 v2(vPosX[baseVertices[0]+indices[baseIndices[0]], vPosY[baseVertices[0]+indices[baseIndices[0]],
                             ^
 
"/tmp/OCL9227T1.cl", line 119: error: expression must have a constant value
             float4 v2(vPosX[baseVertices[0]+indices[baseIndices[0]], vPosY[baseVertices[0]+indices[baseIndices[0]],
                                                     ^
 
"/tmp/OCL9227T1.cl", line 119: error: expression must have a constant value
             float4 v2(vPosX[baseVertices[0]+indices[baseIndices[0]], vPosY[baseVertices[0]+indices[baseIndices[0]],
                                             ^
 
"/tmp/OCL9227T1.cl", line 119: error: expected a "]"
             float4 v2(vPosX[baseVertices[0]+indices[baseIndices[0]], vPosY[baseVertices[0]+indices[baseIndices[0]],
                                                                    ^
 
"/tmp/OCL9227T1.cl", line 119: error: parameter "vPosY" is not a type name
             float4 v2(vPosX[baseVertices[0]+indices[baseIndices[0]], vPosY[baseVertices[0]+indices[baseIndices[0]],
                                                                      ^
 
"/tmp/OCL9227T1.cl", line 119: error: expression must have a constant value
             float4 v2(vPosX[baseVertices[0]+indices[baseIndices[0]], vPosY[baseVertices[0]+indices[baseIndices[0]],
                                                                            ^
 
"/tmp/OCL9227T1.cl", line 119: error: expression must have a constant value
             float4 v2(vPosX[baseVertices[0]+indices[baseIndices[0]], vPosY[baseVertices[0]+indices[baseIndices[0]],
                                                                                                    ^
 
"/tmp/OCL9227T1.cl", line 119: error: expression must have a constant value
             float4 v2(vPosX[baseVertices[0]+indices[baseIndices[0]], vPosY[baseVertices[0]+indices[baseIndices[0]],
                                                                                            ^
 
"/tmp/OCL9227T1.cl", line 119: error: expected a "]"
             float4 v2(vPosX[baseVertices[0]+indices[baseIndices[0]], vPosY[baseVertices[0]+indices[baseIndices[0]],
                                                                                                                   ^
 
"/tmp/OCL9227T1.cl", line 120: error: parameter "vPosZ" is not a type name
             vPosZ[baseVertices[0]	+ indices[baseIndices[0]],vPosW[baseVertices[0]+indices[baseIndices[0]]);
             ^
 
"/tmp/OCL9227T1.cl", line 120: error: expression must have a constant value
             vPosZ[baseVertices[0]	+ indices[baseIndices[0]],vPosW[baseVertices[0]+indices[baseIndices[0]]);
                   ^
 
"/tmp/OCL9227T1.cl", line 120: error: expression must have a constant value
             vPosZ[baseVertices[0]	+ indices[baseIndices[0]],vPosW[baseVertices[0]+indices[baseIndices[0]]);
                                  	          ^
 
"/tmp/OCL9227T1.cl", line 120: error: expression must have a constant value
             vPosZ[baseVertices[0]	+ indices[baseIndices[0]],vPosW[baseVertices[0]+indices[baseIndices[0]]);
                                  	  ^
 
"/tmp/OCL9227T1.cl", line 120: error: expected a "]"
             vPosZ[baseVertices[0]	+ indices[baseIndices[0]],vPosW[baseVertices[0]+indices[baseIndices[0]]);
                                  	                         ^
 
"/tmp/OCL9227T1.cl", line 120: error: parameter "vPosW" is not a type name
             vPosZ[baseVertices[0]	+ indices[baseIndices[0]],vPosW[baseVertices[0]+indices[baseIndices[0]]);
                                  	                          ^
 
"/tmp/OCL9227T1.cl", line 120: error: expression must have a constant value
             vPosZ[baseVertices[0]	+ indices[baseIndices[0]],vPosW[baseVertices[0]+indices[baseIndices[0]]);
                                  	                                ^
 
"/tmp/OCL9227T1.cl", line 120: error: expression must have a constant value
             vPosZ[baseVertices[0]	+ indices[baseIndices[0]],vPosW[baseVertices[0]+indices[baseIndices[0]]);
                                  	                                                        ^
 
"/tmp/OCL9227T1.cl", line 120: error: expression must have a constant value
             vPosZ[baseVertices[0]	+ indices[baseIndices[0]],vPosW[baseVertices[0]+indices[baseIndices[0]]);
                                  	                                                ^
 
"/tmp/OCL9227T1.cl", line 120: error: expected a "]"
             vPosZ[baseVertices[0]	+ indices[baseIndices[0]],vPosW[baseVertices[0]+indices[baseIndices[0]]);
                                  	                                                                       ^
 
"/tmp/OCL9227T1.cl", line 121: error: expression must have struct or union type
             centerLineX = (v1.x + v2.x) * 0.5;
                            ^
 
"/tmp/OCL9227T1.cl", line 121: error: expression must have struct or union type
             centerLineX = (v1.x + v2.x) * 0.5;
                                   ^
 
"/tmp/OCL9227T1.cl", line 122: error: expression must have struct or union type
             centerLineY = (v1.y + v2.y) * 0.5;
                            ^
 
"/tmp/OCL9227T1.cl", line 122: error: expression must have struct or union type
             centerLineY = (v1.y + v2.y) * 0.5;
                                   ^
 
"/tmp/OCL9227T1.cl", line 123: error: expression must have struct or union type
             centerLineZ = (v1.z + v2.z) * 0.5;
                            ^
 
"/tmp/OCL9227T1.cl", line 123: error: expression must have struct or union type
             centerLineZ = (v1.z + v2.z) * 0.5;
                                   ^
 
"/tmp/OCL9227T1.cl", line 124: error: expression must have struct or union type
             centerLineW = (v1.w + v2.w) * 0.5;
                            ^
 
"/tmp/OCL9227T1.cl", line 124: error: expression must have struct or union type
             centerLineW = (v1.w + v2.w) * 0.5;
                                   ^
 
"/tmp/OCL9227T1.cl", line 130: error: parameter "vPosX" is not a type name
             float4 v(vPosX[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]], vPosY[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]],
                      ^
 
"/tmp/OCL9227T1.cl", line 130: error: expression must have a constant value
             float4 v(vPosX[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]], vPosY[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]],
                            ^
 
"/tmp/OCL9227T1.cl", line 130: error: expression must have a constant value
             float4 v(vPosX[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]], vPosY[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]],
                                         ^
 
"/tmp/OCL9227T1.cl", line 130: error: expression must have a constant value
             float4 v(vPosX[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]], vPosY[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]],
                                                             ^
 
"/tmp/OCL9227T1.cl", line 130: error: expression must have a constant value
             float4 v(vPosX[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]], vPosY[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]],
                                                                         ^
 
Error limit reached.
100 errors detected in the compilation of "/tmp/OCL9227T1.cl".
Compilation terminated.
 
Frontend phase failed compilation.

Thanks in advance for your help.

**koala01** · 24/06/2015, 02h44

Hi,

I don't know OpenCL at all, so I may well be saying something obviously wrong.

But if we ignore the first two entries of the log, which are only warnings, the first thing that bothers the compiler is your "one-dimensional" use of indices, in the form matA[y * 4 + k].

So my first reaction would be to wonder whether OpenCL always uses two-dimensional matrices, and to try changing the code so that it takes the form

Code:

"float elementA = matA[y][k];\n"
"float elementB = matB[k][x];\n"

(lines 10 and 11)
That said, I would be surprised if you had not already tried writing it this way.

But who knows... Doesn't that make your errors start somewhere else?

**ternel** · 24/06/2015, 08h38

Is float16 a matrix type? I thought it was a number type.

Because, roughly, your code looks like this:

Code:

float16 matA =...;
float16 res = matA[4];

So if float16 is a 16-bit float (which is very likely, given the name), this does not compile at all.

Invité · 24/06/2015, 09h21

Hi,

so float4 and float16 were number types and not matrix types, so I fixed the whole code by replacing the float16 and float4 values with arrays. (Too used to GLSL :/)
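
For reference, OpenCL C does define float4 and float16, but as vector types of 4 and 16 floats: components are selected with suffixes such as .x to .w (for float4) or .s0 to .sF, not with a subscript, and a vector is built with a cast-style literal such as (float4)(a, b, c, d) rather than a constructor call. The array version mentioned above could look like the following sketch, written as plain C++ so that it can be compiled on the host. The name multMat4 and the row-major layout are assumptions, not the poster's actual code.

```cpp
#include <cassert>
#include <cstddef>

// Multiplies two 4x4 matrices stored as flat, row-major arrays of 16 floats.
// Hypothetical helper: the name and the layout are assumptions for illustration.
// `out` must not alias `a` or `b`.
void multMat4(const float* a, const float* b, float* out) {
    assert(out != a && out != b);
    for (std::size_t row = 0; row < 4; ++row) {
        for (std::size_t col = 0; col < 4; ++col) {
            float value = 0.0f;
            for (std::size_t k = 0; k < 4; ++k) {
                // Same one-dimensional indexing as matA[y * 4 + k] in the thread.
                value += a[row * 4 + k] * b[k * 4 + col];
            }
            out[row * 4 + col] = value;
        }
    }
}
```

The one-dimensional index y * 4 + k is perfectly valid on an array of 16 floats; it was only the subscript on a vector type that the compiler rejected.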

Now it compiles, except for one error that I don't understand.
Here is the line where it fails:

C++ code:

float  det (float* mat, const int n) {
        int d=0, p, h, k, i, j;
        const int m = n * n;
        float temp[m];

And here is the error:

Code:

"/tmp/OCL4734T1.cl", line 2: warning: global variable declaration is corrected
          by the compiler to have addrSpace constant
  const int stepXSize = 4;
            ^
 
"/tmp/OCL4734T1.cl", line 3: warning: global variable declaration is corrected
          by the compiler to have addrSpace constant
  const int stepYSize = 4;
            ^
 
"/tmp/OCL4734T1.cl", line 46: error: expression must have a constant value
  float temp[m];
             ^
 
"/tmp/OCL4734T1.cl", line 219: warning: null (zero) character in input line
          ignored
  } 
   ^
 
1 error detected in the compilation of "/tmp/OCL4734T1.cl".
 
Frontend phase failed compilation.

Why does it tell me that the expression must be a constant?
I did declare n and m as const, though.

Thanks in advance for your help.

PS: In fact the code is C, except that malloc and free are not supported.

**ternel** · 24/06/2015, 09h37

Because an array declaration must use a size that is known at compile time.

It is the value that must be constant, not the type of the variable (even though the one often implies the other),
and the value of n is not constant.

On the other hand, you presumably have only a few legitimate values of n.
You should be able to use a switch or a few ifs.

The absence of malloc/free does not prevent you from using a pointer.

PS: and what are all those variables (d, p, h, k)? (I assume i and j are loop counters.)
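
The "switch or a few ifs" suggestion can be sketched as follows, assuming that only 1x1, 2x2 and 3x3 matrices are needed (an assumption; the thread does not say which sizes occur). Each case uses a closed-form formula, so no temporary array and no recursion is required. The name detSmall is hypothetical, and the code is plain C++ rather than the poster's kernel.

```cpp
#include <cassert>

// Determinant of a small n x n matrix stored row-major in `mat`.
// Hypothetical sketch: only n = 1, 2 and 3 are handled (an assumption).
float detSmall(const float* mat, int n) {
    assert(mat != nullptr);
    switch (n) {
    case 1:
        return mat[0];
    case 2:
        return mat[0] * mat[3] - mat[1] * mat[2];
    case 3:
        // Cofactor expansion along the first row, written out by hand.
        return mat[0] * (mat[4] * mat[8] - mat[5] * mat[7])
             - mat[1] * (mat[3] * mat[8] - mat[5] * mat[6])
             + mat[2] * (mat[3] * mat[7] - mat[4] * mat[6]);
    default:
        assert(false && "unsupported matrix size");
        return 0.0f;
    }
}
```

Because every branch works directly on the input, the question of how large the scratch array should be never arises.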

Invité · 24/06/2015, 10h10

But n is known at compile time, since I pass a constant value for n when I call the function.

OK, the value of n changes at each call of the function, but in that case a new array is created in the function each time.

Maybe the function has to be declared inline,
but I can't do that since it is recursive.

The function computes the determinant of an n*n matrix using the rule of minors and cofactors.

That is why I declared all those other variables (d, p, h, k).

**ternel** · 24/06/2015, 10h55

But a value that changes is not known at compile time.
The value must be a constant at the level of the statement.
That is clearly not the case here, since there is recursion.
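
One way to get a size that is constant at the level of the declaration is to size the scratch array with a compile-time upper bound and to drop the recursion altogether (OpenCL C does not support recursive functions in any case). The sketch below swaps the minors-and-cofactors rule for Gaussian elimination with partial pivoting; it is not the poster's method. MAX_N and detElim are hypothetical names, the bound of 4 is an assumption, and the code is plain C++ meant to be checked on the host.

```cpp
#include <cassert>
#include <cmath>
#include <utility>

constexpr int MAX_N = 4;  // compile-time upper bound (assumed)

// Determinant of an n x n row-major matrix, 1 <= n <= MAX_N,
// computed iteratively by Gaussian elimination with partial pivoting.
float detElim(const float* mat, int n) {
    assert(mat != nullptr && n >= 1 && n <= MAX_N);
    float temp[MAX_N * MAX_N];  // the size is a constant expression
    for (int i = 0; i < n * n; ++i) temp[i] = mat[i];

    float det = 1.0f;
    for (int col = 0; col < n; ++col) {
        // Pick the row with the largest absolute value in this column.
        int pivot = col;
        for (int row = col + 1; row < n; ++row) {
            if (std::fabs(temp[row * n + col]) > std::fabs(temp[pivot * n + col])) {
                pivot = row;
            }
        }
        if (temp[pivot * n + col] == 0.0f) return 0.0f;  // singular matrix
        if (pivot != col) {
            for (int k = 0; k < n; ++k) {
                std::swap(temp[col * n + k], temp[pivot * n + k]);
            }
            det = -det;  // a row swap flips the sign of the determinant
        }
        det *= temp[col * n + col];
        // Eliminate the entries below the pivot.
        for (int row = col + 1; row < n; ++row) {
            float factor = temp[row * n + col] / temp[col * n + col];
            for (int k = col; k < n; ++k) {
                temp[row * n + k] -= factor * temp[col * n + k];
            }
        }
    }
    return det;
}
```

The array is declared once with MAX_N * MAX_N elements, and n only limits how much of it is used, so the declaration no longer depends on a run-time value.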

Invité · 24/06/2015, 11h32

Ah OK, so the value must be constant at the level of the statement, not at the level of the function.

OK, I put a constant value at the statement level and now it compiles.

However, I get a crash when reading the values from the output buffer.

C++ code:

"__kernel void vertexShader(__global float* vPosX, __global float* vPosY, __global float* vPosZ, __global float* vPosW,\n"
                                                           "__global float* outvPosX, __global float* outvPosY, __global float* outvPosZ, __global float* outvPosW,"
                                                           "__global unsigned int* indices,  __global unsigned int* numIndices, __global unsigned int* baseIndices,\n"
                                                           "__global unsigned int* baseVertices,  __global unsigned int* nbVerticesPerFace,\n"
                                                           "__global float* transfMatrices, __global float* projMatrix, __global float* viewMatrix,\n"
                                                           "__global float* viewportMatrix) {\n"
                                    "size_t tid = get_global_id(0);\n"
                                    "int instanceID = tid / nbVerticesPerFace[0];\n"
                                    "int offset = tid % nbVerticesPerFace[0];\n"
                                    "float transfMatrix[16];\n"
                                    "float position[4] = {vPosX[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]], vPosY[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]],\n"
                                    "vPosZ[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]], vPosW[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]]};\n"
                                    "for (int i = 0; i < 16; i++) {\n"
                                        "transfMatrix[i] = transfMatrices[instanceID*16+i];\n"
                                    "}\n"
                                    "float worldcoords[4];\n;"
                                    "multMatVec(transfMatrix, position, worldcoords, 4);\n"
                                    "float viewcoords[4];\n"
                                    "multMatVec(viewMatrix, worldcoords, viewcoords, 4);\n"
                                    "float clipcoords[4];\n"
                                    "multMatVec(projMatrix, viewcoords, clipcoords, 4);\n"
                                    "float ndcCoords[] = {clipcoords[0] / clipcoords[3], clipcoords[1] / clipcoords[3], clipcoords[2] / clipcoords[3], 1 / clipcoords[3]};\n"
                                    "float finalPos[4];"
                                    "multMatVec(viewportMatrix, ndcCoords, finalPos, 4);\n"
                                    "outvPosX[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]] = finalPos[0];\n"
                                    "outvPosY[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]] = finalPos[1];\n"
                                    "outvPosZ[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]] = finalPos[2];\n"
                                    "outvPosW[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]] = finalPos[3];\n"
                                "}\n"

The crash happens here:

C++ code:

cl::Event event;
err = clqueue.enqueueNDRangeKernel(clkvertexShader,
                                   1, workgroupSize,
                                   1, nullptr, &event);
checkErr(err, "ComamndQueue::enqueueNDRangeKernel()");
event.wait();
err = clqueue.enqueueReadBuffer(cvposXBuffer,CL_TRUE,0,workgroupSize,vvPosX.data());
checkErr(err, "ComamndQueue::enqueueReadBuffer()");
err = clqueue.enqueueReadBuffer(cvposYBuffer,CL_TRUE,0,workgroupSize,vvPosY.data());
checkErr(err, "ComamndQueue::enqueueReadBuffer()");
err = clqueue.enqueueReadBuffer(cvposZBuffer,CL_TRUE,0,workgroupSize,vvPosZ.data());
checkErr(err, "ComamndQueue::enqueueReadBuffer()");
err = clqueue.enqueueReadBuffer(cvposWBuffer,CL_TRUE,0,workgroupSize,vvPosW.data());
checkErr(err, "ComamndQueue::enqueueReadBuffer()");
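
Two details of the calls above may be worth checking, with the caveat that the thread does not show how workgroupSize is defined. In the OpenCL C++ wrapper, the size argument of enqueueReadBuffer is a byte count, and the arguments of enqueueNDRangeKernel that follow the kernel are, in order, the global offset, the global size and the local size. If workgroupSize holds a number of vertices (an assumption), each read copies only part of the floats, and the leading 1 would be taken as a global offset of one work-item, which would shift get_global_id(0) to the range 1..N. The helpers below are a host-only C++ sketch of that arithmetic; bytesForFloats and lastGlobalId are hypothetical names, not part of the OpenCL API.

```cpp
#include <cassert>
#include <cstddef>

// Number of bytes needed to transfer `count` floats.
// Hypothetical helper, meant for the `size` argument of a buffer read.
std::size_t bytesForFloats(std::size_t count) {
    return count * sizeof(float);
}

// Largest value get_global_id(0) can take for a 1-D launch with the given
// global offset and global size (globalSize must be at least 1).
std::size_t lastGlobalId(std::size_t globalOffset, std::size_t globalSize) {
    assert(globalSize >= 1);
    return globalOffset + globalSize - 1;
}
```

With N vertices, an offset of 0 yields ids 0..N-1, whereas an offset of 1 yields 1..N, so the last work-item would compute its indices from an id one past the intended range. Whether this is what crashes here cannot be confirmed from the thread.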

Here is where it crashes. (In case someone here knows assembly.)

Code:

0x7fffea1c8503	mov    r8d,ecx
0x7fffea1c8506	shl    r8d,0x4
0x7fffea1c850a	mov    eax,DWORD PTR [rbx+rax*4]
0x7fffea1c850d	add    eax,DWORD PTR [r9+rcx*4]
0x7fffea1c8511	mov    rcx,QWORD PTR [rsp-0x68]
0x7fffea1c8516	movss  xmm9,DWORD PTR [rcx+rax*4]
0x7fffea1c851c	xor    ebp,ebp
0x7fffea1c851e	mov    rcx,QWORD PTR [rsp-0x70]
0x7fffea1c8523	movss  xmm5,DWORD PTR [rcx+rax*4]
0x7fffea1c8528	mov    rcx,QWORD PTR [rsp-0x80]
0x7fffea1c852d	movss  xmm6,DWORD PTR [rcx+rax*4]
0x7fffea1c8532	mov    rcx,QWORD PTR [rsp-0x78]
0x7fffea1c8537	movss  xmm7,DWORD PTR [rcx+rax*4]
0x7fffea1c853c	nop    DWORD PTR [rax+0x0]
0x7fffea1c8540	lea    rax,[r8+rbp*1]
0x7fffea1c8544	movsxd rsi,eax
0x7fffea1c8547	movss  xmm0,DWORD PTR [r10+rsi*4]
0x7fffea1c854d	movss  DWORD PTR [rsp+rbp*4+0x8],xmm0
0x7fffea1c8553	lea    eax,[rsi+0x1]
0x7fffea1c8556	movsxd rax,eax
0x7fffea1c8559	movss  xmm0,DWORD PTR [r10+rax*4]
0x7fffea1c855f	movss  DWORD PTR [rsp+rbp*4+0xc],xmm0
0x7fffea1c8565	lea    eax,[rsi+0x2]
0x7fffea1c8568	movsxd rcx,eax
0x7fffea1c856b	lea    eax,[rsi+0x6]
0x7fffea1c856e	lea    rdi,[rbp+0x8]
0x7fffea1c8572	movss  xmm0,DWORD PTR [r10+rcx*4]
0x7fffea1c8578	movss  DWORD PTR [rsp+rbp*4+0x10],xmm0
0x7fffea1c857e	lea    ebx,[rsi+0x4]
0x7fffea1c8581	lea    ecx,[rsi+0x5]
0x7fffea1c8584	lea    edx,[rsi+0x7]
0x7fffea1c8587	movsxd r14,edx
0x7fffea1c858a	movsxd rax,eax
0x7fffea1c858d	movsxd r13,ecx
0x7fffea1c8590	movsxd r9,ebx
0x7fffea1c8593	add    esi,0x3
0x7fffea1c8596	movsxd rcx,esi
0x7fffea1c8599	cmp    edi,0x10
0x7fffea1c859c	movss  xmm0,DWORD PTR [r10+rcx*4]
0x7fffea1c85a2	movss  DWORD PTR [rsp+rbp*4+0x14],xmm0
0x7fffea1c85a8	movss  xmm0,DWORD PTR [r10+r9*4]
0x7fffea1c85ae	movss  DWORD PTR [rsp+rbp*4+0x18],xmm0
0x7fffea1c85b4	movss  xmm0,DWORD PTR [r10+r13*4]
0x7fffea1c85ba	movss  DWORD PTR [rsp+rbp*4+0x1c],xmm0
0x7fffea1c85c0	movss  xmm0,DWORD PTR [r10+rax*4]
0x7fffea1c85c6	movss  DWORD PTR [rsp+rbp*4+0x20],xmm0
0x7fffea1c85cc	movss  xmm0,DWORD PTR [r10+r14*4]
0x7fffea1c85d2	movss  DWORD PTR [rsp+rbp*4+0x24],xmm0
0x7fffea1c85d8	mov    rbp,rdi
0x7fffea1c85db	jl     0x7fffea1c8540 <__OpenCL_vertexShader_stub+384>
0x7fffea1c85e1	movss  xmm0,DWORD PTR [rsp+0x18]
0x7fffea1c85e7	mulss  xmm0,xmm7
0x7fffea1c85eb	addss  xmm0,xmm8
0x7fffea1c85f0	movss  xmm1,DWORD PTR [rsp+0x1c]
0x7fffea1c85f6	mulss  xmm1,xmm6
0x7fffea1c85fa	addss  xmm1,xmm0
0x7fffea1c85fe	movss  xmm2,DWORD PTR [rsp+0x20]
0x7fffea1c8604	mulss  xmm2,xmm5
0x7fffea1c8608	movss  xmm0,DWORD PTR [rsp+0x8]
0x7fffea1c860e	mulss  xmm0,xmm7
0x7fffea1c8612	addss  xmm0,xmm8
0x7fffea1c8617	addss  xmm2,xmm1
0x7fffea1c861b	movss  xmm10,DWORD PTR [rsp+0x24]
0x7fffea1c8622	mulss  xmm10,xmm9
0x7fffea1c8627	addss  xmm10,xmm2
0x7fffea1c862c	movss  xmm2,DWORD PTR [rsp+0xc]
0x7fffea1c8632	mulss  xmm2,xmm6
0x7fffea1c8636	movss  xmm1,DWORD PTR [r15+0x4]
0x7fffea1c863c	mulss  xmm1,xmm10
0x7fffea1c8641	addss  xmm2,xmm0
0x7fffea1c8645	movss  xmm0,DWORD PTR [rsp+0x10]
0x7fffea1c864b	mulss  xmm0,xmm5
0x7fffea1c864f	addss  xmm0,xmm2
0x7fffea1c8653	movss  xmm4,DWORD PTR [rsp+0x14]
0x7fffea1c8659	mulss  xmm4,xmm9
0x7fffea1c865e	addss  xmm4,xmm0
0x7fffea1c8662	movss  xmm0,DWORD PTR [r15]
0x7fffea1c8667	mulss  xmm0,xmm4
0x7fffea1c866b	addss  xmm0,xmm8
0x7fffea1c8670	movss  xmm2,DWORD PTR [rsp+0x28]
0x7fffea1c8676	mulss  xmm2,xmm7
0x7fffea1c867a	addss  xmm0,xmm1
0x7fffea1c867e	addss  xmm2,xmm8
0x7fffea1c8683	mulss  xmm7,DWORD PTR [rsp+0x38]
0x7fffea1c8689	addss  xmm7,xmm8
0x7fffea1c868e	movss  xmm1,DWORD PTR [rsp+0x2c]
0x7fffea1c8694	mulss  xmm1,xmm6
0x7fffea1c8698	mulss  xmm6,DWORD PTR [rsp+0x3c]
0x7fffea1c869e	addss  xmm1,xmm2
0x7fffea1c86a2	movss  xmm3,DWORD PTR [rsp+0x30]
0x7fffea1c86a8	mulss  xmm3,xmm5
0x7fffea1c86ac	addss  xmm3,xmm1
0x7fffea1c86b0	movss  xmm2,DWORD PTR [rsp+0x34]
0x7fffea1c86b6	mulss  xmm2,xmm9
0x7fffea1c86bb	addss  xmm2,xmm3
0x7fffea1c86bf	movss  xmm1,DWORD PTR [r15+0x8]
0x7fffea1c86c5	mulss  xmm1,xmm2
0x7fffea1c86c9	addss  xmm1,xmm0
0x7fffea1c86cd	addss  xmm6,xmm7
0x7fffea1c86d1	mulss  xmm5,DWORD PTR [rsp+0x40]
0x7fffea1c86d7	addss  xmm5,xmm6
0x7fffea1c86db	mulss  xmm9,DWORD PTR [rsp+0x44]
0x7fffea1c86e2	addss  xmm9,xmm5
0x7fffea1c86e7	movss  xmm5,DWORD PTR [r15+0xc]
0x7fffea1c86ed	mulss  xmm5,xmm9
0x7fffea1c86f2	addss  xmm5,xmm1
0x7fffea1c86f6	movss  xmm3,DWORD PTR [r12]
0x7fffea1c86fc	mulss  xmm3,xmm5
0x7fffea1c8700	movss  xmm0,DWORD PTR [r12+0x30]
0x7fffea1c8707	mulss  xmm0,xmm5
0x7fffea1c870b	movss  xmm1,DWORD PTR [r15+0x10]
0x7fffea1c8711	mulss  xmm1,xmm4
0x7fffea1c8715	addss  xmm0,xmm8
0x7fffea1c871a	addss  xmm3,xmm8
0x7fffea1c871f	addss  xmm1,xmm8
0x7fffea1c8724	movss  xmm6,DWORD PTR [r15+0x14]
0x7fffea1c872a	mulss  xmm6,xmm10
0x7fffea1c872f	addss  xmm6,xmm1
0x7fffea1c8733	movss  xmm1,DWORD PTR [r15+0x18]
0x7fffea1c8739	mulss  xmm1,xmm2
0x7fffea1c873d	addss  xmm1,xmm6
0x7fffea1c8741	movss  xmm11,DWORD PTR [r15+0x1c]
0x7fffea1c8747	mulss  xmm11,xmm9
0x7fffea1c874c	addss  xmm11,xmm1
0x7fffea1c8751	movss  xmm1,DWORD PTR [r12+0x4]
0x7fffea1c8758	mulss  xmm1,xmm11
0x7fffea1c875d	addss  xmm1,xmm3
0x7fffea1c8761	movss  xmm3,DWORD PTR [r12+0x34]
0x7fffea1c8768	mulss  xmm3,xmm11
0x7fffea1c876d	addss  xmm3,xmm0
0x7fffea1c8771	movss  xmm0,DWORD PTR [r15+0x20]
0x7fffea1c8777	mulss  xmm0,xmm4
0x7fffea1c877b	addss  xmm0,xmm8
0x7fffea1c8780	movss  xmm6,DWORD PTR [r15+0x24]
0x7fffea1c8786	mulss  xmm6,xmm10
0x7fffea1c878b	addss  xmm6,xmm0
0x7fffea1c878f	movss  xmm0,DWORD PTR [r15+0x28]
0x7fffea1c8795	mulss  xmm0,xmm2
0x7fffea1c8799	addss  xmm0,xmm6
0x7fffea1c879d	movss  xmm7,DWORD PTR [r15+0x2c]
0x7fffea1c87a3	mulss  xmm7,xmm9
0x7fffea1c87a8	addss  xmm7,xmm0
0x7fffea1c87ac	movss  xmm0,DWORD PTR [r12+0x38]
0x7fffea1c87b3	mulss  xmm0,xmm7
0x7fffea1c87b7	addss  xmm0,xmm3
0x7fffea1c87bb	movss  xmm3,DWORD PTR [r12+0x8]
0x7fffea1c87c2	mulss  xmm3,xmm7
0x7fffea1c87c6	addss  xmm3,xmm1
0x7fffea1c87ca	mov    rbp,QWORD PTR [rsp-0x38]
0x7fffea1c87cf	mov    rsi,QWORD PTR [rsp-0x10]
0x7fffea1c87d4	mov    eax,DWORD PTR [rbp+rsi*4+0x0]
0x7fffea1c87d8	mov    rdi,QWORD PTR [rsp-0x8]
0x7fffea1c87dd	add    eax,edi
0x7fffea1c87df	mov    rbx,QWORD PTR [rsp-0x40]
0x7fffea1c87e4	mov    r8d,DWORD PTR [rbx+rax*4]
0x7fffea1c87e8	mov    r9,QWORD PTR [rsp-0x30]
0x7fffea1c87ed	add    r8d,DWORD PTR [r9+rsi*4]
0x7fffea1c87f1	mulss  xmm10,DWORD PTR [r15+0x34]
0x7fffea1c87f7	mulss  xmm4,DWORD PTR [r15+0x30]
0x7fffea1c87fd	addss  xmm4,xmm8
0x7fffea1c8802	addss  xmm4,xmm10
0x7fffea1c8807	movss  xmm6,DWORD PTR [r12+0x20]
0x7fffea1c880e	mulss  xmm6,xmm5
0x7fffea1c8812	addss  xmm6,xmm8
0x7fffea1c8817	mulss  xmm9,DWORD PTR [r15+0x3c]
0x7fffea1c881d	mulss  xmm2,DWORD PTR [r15+0x38]
0x7fffea1c8823	addss  xmm2,xmm4
0x7fffea1c8827	movss  xmm10,DWORD PTR [rip+0x640]        # 0x7fffea1c8e70
0x7fffea1c8830	addss  xmm2,xmm9
0x7fffea1c8835	movss  xmm1,DWORD PTR [r12+0xc]
0x7fffea1c883c	mulss  xmm1,xmm2
0x7fffea1c8840	addss  xmm1,xmm3
0x7fffea1c8844	movss  xmm3,DWORD PTR [r12+0x3c]
0x7fffea1c884b	mulss  xmm3,xmm2
0x7fffea1c884f	addss  xmm3,xmm0
0x7fffea1c8853	movss  xmm0,DWORD PTR [r12+0x24]
0x7fffea1c885a	mulss  xmm0,xmm11
0x7fffea1c885f	addss  xmm0,xmm6
0x7fffea1c8863	movss  xmm6,DWORD PTR [r12+0x28]
0x7fffea1c886a	mulss  xmm6,xmm7
0x7fffea1c886e	addss  xmm6,xmm0
0x7fffea1c8872	movss  xmm4,DWORD PTR [r12+0x2c]
0x7fffea1c8879	mulss  xmm4,xmm2
0x7fffea1c887d	addss  xmm4,xmm6
0x7fffea1c8881	mulss  xmm11,DWORD PTR [r12+0x14]
0x7fffea1c8888	mulss  xmm5,DWORD PTR [r12+0x10]
0x7fffea1c888f	addss  xmm5,xmm8
0x7fffea1c8894	divss  xmm4,xmm3
0x7fffea1c8898	divss  xmm1,xmm3
0x7fffea1c889c	addss  xmm5,xmm11
0x7fffea1c88a1	mulss  xmm7,DWORD PTR [r12+0x18]
0x7fffea1c88a8	addss  xmm7,xmm5
0x7fffea1c88ac	mulss  xmm2,DWORD PTR [r12+0x1c]
0x7fffea1c88b3	addss  xmm2,xmm7
0x7fffea1c88b7	divss  xmm2,xmm3
0x7fffea1c88bb	divss  xmm10,xmm3
0x7fffea1c88c0	movss  xmm0,DWORD PTR [r11+0x4]
0x7fffea1c88c6	mulss  xmm0,xmm2
0x7fffea1c88ca	movss  xmm3,DWORD PTR [r11]
0x7fffea1c88cf	mulss  xmm3,xmm1
0x7fffea1c88d3	addss  xmm3,xmm8
0x7fffea1c88d8	addss  xmm3,xmm0
0x7fffea1c88dc	movss  xmm0,DWORD PTR [r11+0x8]
0x7fffea1c88e2	mulss  xmm0,xmm4
0x7fffea1c88e6	addss  xmm0,xmm3
0x7fffea1c88ea	movss  xmm3,DWORD PTR [r11+0x14]
0x7fffea1c88f0	mulss  xmm3,xmm2
0x7fffea1c88f4	movss  xmm5,DWORD PTR [r11+0x10]
0x7fffea1c88fa	mulss  xmm5,xmm1
0x7fffea1c88fe	addss  xmm5,xmm8
0x7fffea1c8903	movss  xmm7,DWORD PTR [r11+0xc]
0x7fffea1c8909	mulss  xmm7,xmm10
0x7fffea1c890e	addss  xmm5,xmm3
0x7fffea1c8912	addss  xmm7,xmm0
0x7fffea1c8916	movss  xmm9,DWORD PTR [r11+0x38]
0x7fffea1c891c	mulss  xmm9,xmm4
0x7fffea1c8921	movss  xmm3,DWORD PTR [r11+0x28]
0x7fffea1c8927	mulss  xmm3,xmm4
0x7fffea1c892b	mulss  xmm4,DWORD PTR [r11+0x18]
0x7fffea1c8931	movss  xmm11,DWORD PTR [r11+0x34]
0x7fffea1c8937	mulss  xmm11,xmm2
0x7fffea1c893c	mulss  xmm2,DWORD PTR [r11+0x24]
0x7fffea1c8942	movss  xmm6,DWORD PTR [r11+0x30]
0x7fffea1c8948	mulss  xmm6,xmm1
0x7fffea1c894c	mulss  xmm1,DWORD PTR [r11+0x20]
0x7fffea1c8952	movss  xmm12,DWORD PTR [r11+0x3c]
0x7fffea1c8958	mulss  xmm12,xmm10
0x7fffea1c895d	movss  xmm0,DWORD PTR [r11+0x2c]
0x7fffea1c8963	mulss  xmm0,xmm10
0x7fffea1c8968	mulss  xmm10,DWORD PTR [r11+0x1c]
0x7fffea1c896e	mov    rax,QWORD PTR [rsp-0x60]
0x7fffea1c8973	movss  DWORD PTR [rax+r8*4],xmm7
0x7fffea1c8979	addss  xmm1,xmm8
0x7fffea1c897e	addss  xmm1,xmm2
0x7fffea1c8982	addss  xmm4,xmm5
0x7fffea1c8986	mov    ecx,DWORD PTR [rsp+0x4]
0x7fffea1c898a	inc    ecx
0x7fffea1c898c	addss  xmm4,xmm10
0x7fffea1c8991	addss  xmm1,xmm3
0x7fffea1c8995	mov    eax,DWORD PTR [rbp+rsi*4+0x0]
0x7fffea1c8999	add    eax,edi
0x7fffea1c899b	mov    eax,DWORD PTR [rbx+rax*4]
0x7fffea1c899e	add    eax,DWORD PTR [r9+rsi*4]
0x7fffea1c89a2	addss  xmm1,xmm0
0x7fffea1c89a6	mov    rdx,QWORD PTR [rsp-0x58]
0x7fffea1c89ab	movss  DWORD PTR [rdx+rax*4],xmm4
0x7fffea1c89b0	mov    eax,DWORD PTR [rbp+rsi*4+0x0]
0x7fffea1c89b4	add    eax,edi
0x7fffea1c89b6	mov    eax,DWORD PTR [rbx+rax*4]
0x7fffea1c89b9	add    eax,DWORD PTR [r9+rsi*4]
0x7fffea1c89bd	mov    rdx,QWORD PTR [rsp-0x50]
0x7fffea1c89c2	movss  DWORD PTR [rdx+rax*4],xmm1
0x7fffea1c89c7	add    edi,DWORD PTR [rbp+rsi*4+0x0]
0x7fffea1c89cb	mov    eax,DWORD PTR [rbx+rdi*4]
0x7fffea1c89ce	add    eax,DWORD PTR [r9+rsi*4]
0x7fffea1c89d2	cmp    ecx,DWORD PTR [rsp-0x14]
0x7fffea1c89d6	addss  xmm6,xmm8
0x7fffea1c89db	addss  xmm6,xmm11
0x7fffea1c89e0	addss  xmm6,xmm9
0x7fffea1c89e5	addss  xmm6,xmm12
0x7fffea1c89ea	mov    rdx,QWORD PTR [rsp-0x48]
0x7fffea1c89ef	movss  DWORD PTR [rdx+rax*4],xmm6
0x7fffea1c89f4	mov    rdx,QWORD PTR [rsp-0x20]
0x7fffea1c89f9	jb     0x7fffea1c8470 <__OpenCL_vertexShader_stub+176>
0x7fffea1c89ff	add    rsp,0x48
0x7fffea1c8a03	pop    rbx
0x7fffea1c8a04	pop    r12
0x7fffea1c8a06	pop    r13
0x7fffea1c8a08	pop    r14
0x7fffea1c8a0a	pop    r15
0x7fffea1c8a0c	pop    rbp
0x7fffea1c8a0d	ret

Yet the input buffer and the output buffer are the same size. I printed every vertex position from the C++ code and did not notice any out-of-bounds array access with the indices.

C++ code:

VertexArray va = m_instances[i]->getVertexArray();
                for (unsigned int j = 0; j < m_instances[i]->getVertexArray().getVertexCount(); j++) {
                    int instanceID = j / nbVerticesPerFace[0];
                    int offset = j % nbVerticesPerFace[0];
                    std::cout<<m_instances[i]->getVertexArray()[baseVertices[instanceID]+indexes[baseIndexes[instanceID]+offset]].position.x<<std::endl;
                }
//OK!
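
Printing the positions only shows that the host loop did not crash. The same index arithmetic can be replayed with explicit range checks. The following is a minimal sketch, assuming the kernel computes `baseVertices[instanceID] + indices[baseIndices[instanceID] + offset]` exactly as in the posted source; `firstOutOfRange` and its parameters are hypothetical names, not part of OpenCL or of the original code.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: replays the index arithmetic of the vertexShader kernel
// on the host. Returns the first work-item id whose lookups would fall outside
// one of the arrays, or -1 if every lookup is in range.
long firstOutOfRange(std::size_t workItems,
                     unsigned int verticesPerFace,
                     const std::vector<unsigned int>& indexes,
                     const std::vector<unsigned int>& baseIndexes,
                     const std::vector<unsigned int>& baseVertices,
                     std::size_t vertexCount) {
    // A zero divisor would make tid / verticesPerFace undefined for every item.
    if (verticesPerFace == 0) return workItems > 0 ? 0 : -1;
    for (std::size_t tid = 0; tid < workItems; ++tid) {
        std::size_t instanceID = tid / verticesPerFace;
        std::size_t offset = tid % verticesPerFace;
        if (instanceID >= baseIndexes.size() || instanceID >= baseVertices.size())
            return static_cast<long>(tid);
        std::size_t indexSlot = baseIndexes[instanceID] + offset;
        if (indexSlot >= indexes.size())
            return static_cast<long>(tid);
        std::size_t vertexSlot = baseVertices[instanceID] + indexes[indexSlot];
        if (vertexSlot >= vertexCount)
            return static_cast<long>(tid);
    }
    return -1;
}
```

Calling it with the number of work-items actually enqueued (rather than the vertex count used in the print loop) checks the range the kernel really covers.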

I don't understand why it crashes; the crash may come from the API itself.

Here is the link where I downloaded the API:

http://developer.amd.com/tools-and-s...ssing-app-sdk/

Maybe I should dig into the OpenCL source code...

**ternel** · 24/06/2015, 14h38

Knowing nothing about OpenCL, I suggest you check that you are not returning a local array.
That is very likely what is happening, and it is plainly wrong.

That is why, when there is no dynamic allocation, the function should also take the output pointer as an argument (along with the required sizes, of course).

Think of the signature of getline(istream&, string&).
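
To illustrate the point, here is a minimal sketch in plain C++ (not OpenCL C) of the output-parameter form. The function name `addVecInto` and the capacity parameter are assumptions made for the example; the broken variant is left commented out because returning the address of a local array is undefined behaviour.

```cpp
#include <cstddef>

// Wrong: `result` lives on the callee's stack and is destroyed on return,
// so the caller would receive a dangling pointer.
// float* addVecBroken(const float* a, const float* b, std::size_t n) {
//     float result[16];
//     for (std::size_t i = 0; i < n; ++i) result[i] = a[i] + b[i];
//     return result;
// }

// Output-parameter form: the caller owns `out` and states how many elements
// it can hold, in the same spirit as getline(istream&, string&).
bool addVecInto(const float* a, const float* b, std::size_t n,
                float* out, std::size_t outCapacity) {
    if (outCapacity < n) return false;  // refuse to write past the end
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
    return true;
}
```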

Guest · 24/06/2015, 19h17

Hi again. I do also pass the output pointer as a kernel argument to retrieve the results, but it still crashes. :/

On top of that, when it does not crash, the results are completely different from what I get when everything runs on the CPU: for the vertex transformations, for example, nothing gets transformed at all. :/

I think this technology is not mature yet...

PS: here is the code again:

C++ code:

CPURenderComponent::CPURenderComponent(math::Vec3f position,math::Vec3f size, math::Vec3f origin, bool useThread, RenderWindow& rw)
        : Component(position,size, origin, useThread), window(rw), view (rw.getView()) {
            redBuffer.resize(size.x * size.y);
            blueBuffer.resize(size.x * size.y);
            greenBuffer.resize(size.x * size.y);
            alphaBuffer.resize(size.x * size.y);
            depthBuffer.resize(size.x * size.y);
            this->size.x = rw.getSize().x;
            this->size.y = rw.getSize().y;
            view = window.getView();
            fbTile = new Tile(&fbTexture, position, size, sf::IntRect(0, 0, size.x, size.y));
            fbTile->setCenter(view.getPosition());
            cl_int err;
            std::vector< cl::Platform > platformList;
            cl::Platform::get(&platformList);
            checkErr(platformList.size()!=0 ? CL_SUCCESS : -1, "cl::Platform::get");
            std::cerr << "Platform number is: " << platformList.size() << std::endl;
            std::string platformVendor;
            platformList[0].getInfo((cl_platform_info)CL_PLATFORM_VENDOR, &platformVendor);
            std::cerr << "Platform is by: " << platformVendor << "\n";
            cl_context_properties cprops[3] = {CL_CONTEXT_PLATFORM, (cl_context_properties)(platformList[0])(), 0};
            clcontext = cl::Context (
              CL_DEVICE_TYPE_ALL,
              cprops,
              NULL,
              NULL,
              &err);
            checkErr(err, "Context::Context()");           
            devices = clcontext.getInfo<CL_CONTEXT_DEVICES>();
            checkErr(devices.size() > 0 ? CL_SUCCESS : -1, "devices.size() > 0");
            std::string prog = "#pragma OPENCL EXTENSION cl_khr_byte_addressable_store : enable\n"
                               "const int stepXSize = 4;\n"
                               "const int stepYSize = 4;\n"
                               "void multMat (float* matA, float* matB, float* matC, const int n) {\n"
                                    "for (int x = 0; x < n; ++x) {\n"
                                        "for (int y = 0; y < n; ++y) {\n"
                                            "float value = 0;\n"
                                            "for (int k = 0; k < n; ++k) {\n"
                                                "float elementA = matA[y * n + k];\n"
                                                "float elementB = matB[k * n + x];\n"
                                                "value += elementA * elementB;\n"
                                            "}\n"
                                            "matC[y * n + x] = value;\n"
                                        "}\n"
                                    "}\n"
                                "}\n"
                                "void addVec (float* vecA, float* vecB, float* vecC, const int n) {\n"
                                    "for (int i = 0; i < n; i++) {\n"
                                        "vecC[i] = vecA[i] + vecB[i];\n"
                                    "}\n"
                                "}\n"
                                "void multVec (float* vecA, float* vecB, float* vecC, const int n) {\n"
                                    "for (int i = 0; i < n; i++) {\n"
                                        "vecC[i] = vecA[i] * vecB[i];\n"
                                    "}\n"
                                "}\n"
                                "void multMatVec (float* matA, float* vecB, float* vecC, const int n) {\n"
                                   "for (int i = 0; i < n; ++i) {\n"
                                        "float value = 0;\n"
                                        "for (int j = 0; j < n; ++j) {\n"
                                            "value += vecB[j] * matA[i*4+j];\n"
                                        "}\n"
                                        "vecC[i] = value;\n"
                                   "}\n"
                                "}\n"
                                "void transpose(float* matA, float* matB, const int n) {\n"
                                    "for (int i = 0; i < n; ++i) {\n"
                                        "for (int j = 0; j < n; ++j) {\n"
                                            "matB[i*n+j] = matA[j*n+i];\n"
                                        "}\n"
                                    "}\n"
                                "}\n"
                                "float  det (float* mat, const int n) {\n"
                                    "int d=0, p, h, k, i, j;\n"
                                    "const int m = n *n;\n"
                                    "float temp[16];\n"
                                    "if(n==1) {\n"
                                        "return mat[0];\n"
                                    "} else if(n==2) {\n"
                                        "d=(mat[0]*mat[3]-mat[1]*mat[2]);\n"
                                        "return d;\n"
                                    "} else {\n"
                                        "for(p=0;p<n;p++) {\n"
                                            "h = 0;\n"
                                            "k = 0;\n"
                                            "for(i=1;i<n;i++) {\n"
                                                "for( j=0;j<n;j++) {\n"
                                                    "if(j!=p) {\n"
                                                        "temp[h*n+k] = mat[i*n+j];\n"
                                                        "k++;\n"
                                                        "if(k==n-1) {\n"
                                                            "h++;\n"
                                                            "k = 0;\n"
                                                        "}\n"
                                                    "}\n"
                                                "}\n"
                                            "}\n"
                                            "d=d+mat[p]*pow((float) -1,(float) p)*det(temp,n-1);\n"
                                        "}\n"
                                        "return d;\n"
                                    "}\n"
                                "}\n"                               
                                "int equal(float* v1, float* v2, const int n) {\n"
                                    "for (int i = 0; i < n; i++) {\n"
                                        "if(v1[i] != v2[i]) {\n"
                                            "return 0;\n"
                                        "}\n"
                                    "}\n"
                                    "return 1;\n"
                                "}\n"                               
                                "__kernel void vertexShader(__global float* vPosX, __global float* vPosY, __global float* vPosZ, __global float* vPosW,\n"
                                                           "__global float* outvPosX, __global float* outvPosY, __global float* outvPosZ, __global float* outvPosW,"
                                                           "__global unsigned int* indices,  __global unsigned int* numIndices, __global unsigned int* baseIndices,\n"
                                                           "__global unsigned int* baseVertices,  __global unsigned int* nbVerticesPerFace,\n"
                                                           "__global float* transfMatrices, __global float* projMatrix, __global float* viewMatrix,\n"
                                                           "__global float* viewportMatrix) {\n"
                                    "size_t tid = get_global_id(0);\n"
                                    "int instanceID = tid / nbVerticesPerFace[0];\n"
                                    "int offset = tid % nbVerticesPerFace[0];\n"
                                    "float transfMatrix[16];\n"
                                    "float position[4] = {vPosX[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]], vPosY[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]],\n"
                                    "vPosZ[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]], vPosW[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]]};\n"
                                    "for (int i = 0; i < 16; i++) {\n"
                                        "transfMatrix[i] = transfMatrices[instanceID*16+i];\n"
                                    "}\n"
                                    "float worldcoords[4];\n;"
                                    "multMatVec(transfMatrix, position, worldcoords, 4);\n"
                                    "float viewcoords[4];\n"
                                    "multMatVec(viewMatrix, worldcoords, viewcoords, 4);\n"
                                    "float clipcoords[4];\n"
                                    "multMatVec(projMatrix, viewcoords, clipcoords, 4);\n"
                                    "float ndcCoords[] = {clipcoords[0] / clipcoords[3], clipcoords[1] / clipcoords[3], clipcoords[2] / clipcoords[3], 1 / clipcoords[3]};\n"
                                    "float finalPos[4];"
                                    "multMatVec(viewportMatrix, ndcCoords, finalPos, 4);\n"
                                    "outvPosX[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]] = finalPos[0];\n"
                                    "outvPosY[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]] = finalPos[1];\n"
                                    "outvPosZ[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]] = finalPos[2];\n"
                                    "outvPosW[baseVertices[instanceID]+indices[baseIndices[instanceID]+offset]] = finalPos[3];\n"
                                "}\n";                               
            cl::Program::Sources source(
                1,
                std::make_pair(prog.c_str(), prog.length()+1));
            cl::Program program(clcontext, source);
            err = program.build(devices,"",&checkErr);
            if (err != CL_SUCCESS) {
                std::string programLog;
                program.getBuildInfo(devices[0],CL_PROGRAM_BUILD_LOG, &programLog);
                std::ofstream file("buildLog.log");
                file<<programLog;
                file.close();
            }
            checkErr(err, "Program::build()");
            clkvertexShader = cl::Kernel(program, "vertexShader", &err);
            checkErr(err, "Kernel::Kernel()");            
            View& view = window.getView();
            ViewportMatrix vpm;
            vpm.setViewport(math::Vec3f(view.getViewport().getPosition().x, view.getViewport().getPosition().y, 0),
            math::Vec3f(view.getViewport().getWidth(), view.getViewport().getHeight(), 1));
            std::array<float, 16> projMatrix = view.getProjMatrix().getMatrix().toGlMatrix();
            std::array<float, 16> viewMatrix = view.getViewMatrix().getMatrix().toGlMatrix();
            std::array<float, 16> viewportMatrix = vpm.getMatrix().toGlMatrix();
            cprojMatrixBuffer = cl::Buffer(clcontext, CL_MEM_READ_ONLY | CL_MEM_USE_HOST_PTR, projMatrix.size(), projMatrix.data(), &err);
            checkErr(err, "Buffer::Buffer()");
            cviewMatrixBuffer = cl::Buffer(clcontext, CL_MEM_READ_ONLY | CL_MEM_USE_HOST_PTR, viewMatrix.size(), viewMatrix.data(), &err);
            checkErr(err, "Buffer::Buffer()");
            cvpMatrixBuffer = cl::Buffer(clcontext, CL_MEM_READ_ONLY | CL_MEM_USE_HOST_PTR, viewportMatrix.size(), viewportMatrix.data(), &err);
            checkErr(err, "Buffer::Buffer()");
            err = clkvertexShader.setArg(14, cprojMatrixBuffer);
            checkErr(err, "Kernel::setArg()");
            err = clkvertexShader.setArg(15, cviewMatrixBuffer);
            checkErr(err, "Kernel::setArg()");
            err = clkvertexShader.setArg(16, cvpMatrixBuffer);
            checkErr(err, "Kernel::setArg()");
        }
        void CPURenderComponent::loadEntitiesOnComponent(std::vector<Entity*> vEntities) {
             if (Shader::isAvailable()) {
                batcher.clear();
                for (unsigned int i = 0; i < vEntities.size(); i++) {
                    if ( vEntities[i]->isLeaf()) {
                        for (unsigned int j = 0; j <  vEntities[i]->getFaces().size(); j++) {
                             batcher.addFace( vEntities[i]->getFaces()[j]);
                        }
                    }
                }
                m_instances = batcher.getInstances();
            }
            this->visibleEntities = vEntities;
        }
        void CPURenderComponent::drawNextFrame() {
            for (unsigned int i = 0; i < m_instances.size(); i++) {
                std::vector<float> transformMatrices;
                for (unsigned int j = 0; j < m_instances[i]->getTransforms().size(); j++) {
                    float* tmatrix = m_instances[i]->getTransforms()[j].get().getGlMatrix();
                    for (unsigned int n = 0; n < 16; n++) {
                        transformMatrices.push_back(tmatrix[n]);
                    }
                }
 
                cl_int err;
                ctransfMatrixBuffer = cl::Buffer(clcontext, CL_MEM_READ_ONLY | CL_MEM_USE_HOST_PTR, transformMatrices.size(), transformMatrices.data(), &err);
                checkErr(err, "Buffer::Buffer()");
                math::Matrix4f texM = (m_instances[i]->getMaterial().getTexture() != nullptr) ?
                    m_instances[i]->getMaterial().getTexture()->getTextureMatrix() :
                    math::Matrix4f();
                std::vector<float> vPosX = m_instances[i]->getVertexArray().m_vPosX;
                std::vector<float> vPosY = m_instances[i]->getVertexArray().m_vPosY;
                std::vector<float> vPosZ = m_instances[i]->getVertexArray().m_vPosZ;
                std::vector<float> vPosW = m_instances[i]->getVertexArray().m_vPosW;
                std::vector<unsigned char> vcRed = m_instances[i]->getVertexArray().m_vcRed;
                std::vector<unsigned char> vcBlue = m_instances[i]->getVertexArray().m_vcBlue;
                std::vector<unsigned char> vcGreen = m_instances[i]->getVertexArray().m_vcGreen;
                std::vector<unsigned char> vcAlpha = m_instances[i]->getVertexArray().m_vcAlpha;
                std::vector<unsigned int> ctx = m_instances[i]->getVertexArray().m_ctX;
                std::vector<unsigned int> cty = m_instances[i]->getVertexArray().m_ctY;
                std::vector<unsigned int> indexes = m_instances[i]->getVertexArray().m_indexes;
                std::vector<unsigned int> numIndexes = m_instances[i]->getVertexArray().m_numIndexes;
                std::vector<unsigned int> baseIndexes = m_instances[i]->getVertexArray().m_baseIndexes;
                std::vector<unsigned int> baseVertices = m_instances[i]->getVertexArray().m_baseVertices;
                std::vector<unsigned int> nbVerticesPerFace(3);
                std::array<float, 16> texMatrix = texM.toGlMatrix();
                nbVerticesPerFace[0] = m_instances[i]->getVertexArray().nbVerticesPerFace;
                nbVerticesPerFace[1] = m_instances[i]->getVertexArray().isLoop();
                nbVerticesPerFace[2] = (m_instances[i]->getVertexArray().isLoop()) ? m_instances[i]->getVertexArray().nbVerticesPerFace * 2 : m_instances[i]->getVertexArray().nbVerticesPerFace * 2 - 1;
                VertexArray va = m_instances[i]->getVertexArray();               
                cvposXBuffer = cl::Buffer(clcontext, CL_MEM_READ_ONLY | CL_MEM_USE_HOST_PTR, vPosX.size(), vPosX.data(), &err);
                checkErr(err, "Buffer::Buffer()");
                cvposYBuffer = cl::Buffer(clcontext, CL_MEM_READ_ONLY | CL_MEM_USE_HOST_PTR, vPosY.size(), vPosY.data(), &err);
                checkErr(err, "Buffer::Buffer()");
                cvposZBuffer = cl::Buffer(clcontext, CL_MEM_READ_ONLY | CL_MEM_USE_HOST_PTR, vPosZ.size(), vPosZ.data(), &err);
                checkErr(err, "Buffer::Buffer()");
                cvposWBuffer = cl::Buffer(clcontext, CL_MEM_READ_ONLY | CL_MEM_USE_HOST_PTR, vPosW.size(), vPosW.data(), &err);
                checkErr(err, "Buffer::Buffer()");                
                cvindexesBuffer = cl::Buffer(clcontext, CL_MEM_READ_ONLY | CL_MEM_USE_HOST_PTR, indexes.size(), indexes.data(), &err);
                checkErr(err, "Buffer::Buffer()");
                cnumIndexesBuffer = cl::Buffer(clcontext, CL_MEM_READ_ONLY | CL_MEM_USE_HOST_PTR, numIndexes.size(), numIndexes.data(), &err);
                checkErr(err, "Buffer::Buffer()");
                cbaseIndexesBuffer = cl::Buffer(clcontext, CL_MEM_READ_ONLY | CL_MEM_USE_HOST_PTR, baseIndexes.size(), baseIndexes.data(), &err);
                checkErr(err, "Buffer::Buffer()");
                cbaseVerticesBuffer = cl::Buffer(clcontext, CL_MEM_READ_ONLY | CL_MEM_USE_HOST_PTR, baseVertices.size(), baseVertices.data(), &err);
                checkErr(err, "Buffer::Buffer()");
                cNbVerticesPerFaces = cl::Buffer(clcontext, CL_MEM_READ_ONLY | CL_MEM_USE_HOST_PTR, nbVerticesPerFace.size(), nbVerticesPerFace.data(), &err);
                checkErr(err, "Buffer::Buffer()");
                ctexMatrixBuffer = cl::Buffer(clcontext, CL_MEM_READ_ONLY | CL_MEM_USE_HOST_PTR, texMatrix.size(), texMatrix.data(), &err);
                checkErr(err, "Buffer::Buffer()");
                cl_int workgroupSize = m_instances[i]->getVertexArray().getIndexes().size();
                std::vector<float> vvPosX (workgroupSize);
                std::vector<float> vvPosY (workgroupSize);
                std::vector<float> vvPosZ (workgroupSize);
                std::vector<float> vvPosW (workgroupSize);
                vcvposXBuffer = cl::Buffer(clcontext, CL_MEM_READ_WRITE | CL_MEM_USE_HOST_PTR, vvPosX.size(), vvPosX.data(), &err);
                checkErr(err, "Buffer::Buffer()");
                vcvposYBuffer = cl::Buffer(clcontext, CL_MEM_READ_WRITE | CL_MEM_USE_HOST_PTR, vvPosY.size(), vvPosY.data(), &err);
                checkErr(err, "Buffer::Buffer()");
                vcvposZBuffer = cl::Buffer(clcontext, CL_MEM_READ_WRITE | CL_MEM_USE_HOST_PTR, vvPosZ.size(), vvPosZ.data(), &err);
                checkErr(err, "Buffer::Buffer()");
                vcvposWBuffer = cl::Buffer(clcontext, CL_MEM_READ_WRITE | CL_MEM_USE_HOST_PTR, vvPosW.size(), vvPosW.data(), &err);
                checkErr(err, "Buffer::Buffer()");
                err = clkvertexShader.setArg(0, cvposXBuffer);
                checkErr(err, "Kernel::setArg()");
                err = clkvertexShader.setArg(1, cvposYBuffer);
                checkErr(err, "Kernel::setArg()");
                err = clkvertexShader.setArg(2, cvposZBuffer);
                checkErr(err, "Kernel::setArg()");
                err = clkvertexShader.setArg(3, cvposWBuffer);
                checkErr(err, "Kernel::setArg()");
                err = clkvertexShader.setArg(4, vcvposXBuffer);
                checkErr(err, "Kernel::setArg()");
                err = clkvertexShader.setArg(5, vcvposYBuffer);
                checkErr(err, "Kernel::setArg()");
                err = clkvertexShader.setArg(6, vcvposZBuffer);
                checkErr(err, "Kernel::setArg()");
                err = clkvertexShader.setArg(7, vcvposWBuffer);
                checkErr(err, "Kernel::setArg()");
                err = clkvertexShader.setArg(8, cvindexesBuffer);
                checkErr(err, "Kernel::setArg()");
                err = clkvertexShader.setArg(9, cnumIndexesBuffer);
                checkErr(err, "Kernel::setArg()");
                err = clkvertexShader.setArg(10, cbaseIndexesBuffer);
                checkErr(err, "Kernel::setArg()");
                err = clkvertexShader.setArg(11, cbaseVerticesBuffer);
                checkErr(err, "Kernel::setArg()");
                err = clkvertexShader.setArg(12, cNbVerticesPerFaces);
                checkErr(err, "Kernel::setArg()");
                err = clkvertexShader.setArg(13, ctransfMatrixBuffer);
                checkErr(err, "Kernel::setArg()");
                cl::CommandQueue clqueue (clcontext, devices[0], 0, &err);
                checkErr(err, "CommandQueue::CommandQueue()");
                cl::Event event;
                err = clqueue.enqueueNDRangeKernel(clkvertexShader,
                                                    1, workgroupSize,
                                                    1, nullptr, &event);
                checkErr(err, "CommandQueue::enqueueNDRangeKernel()");
                event.wait();
                err = clqueue.enqueueReadBuffer(vcvposXBuffer,CL_TRUE,0,workgroupSize,vvPosX.data());
                checkErr(err, "CommandQueue::enqueueReadBuffer()");
                err = clqueue.enqueueReadBuffer(vcvposYBuffer,CL_TRUE,0,workgroupSize,vvPosY.data());
                checkErr(err, "CommandQueue::enqueueReadBuffer()");
                err = clqueue.enqueueReadBuffer(vcvposZBuffer,CL_TRUE,0,workgroupSize,vvPosZ.data());
                checkErr(err, "CommandQueue::enqueueReadBuffer()");
                err = clqueue.enqueueReadBuffer(vcvposWBuffer,CL_TRUE,0,workgroupSize,vvPosW.data());
                checkErr(err, "CommandQueue::enqueueReadBuffer()");
                std::cout<<"vertex shader : "<<std::endl;
                for (unsigned int i = 0; i < workgroupSize; i++)
                    std::cout<<vPosX[i]<<" "<<vPosY[i]<<" "<<vPosZ[i]<<" "<<vPosW[i]<<std::endl;
     }
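
One detail worth checking in the code above: the size argument of `cl::Buffer` and of `enqueueReadBuffer` is expressed in bytes, whereas the code passes element counts (`vPosX.size()`, `workgroupSize`). For `float` and `unsigned int` data, that makes each buffer a quarter of the size the kernel indexes into. Whether this is the cause of the crash cannot be confirmed from the thread, so the following is only a sketch; `byteSize` is a hypothetical helper, not an OpenCL function.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical helper: number of bytes occupied by the elements of a
// contiguous container such as std::vector<T> or std::array<T, N>.
template <typename Container>
std::size_t byteSize(const Container& c) {
    return c.size() * sizeof(typename Container::value_type);
}

// Intended use with the calls from the post (shown as comments because they
// need the OpenCL headers and a valid context):
//   cl::Buffer(clcontext, flags, byteSize(vPosX), vPosX.data(), &err);
//   clqueue.enqueueReadBuffer(vcvposXBuffer, CL_TRUE, 0,
//                             byteSize(vvPosX), vvPosX.data());
```

Two further observations from the posted source: the final loop prints `vPosX`, `vPosY`, `vPosZ` and `vPosW` (the inputs) rather than the `vvPos*` vectors that were read back, which would explain seeing untransformed values; and the matrix buffers created in the constructor use `CL_MEM_USE_HOST_PTR` on local `std::array` objects, whose storage no longer exists once the constructor returns.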

**koala01** · 24/06/2015, 19h56

Originally posted by Lolilolight

I think this technology is not mature yet...

Unless I am mistaken, OpenCL is a stable, proven technology... A priori (since you are new to it and I know no more about it than you do), I would therefore tend to think that the source of the problem is far more likely to be somewhere between your keyboard and your chair.

Of course, I may well be wrong! But you have to admit that, before criticizing a team of developers, it is often worth making sure you are using their tool exactly as recommended, don't you think?

More generally, perhaps you are "simply" trying to move much too fast, at the risk of missing a crucial concept.

Did you take the time to read the documentation more thoroughly than just skimming it? (If so, I apologize in advance.)

Guest · 24/06/2015, 20h21

Yes, I read the documentation. (I did not skim it.)

I really do not understand why it crashes. (Sometimes it crashes, sometimes it does not...)

In short, it seems like a rather hit-or-miss technology to me. :/

Unless perhaps there is an error in my code.

But apart from passing pointers to arrays and using those arrays in functions, I am not doing anything that could cause a crash like this.

**skeud** · 25/06/2015, 09h17

In general, a random crash points to an uninitialized variable; check that all your variables are properly initialized.

Having used OpenCL for shape recognition, I can assure you that the library is stable and works. So the problem comes from your code, not from anyone else's.
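
One way to act on this advice for the output arrays is to fill them with a sentinel before launching the kernel and to count how many elements still hold it afterwards. This is a minimal host-side sketch; `poison` and `countUnwritten` are hypothetical names, and NaN is an assumed sentinel that only works if the kernel never produces NaN legitimately (a division by a zero `w` component could).

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper: mark every element as "not written yet".
void poison(std::vector<float>& out) {
    std::fill(out.begin(), out.end(),
              std::numeric_limits<float>::quiet_NaN());
}

// Hypothetical helper: number of elements that still hold the sentinel,
// i.e. that no work-item has written.
std::size_t countUnwritten(const std::vector<float>& out) {
    std::size_t unwritten = 0;
    for (float value : out) {
        if (std::isnan(value)) ++unwritten;
    }
    return unwritten;
}
```

A non-zero count after the read-back would suggest that some output slots were never reached, or that the data read back is not the data the kernel wrote.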

Guest · 25/06/2015, 19h38

Darn, now I have a problem with my hard drive: it refuses to install any OS...

So I must have forgotten to initialize something...
