cuda - Which threads in a block form a warp?
In a 2-D or 3-D CUDA block, how are threads grouped into warps? My assumption is that they iterate first over x, then y, then z. For example, writing threads as <z,y,x>, <0,0,[0-31]> would be one warp, <0,1,[0-31]> the next, etc. Is this correct?
Yes, that is correct. Threads are grouped first by x, then y, then z (thread coordinates) when creating warps (groups of 32 threads that execute together). This has implications for coalescing: you want to arrange the usage of thread coordinates in matrix subscripts so that warp-adjacent threads (i.e. adjacent in x coordinates, typically) access adjacent elements in the matrix, by using threadIdx.x or a derivative of it in the most rapidly varying matrix dimension. Typically you want data[z][y][x], not data[x][y][z].
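To make the grouping concrete, here is a minimal CPU-side sketch in plain C++, so it can be checked without a GPU. It assumes the linearization described in the CUDA programming guide (thread ID = x + y*Dx + z*Dx*Dy, with warps formed from consecutive IDs) and a warp size of 32. The names `Dim3`, `linear_thread_id`, `warp_id`, `lane_id`, `offset_zyx` and `offset_xyz` are hypothetical helpers made up for this illustration; in a real kernel you would read `threadIdx` and `blockDim` instead.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: warp size of 32, as on current NVIDIA GPUs.
constexpr int kWarpSize = 32;

// Hypothetical stand-in for CUDA's dim3 / threadIdx / blockDim.
struct Dim3 { int x, y, z; };

// Linear thread ID within a block: x varies fastest, then y, then z.
constexpr int linear_thread_id(Dim3 t, Dim3 block) {
    return t.x + t.y * block.x + t.z * block.x * block.y;
}

// Warps are formed from consecutive linear thread IDs.
constexpr int warp_id(Dim3 t, Dim3 block) {
    return linear_thread_id(t, block) / kWarpSize;
}

// Position of the thread inside its warp (0..31).
constexpr int lane_id(Dim3 t, Dim3 block) {
    return linear_thread_id(t, block) % kWarpSize;
}

// Row-major element offset of data[z][y][x] in an array declared [nz][ny][nx].
constexpr std::size_t offset_zyx(Dim3 t, Dim3 n) {
    return (static_cast<std::size_t>(t.z) * n.y + t.y) * n.x + t.x;
}

// Row-major element offset of data[x][y][z] in an array declared [nx][ny][nz].
constexpr std::size_t offset_xyz(Dim3 t, Dim3 n) {
    return (static_cast<std::size_t>(t.x) * n.y + t.y) * n.z + t.z;
}

// With blockDim = (32, 4, 2), each row of 32 x-threads is exactly one warp.
static_assert(warp_id(Dim3{31, 0, 0}, Dim3{32, 4, 2}) == 0, "row y=0 is warp 0");
static_assert(warp_id(Dim3{0, 1, 0}, Dim3{32, 4, 2}) == 1, "row y=1 is warp 1");

// Neighbouring x-threads touch neighbouring elements with data[z][y][x] ...
static_assert(offset_zyx(Dim3{1, 0, 0}, Dim3{32, 4, 2}) -
              offset_zyx(Dim3{0, 0, 0}, Dim3{32, 4, 2}) == 1, "unit stride");
// ... but are ny*nz elements apart with data[x][y][z].
static_assert(offset_xyz(Dim3{1, 0, 0}, Dim3{32, 4, 2}) -
              offset_xyz(Dim3{0, 0, 0}, Dim3{32, 4, 2}) == 8, "stride ny*nz");
```

Note that the <0,1,[0-31]> picture from the question lines up with warp boundaries only when the block is 32 threads wide in x. Under the same assumptions, a block of (16, 4, 1) puts rows y=0 and y=1 together in warp 0, because the linear IDs 0 to 31 span both rows.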
cuda gpu