Multiple processes launching CUDA kernels in parallel
I know that NVIDIA GPUs with compute capability 2.x or greater can execute up to 16 kernels concurrently. However, my application spawns 7 processes, and each of these 7 processes launches CUDA kernels.
My first question is about the expected behavior of these kernels: will they execute concurrently, or, since they are launched from different processes, will they execute sequentially?
I am confused because the CUDA C Programming Guide says:
"A kernel from one CUDA context cannot execute concurrently with a kernel from another CUDA context." This brings me to my second question: what are CUDA "contexts"?
thanks!
A CUDA context is a virtual execution space that holds the code and data owned by a host thread or process. Only one context can ever be active on a GPU with current hardware.
So, to answer your first question: if you have seven separate threads or processes trying to establish a context and run on the same GPU simultaneously, they will be serialised, and any process waiting for access to the GPU will be blocked until the owner of the running context yields. There is, to the best of my knowledge, no time slicing, and the scheduling heuristics are not documented and (I suspect) not uniform from operating system to operating system.
You would be better off launching a single worker thread that holds the GPU context and using messaging from the other threads to push work onto the GPU. Alternatively, there is a context migration facility available in the CUDA driver API, but that only works with threads from the same process, and the migration mechanism has latency and host-CPU overhead.
cuda gpu