gpu - Multiple processes launching CUDA kernels in parallel -

I know that NVIDIA GPUs with compute capability 2.x or greater can execute up to 16 kernels concurrently. However, my application spawns 7 processes, and each of these 7 processes launches CUDA kernels.

My first question is about the expected behavior of these kernels. Will they execute concurrently, or, since they are launched by different processes, will they execute sequentially?

I am confused because the CUDA C Programming Guide says:

"A kernel from one CUDA context cannot execute concurrently with a kernel from another CUDA context." This brings me to my second question: what are CUDA "contexts"?

Thanks!

A CUDA context is a virtual execution space that holds the code and data owned by a host thread or process. Only one context can ever be active on a GPU with current hardware.
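To make that concrete, here is a minimal sketch of how a context is created and made current using the CUDA driver API (error checking elided; the runtime API normally manages this context for you behind the scenes):

```cuda
#include <cuda.h>

int main(void)
{
    CUdevice  dev;
    CUcontext ctx;

    cuInit(0);                  // initialise the driver API
    cuDeviceGet(&dev, 0);       // first GPU in the system
    cuCtxCreate(&ctx, 0, dev);  // create a context on it and make it
                                // current for this host thread

    // Everything this thread now does -- module loads, memory
    // allocations, kernel launches -- happens inside this context.

    cuCtxDestroy(ctx);          // release the context and its resources
    return 0;
}
```

Each process that touches the GPU ends up with its own context like this, which is why seven processes mean seven separate contexts.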

So, to answer your first question: if you have seven separate threads or processes trying to establish a context and run on the same GPU simultaneously, they will be serialised, and any process waiting for access to the GPU will be blocked until the owner of the running context yields. There is, to the best of my knowledge, no time slicing, and the scheduling heuristics are not documented and (I suspect) not uniform from one operating system to another.
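By contrast, the "up to 16 kernels" figure from your question applies within a single context: kernels launched into different streams of the same context may overlap. A minimal sketch (the kernel body is just a placeholder):

```cuda
#include <cuda_runtime.h>

__global__ void busy(float *p)   // placeholder kernel
{
    p[threadIdx.x] += 1.0f;
}

int main(void)
{
    const int N = 4;
    cudaStream_t streams[N];
    float *buf;
    cudaMalloc(&buf, N * 32 * sizeof(float));

    for (int i = 0; i < N; ++i)
        cudaStreamCreate(&streams[i]);

    // Kernels in different streams of the SAME context may execute
    // concurrently on compute capability 2.x+ hardware; kernels from
    // different contexts never do.
    for (int i = 0; i < N; ++i)
        busy<<<1, 32, 0, streams[i]>>>(buf + i * 32);

    cudaDeviceSynchronize();

    for (int i = 0; i < N; ++i)
        cudaStreamDestroy(streams[i]);
    cudaFree(buf);
    return 0;
}
```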

You would be better off launching a single worker thread that holds the GPU context and using messaging from the other threads to push work onto the GPU. Alternatively, there is a context migration facility available in the CUDA driver API, but that only works with threads from the same process, and the migration mechanism has latency and host CPU overhead.
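One way to structure that single-worker-thread pattern is sketched below using C++11 threads. The `GpuWorker` class and `GpuTask` alias are hypothetical names, not part of any CUDA API; the tasks submitted to it would be the callables that actually do the memcpys and kernel launches, so that only one thread (and hence one context) ever touches the GPU:

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>

// Hypothetical work item: a callable performing one GPU task
// (e.g. a cudaMemcpy followed by a kernel launch).
using GpuTask = std::function<void()>;

class GpuWorker {
    std::queue<GpuTask>     tasks_;
    std::mutex              m_;
    std::condition_variable cv_;
    bool                    stop_ = false;
    std::thread             worker_;

    void run() {
        // This thread is the only one that touches the GPU, so the
        // CUDA context created on first use stays active throughout.
        for (;;) {
            GpuTask task;
            {
                std::unique_lock<std::mutex> lk(m_);
                cv_.wait(lk, [this] { return stop_ || !tasks_.empty(); });
                if (stop_ && tasks_.empty())
                    return;                 // drained and told to stop
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();                         // runs on the GPU thread
        }
    }

public:
    GpuWorker() : worker_(&GpuWorker::run, this) {}

    ~GpuWorker() {
        { std::lock_guard<std::mutex> lk(m_); stop_ = true; }
        cv_.notify_all();
        worker_.join();                     // drains remaining tasks
    }

    // Called from any other thread to push work onto the GPU thread.
    void submit(GpuTask t) {
        { std::lock_guard<std::mutex> lk(m_); tasks_.push(std::move(t)); }
        cv_.notify_one();
    }
};
```

The design point is that serialisation still happens, but in your own queue rather than in an undocumented driver arbitration scheme, so you control the ordering and can batch work sensibly.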
