C++ - How not to repeat myself without macros when writing similar CUDA kernels?
I have several CUDA kernels that basically do the same thing with some variations. I would like to reduce the amount of code needed. My first thought was to use macros, so the resulting kernels would look like this (simplified):
    __global__ void kernelA( ... )
    {
        INIT(); // macro to initialize variables

        // do specific stuff for kernelA
        b = a + c;

        END(); // macro to write the result
    }

    __global__ void kernelB( ... )
    {
        INIT(); // macro to initialize variables

        // do specific stuff for kernelB
        b = a - c;

        END(); // macro to write the result
    }
    ...
Since macros are nasty, ugly and evil, I am looking for a better and cleaner way. Any suggestions?
(A switch statement would not do the job: in reality, the parts that are the same and the parts that are kernel specific are pretty interwoven. Several switch statements would be needed, which would make the code pretty unreadable. Furthermore, function calls would not initialize the needed variables.)
(This question might be answerable for general C++ as well: just replace 'CUDA kernel' with 'function' and remove '__global__'.)
Updated: I was told in the comments that classes and inheritance don't mix well with CUDA. Therefore only the first part of this answer applies to CUDA, while the other parts answer the more general C++ part of your question.
For CUDA, you will have to use pure functions, "C-style":
    struct KernelVars {
        int a;
        int b;
        int c;
    };

    __device__ void init(KernelVars& vars)
    {
        INIT(); // whatever the actual code is
    }

    __device__ void end(KernelVars& vars)
    {
        END(); // whatever the actual code is
    }

    __global__ void kernelA(...)
    {
        KernelVars vars;
        init(vars);
        vars.b = vars.a + vars.c; // kernel-specific part
        end(vars);
    }
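If the kernel-specific part can be isolated into one step, the shared body can also be written once and parameterised on that step at compile time. The following is only a host-side C++ sketch of that idea, not code from the question: `AddOp`, `SubOp`, `run_kernel` and the `int a, int c, int* result` parameters are hypothetical names chosen for illustration, and the bodies of `init` and `end` are assumed stand-ins for whatever the macros really do. In CUDA the helpers would presumably be marked `__device__` and the generic body `__global__`.

```cpp
// Host-side C++ sketch; every name here is hypothetical.
struct KernelVars {
    int a;
    int b;
    int c;
};

// Shared setup: an assumed stand-in for the former INIT macro.
inline void init(KernelVars& vars, int a, int c) {
    vars.a = a;
    vars.b = 0;
    vars.c = c;
}

// Shared teardown: an assumed stand-in for the former END macro.
inline void end(const KernelVars& vars, int* result) {
    *result = vars.b;
}

// Kernel-specific steps, written as small stateless function objects.
struct AddOp {
    void operator()(KernelVars& vars) const { vars.b = vars.a + vars.c; }
};

struct SubOp {
    void operator()(KernelVars& vars) const { vars.b = vars.a - vars.c; }
};

// One generic body; the variation is selected at compile time through Op.
template <class Op>
void run_kernel(int a, int c, int* result) {
    KernelVars vars;
    init(vars, a, c);
    Op{}(vars);  // the only part that differs between kernels
    end(vars, result);
}

// Thin named wrappers, so callers keep the original kernel names.
void kernelA(int a, int c, int* result) { run_kernel<AddOp>(a, c, result); }
void kernelB(int a, int c, int* result) { run_kernel<SubOp>(a, c, result); }
```

Calling `kernelA(5, 3, &r)` stores the sum in `r`, and `kernelB(5, 3, &r)` stores the difference. No inheritance or virtual dispatch is involved, which is why this shape tends to carry over to device code more easily than the class-based variants.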
This part of the answer is for general C++, where you can use OOP techniques like constructors and destructors (they are perfectly suited for those init/end pairs), or the template method pattern, which can be used with other languages as well:
Using ctor/dtor and templates, "C++ style":
    class KernelBase {
    protected:
        int a, b, c;

    public:
        KernelBase() {
            INIT(); // replace with the contents of the macro
        }
        ~KernelBase() {
            END(); // replace with the contents of the macro
        }
        virtual void run() = 0;
    };

    struct KernelAdd : KernelBase {
        void run() { b = a + c; }
    };

    struct KernelSub : KernelBase {
        void run() { b = a - c; }
    };

    template<class K>
    void kernel(...)
    {
        K k;      // constructor performs the init step
        k.run();
    }             // destructor performs the end step

    void kernelA( ... ) { kernel<KernelAdd>(); }
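To make the constructor/destructor pairing concrete, here is a self-contained sketch in plain C++11. The constructor arguments and the `int* result` output pointer are assumptions made for illustration, since the question does not show what the macros actually initialise or write.

```cpp
// Sketch only: the init/end bodies and all parameters are assumed.
class KernelBase {
protected:
    int a, b, c;
    int* out;  // hypothetical destination for the result

public:
    // Plays the role of INIT: set up the shared variables.
    KernelBase(int a_in, int c_in, int* result)
        : a(a_in), b(0), c(c_in), out(result) {}

    // Plays the role of END: write the result when the object dies.
    virtual ~KernelBase() { *out = b; }

    virtual void run() = 0;
};

struct KernelAdd : KernelBase {
    using KernelBase::KernelBase;  // inherit the base constructor
    void run() override { b = a + c; }
};

struct KernelSub : KernelBase {
    using KernelBase::KernelBase;
    void run() override { b = a - c; }
};

template <class K>
void kernel(int a, int c, int* result) {
    K k(a, c, result);  // init happens here
    k.run();            // kernel-specific work
}                       // end happens here, when k is destroyed

void kernelA(int a, int c, int* result) { kernel<KernelAdd>(a, c, result); }
void kernelB(int a, int c, int* result) { kernel<KernelSub>(a, c, result); }
```

Because the result is written in the destructor, it is only available after `kernel<K>` returns. Since `K` is known at compile time, `run()` does not strictly need to be virtual in this variant; it is kept virtual here to stay close to the code above.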
Using the template method pattern, general "OOP style":
    class KernelBase {
        virtual void do_run() = 0;

    protected:
        int a, b, c;

    public:
        void run() { // the template method
            INIT();
            do_run();
            END();
        }
    };

    struct KernelAdd : KernelBase {
        void do_run() { b = a + c; }
    };

    struct KernelSub : KernelBase {
        void do_run() { b = a - c; }
    };

    void kernelA(...)
    {
        KernelAdd k;
        k.run();
    }
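A compilable version of the template method variant might look like the sketch below. As before, the inputs passed to `run` and the returned result are assumptions standing in for the unknown contents of the INIT and END macros.

```cpp
// Sketch only: the init/end steps inside run() are assumed.
class KernelBase {
public:
    virtual ~KernelBase() = default;

    // The template method: fixed skeleton, one customisable step.
    int run(int a_in, int c_in) {
        // init step
        a = a_in;
        b = 0;
        c = c_in;

        do_run();  // kernel-specific step supplied by the subclass

        // end step
        return b;
    }

protected:
    int a = 0, b = 0, c = 0;

private:
    virtual void do_run() = 0;
};

class KernelAdd : public KernelBase {
    void do_run() override { b = a + c; }
};

class KernelSub : public KernelBase {
    void do_run() override { b = a - c; }
};

// Works with any kernel through the base-class interface.
int run_any(KernelBase& k, int a, int c) { return k.run(a, c); }
```

Here `do_run` stays private, so callers can only go through `run`, which guarantees that the init and end steps are never skipped. If the shared and specific parts are interwoven, as the question describes, the same skeleton can call several private virtual hooks in sequence instead of just one.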