Celeritas 0.7.0-dev.164+develop.929c81eeb
Kernel management helper functions.
#include <KernelParamCalculator.device.hh>
Classes
| struct LaunchParams | Parameters needed for a CUDA launch call. |
Public Types
Type aliases
| using dim_type = unsigned int | |
Public Member Functions
| template<class F> KernelParamCalculator (std::string_view name, F *kernel_func_ptr) | Construct with the maximum threads per block for a given kernel. |
| template<class F> KernelParamCalculator (std::string_view name, F *kernel_func_ptr, dim_type threads_per_block) | Construct for the given global kernel F. |
| LaunchParams operator() (size_type min_num_threads) const | Calculate launch params given the number of threads. |
Static Public Member Functions
| static CELER_FUNCTION ThreadId thread_id () | Get the linear thread ID. |
We assume that all our kernel launches use 1-D thread indexing to make things easy. The dim_type alias should be the same size as the type of a single dim3 member (x/y/z).
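As a rough illustration of what 1-D indexing implies for the launch calculation, the sketch below rounds the requested thread count up to a whole number of blocks and computes a linear thread ID the way a 1-D kernel would (blockIdx.x * blockDim.x + threadIdx.x). It is host-only C++ written for this page, not the Celeritas implementation: the names LaunchParamsSketch, calc_launch_params, linear_thread_id, and count_type are hypothetical, and the member names of the real LaunchParams struct are not shown in this reference.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for illustration only; not the Celeritas types.
using dim_type = unsigned int;    // same width as one dim3 member (x/y/z)
using count_type = std::size_t;   // assumed type for a requested thread count

// A dim3 member is an unsigned int, so the alias should match its size.
static_assert(sizeof(dim_type) == sizeof(unsigned int),
              "dim_type should match a dim3 member");

struct LaunchParamsSketch
{
    dim_type blocks_per_grid{0};
    dim_type threads_per_block{0};
};

// Round up so that blocks_per_grid * threads_per_block >= min_num_threads.
inline LaunchParamsSketch
calc_launch_params(count_type min_num_threads, dim_type threads_per_block)
{
    assert(min_num_threads > 0 && threads_per_block > 0);
    LaunchParamsSketch result;
    result.threads_per_block = threads_per_block;
    result.blocks_per_grid = static_cast<dim_type>(
        (min_num_threads + threads_per_block - 1) / threads_per_block);
    return result;
}

// Host-side analogue of the 1-D linear thread ID used inside a kernel.
inline count_type
linear_thread_id(dim_type block_idx, dim_type block_dim, dim_type thread_idx)
{
    return static_cast<count_type>(block_idx) * block_dim + thread_idx;
}
```

Because the block count is rounded up, the last block can contain threads whose linear ID is at or beyond the requested count, so a kernel launched this way would normally compare its thread ID against the problem size before doing any work.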
Constructing the param calculator registers the kernel's attributes with kernel_registry. The registration is an implementation detail in the .cc file, which keeps the registry interface from being included in CUDA code. If kernel diagnostic profiling is enabled, the registry returns a pointer that this class uses to increment thread launch counters over the lifetime of the program.
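The registration pattern described above can be sketched in plain C++ as follows. This is a minimal model under stated assumptions, not the Celeritas registry: KernelRegistrySketch, KernelProfilingSketch, CalculatorSketch, and all of their members are hypothetical names, and the real registry stores kernel attributes that are omitted here. The point is only the ownership arrangement: the registry owns the counters, and the calculator holds a pointer that is null when profiling is disabled.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <string_view>

// Hypothetical counters; the real profiling data layout is not shown here.
struct KernelProfilingSketch
{
    std::size_t num_launches{0};
    std::size_t accum_threads{0};
};

struct KernelMetadataSketch
{
    std::string name;
    KernelProfilingSketch profiling;
};

// Hypothetical registry: owns metadata for every registered kernel.
class KernelRegistrySketch
{
  public:
    explicit KernelRegistrySketch(bool profiling) : profiling_(profiling) {}

    // Register a kernel; return counters only if profiling is enabled.
    KernelProfilingSketch* insert(std::string_view name)
    {
        assert(!name.empty());
        kernels_.push_back({std::string(name), {}});
        return profiling_ ? &kernels_.back().profiling : nullptr;
    }

    std::size_t size() const { return kernels_.size(); }

  private:
    bool profiling_;
    // std::deque keeps element addresses stable across push_back.
    std::deque<KernelMetadataSketch> kernels_;
};

// Hypothetical calculator: registers on construction, counts on each launch.
class CalculatorSketch
{
  public:
    CalculatorSketch(KernelRegistrySketch& registry, std::string_view name)
        : profiling_(registry.insert(name))
    {
    }

    void record_launch(std::size_t num_threads) const
    {
        if (profiling_)
        {
            ++profiling_->num_launches;
            profiling_->accum_threads += num_threads;
        }
    }

    KernelProfilingSketch const* profiling() const { return profiling_; }

  private:
    KernelProfilingSketch* profiling_;
};
```

Keeping the counters inside the registry means they outlive any individual calculator object, which is consistent with counters that accumulate over the lifetime of the program.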
Construct for the given global kernel F.
This registers the kernel with celeritas::kernel_registry() and saves a pointer to the profiling data if profiling is to be used.