Celeritas
0.5.0-86+4a8eea4
Kernel management helper functions.
#include <KernelParamCalculator.device.hh>
Classes
struct LaunchParams
    Parameters needed for a CUDA launch call.

Public Types
Type aliases
using dim_type = unsigned int

Public Member Functions
template<class F>
KernelParamCalculator (std::string_view name, F *kernel_func_ptr)
    Construct with the maximum threads per block for a given kernel.
template<class F>
KernelParamCalculator (std::string_view name, F *kernel_func_ptr, dim_type threads_per_block)
    Construct for the given global kernel F.
LaunchParams operator() (size_type min_num_threads) const
    Calculate launch params given the number of threads.

Static Public Member Functions
static CELER_FUNCTION ThreadId thread_id ()
    Get the linear thread ID.

Kernel management helper functions.
We assume that all kernel launches use 1-D thread indexing, which keeps the launch arithmetic simple. The dim_type alias should be the same size as the type of a single dim3 member (x/y/z).
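As a rough illustration of 1-D indexing, the sketch below shows the arithmetic usually involved: the block count is rounded up to cover the requested number of threads, and a linear thread index is formed from the block index, block size, and thread index within the block. It is plain host C++ so that it runs without a GPU. The names `LaunchParamsSketch`, `calc_launch_params`, and `linear_thread_id` are hypothetical, and the member names of the struct are assumptions rather than the actual Celeritas definitions.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for illustration; not the Celeritas definitions.
using dim_type = unsigned int;
using size_type = std::size_t;

// CUDA's dim3 stores x, y, and z as unsigned int.
static_assert(sizeof(dim_type) == sizeof(unsigned int),
              "dim_type should match the size of a dim3 member");

struct LaunchParamsSketch
{
    dim_type blocks_per_grid;
    dim_type threads_per_block;
};

// Round the block count up so that the grid covers at least
// min_num_threads threads.
inline LaunchParamsSketch
calc_launch_params(size_type min_num_threads, dim_type threads_per_block)
{
    assert(min_num_threads > 0);
    assert(threads_per_block > 0);

    size_type blocks
        = (min_num_threads + threads_per_block - 1) / threads_per_block;

    LaunchParamsSketch result;
    result.blocks_per_grid = static_cast<dim_type>(blocks);
    result.threads_per_block = threads_per_block;
    return result;
}

// Host-side analogue of blockIdx.x * blockDim.x + threadIdx.x.
inline size_type
linear_thread_id(dim_type block_idx, dim_type block_dim, dim_type thread_idx)
{
    assert(thread_idx < block_dim);
    return static_cast<size_type>(block_idx) * block_dim + thread_idx;
}
```

Because the block count is rounded up, the grid may contain more threads than requested, so a kernel would typically compare its linear thread ID against the requested count and return early when it is out of range.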
Constructing the param calculator registers the kernel's attributes with kernel_registry. The registration is an implementation detail in the .cc file, which hides inclusion of that interface from CUDA code. If kernel diagnostic profiling is enabled, the registry returns a pointer that this class uses to increment thread launch counters over the lifetime of the program.
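The registration pattern described above might look roughly like the following. This is only a sketch under stated assumptions: every class, member, and counter name here (`KernelRegistrySketch`, `KernelProfilingSketch`, `CalculatorSketch`, `num_launches`, `accum_threads`) is hypothetical and does not reflect the actual kernel_registry interface. It shows a registry that hands back a counter pointer only when profiling is enabled, and a calculator that updates the counters through that pointer on each launch.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>

// All names below are hypothetical; they only mimic the described behavior.
struct KernelProfilingSketch
{
    std::size_t num_launches = 0;
    std::size_t accum_threads = 0;
};

class KernelRegistrySketch
{
  public:
    explicit KernelRegistrySketch(bool profiling) : profiling_(profiling) {}

    // Record the kernel name; return counters only if profiling is enabled.
    KernelProfilingSketch* insert(std::string name)
    {
        names_.push_back(std::move(name));
        if (!profiling_)
        {
            return nullptr;
        }
        counters_.emplace_back();
        return &counters_.back();
    }

    std::size_t size() const { return names_.size(); }

  private:
    bool profiling_;
    std::deque<std::string> names_;
    // std::deque keeps element addresses valid when appending at the end,
    // so pointers handed out earlier stay usable.
    std::deque<KernelProfilingSketch> counters_;
};

class CalculatorSketch
{
  public:
    CalculatorSketch(KernelRegistrySketch& registry, std::string name)
        : profiling_(registry.insert(std::move(name)))
    {
    }

    // Called once per launch; updates the counters when they exist.
    void record_launch(std::size_t num_threads) const
    {
        if (profiling_)
        {
            ++profiling_->num_launches;
            profiling_->accum_threads += num_threads;
        }
    }

    KernelProfilingSketch const* profiling() const { return profiling_; }

  private:
    KernelProfilingSketch* profiling_;
};
```

Storing a possibly null pointer keeps the per-launch cost to a single branch when profiling is disabled, which is one plausible reason for a design like the one described.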
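template<class F>
KernelParamCalculator (std::string_view name, F *kernel_func_ptr, dim_type threads_per_block)    [inline]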
Construct for the given global kernel F.
This registers the kernel with celeritas::kernel_registry() and saves a pointer to the profiling data if profiling is enabled.