Celeritas  0.5.0-86+4a8eea4
celeritas::KernelParamCalculator Class Reference

Kernel management helper functions.

#include <KernelParamCalculator.device.hh>

Classes

struct  LaunchParams
 Parameters needed for a CUDA launch call.
 

Public Types

Type aliases
using dim_type = unsigned int
 

Public Member Functions

template<class F >
 KernelParamCalculator (std::string_view name, F *kernel_func_ptr)
 Construct with the maximum threads per block for a given kernel.
 
template<class F >
 KernelParamCalculator (std::string_view name, F *kernel_func_ptr, dim_type threads_per_block)
 Construct for the given global kernel F.
 
LaunchParams operator() (size_type min_num_threads) const
 Calculate launch params given the number of threads.
 

Static Public Member Functions

static CELER_FUNCTION ThreadId thread_id ()
 Get the linear thread ID.
 

Detailed Description

Kernel management helper functions.

All kernel launches are assumed to use 1-D thread indexing, which keeps the launch calculation simple. The dim_type alias should be the same size as the type of a single dim3 member (x/y/z).
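The 1-D launch arithmetic implied here can be sketched on the host without CUDA. This is a minimal illustration, not the Celeritas implementation: LaunchDims, calc_launch_dims, and linear_thread_id are hypothetical names, and the rounding rule is the conventional ceiling division assumed for a 1-D grid.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for illustration; not the Celeritas definitions.
using dim_type = unsigned int;

struct LaunchDims
{
    dim_type blocks_per_grid;
    dim_type threads_per_block;
};

// Round the block count up so that
// blocks_per_grid * threads_per_block >= min_num_threads.
inline LaunchDims
calc_launch_dims(std::size_t min_num_threads, dim_type threads_per_block)
{
    assert(min_num_threads > 0 && threads_per_block > 0);
    std::size_t blocks
        = (min_num_threads + threads_per_block - 1) / threads_per_block;
    return {static_cast<dim_type>(blocks), threads_per_block};
}

// Host-side model of the usual 1-D device index:
// blockIdx.x * blockDim.x + threadIdx.x
inline std::size_t
linear_thread_id(dim_type block_idx, dim_type block_dim, dim_type thread_idx)
{
    return static_cast<std::size_t>(block_idx) * block_dim + thread_idx;
}
```

Because the block count is rounded up, the grid may contain more threads than requested, so a kernel written against this scheme would normally compare its linear thread ID with the requested count before doing any work.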

Constructing the param calculator registers the kernel's attributes with kernel_registry. The registration is implemented in the .cc file so that the registry interface is never included in CUDA code. If kernel diagnostic profiling is enabled, the registry returns a pointer that this class uses to increment thread launch counters over the lifetime of the program.

static KernelParamCalculator calc_params("my", &my_kernel);
auto params = calc_params(states.size());
my_kernel<<<params.blocks_per_grid,
            params.threads_per_block>>>(kernel_args...);

Constructor & Destructor Documentation

◆ KernelParamCalculator()

template<class F >
celeritas::KernelParamCalculator::KernelParamCalculator ( std::string_view  name,
F *  kernel_func_ptr,
dim_type  threads_per_block 
)
inline

Construct for the given global kernel F.

This registers the kernel with celeritas::kernel_registry() and saves a pointer to the profiling data if profiling is to be used.
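The register-then-count pattern described above might look roughly like the following host-only sketch. Every name in it (ProfilingData, ToyRegistry, ToyCalculator) is hypothetical and chosen for illustration; the real celeritas::kernel_registry() interface and the fields it records are not shown in this page and may differ.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <string_view>

// Hypothetical per-kernel counters; not the Celeritas data layout.
struct ProfilingData
{
    std::string name;
    std::atomic<std::size_t> num_launches{0};
    std::atomic<std::size_t> num_threads{0};
};

// Hypothetical registry: stores one entry per registered kernel.
class ToyRegistry
{
  public:
    explicit ToyRegistry(bool profiling) : profiling_(profiling) {}

    // Record the kernel; hand back counters only when profiling is on.
    ProfilingData* insert(std::string_view name)
    {
        kernels_.emplace_back();
        ProfilingData& data = kernels_.back();
        data.name = std::string(name);
        return profiling_ ? &data : nullptr;
    }

    std::size_t size() const { return kernels_.size(); }

  private:
    bool profiling_;
    // A deque keeps element addresses stable as more kernels are added.
    std::deque<ProfilingData> kernels_;
};

// Hypothetical calculator: registers on construction, counts on each call.
class ToyCalculator
{
  public:
    ToyCalculator(ToyRegistry& reg,
                  std::string_view name,
                  unsigned int threads_per_block)
        : block_size_(threads_per_block), profiling_(reg.insert(name))
    {
    }

    // Return the number of blocks needed for at least min_num_threads.
    unsigned int operator()(std::size_t min_num_threads) const
    {
        if (profiling_)
        {
            profiling_->num_launches += 1;
            profiling_->num_threads += min_num_threads;
        }
        return static_cast<unsigned int>(
            (min_num_threads + block_size_ - 1) / block_size_);
    }

    ProfilingData const* profiling() const { return profiling_; }

  private:
    unsigned int block_size_;
    ProfilingData* profiling_;  // null when profiling is disabled
};
```

Returning a null pointer when profiling is off keeps the per-launch cost to a single branch, which is one plausible reason for saving a pointer rather than looking the kernel up on every launch.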


The documentation for this class was generated from the following files: