Celeritas  0.5.0-56+6b053cd
celeritas::StackAllocator< T > Class Template Reference

Dynamically allocate arbitrary data on a stack.

#include <StackAllocator.hh>

Public Types

Type aliases
using value_type = T
 
using result_type = value_type *
 
using Data = StackAllocatorData< T, Ownership::reference, MemSpace::native >
 

Public Member Functions

CELER_FUNCTION StackAllocator (Data const &data)
 Construct with defaults.
 
CELER_FUNCTION size_type capacity () const
 Get the maximum number of values that can be allocated.
 
CELER_FUNCTION void clear ()
 Clear the stack allocator.
 
CELER_FUNCTION result_type operator() (size_type count)
 Allocate space for a given number of items.
 
CELER_FUNCTION size_type size () const
 Get the number of items currently present.
 
CELER_FUNCTION Span< value_type > get ()
 View all allocated data.
 
CELER_FUNCTION Span< value_type const > get () const
 View all allocated data (const).
 

Detailed Description

template<class T>
class celeritas::StackAllocator< T >

Dynamically allocate arbitrary data on a stack.

The stack allocator view acts as a functor and accessor to the allocated data. It enables very fast on-device dynamic allocation of data, such as secondaries or detector hits. As an example, inside a hypothetical physics Interactor class, you could create two particles with the following code:

struct Interactor
{
    StackAllocator<Secondary> allocate;

    // Sample an interaction
    template<class Engine>
    Interaction operator()(Engine&)
    {
        // Create 2 secondary particles
        Secondary* allocated = this->allocate(2);
        if (!allocated)
        {
            // The stack is full: report the failed allocation
            return Interaction::from_failure();
        }
        Interaction result;
        result.secondaries = Span<Secondary>{allocated, 2};
        return result;
    }
};

A later kernel could then iterate over the secondaries to apply cutoffs:

using SecondaryRef
    = StackAllocatorData<Secondary, Ownership::reference, MemSpace::device>;

__global__ void apply_cutoff(const SecondaryRef ptrs)
{
    // Linear index of this thread within the kernel launch
    auto thread_idx = KernelParamCalculator::thread_id().get();
    StackAllocator<Secondary> allocate(ptrs);
    auto secondaries = allocate.get();
    if (thread_idx < secondaries.size())
    {
        Secondary& sec = secondaries[thread_idx];
        // Zero the energy of secondaries below the 100 keV cutoff
        if (sec.energy < 100 * units::kilo_electron_volts)
        {
            sec.energy = 0;
        }
    }
}

You cannot safely access the current size of the stack in the same kernel that is modifying it. If a thread attempts to allocate beyond the end of the stack, a concurrent size() call can report that overflowed state rather than the corrected size that accounts for the failed allocation.
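The hazard described above can be sketched on the host by splitting an allocation into its two steps. This is a minimal illustration that assumes the size is bumped atomically first and rolled back afterwards on overflow; `ToyCounter`, `claim`, and `settle` are hypothetical names and are not part of Celeritas.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical shared state standing in for the stack's size and capacity.
// Illustration only: this is not the Celeritas implementation.
struct ToyCounter
{
    std::atomic<std::size_t> size{0};
    std::size_t capacity{0};
};

// First step of an allocation: claim `count` slots by bumping the size.
// Returns the index at which the claimed range starts.
inline std::size_t claim(ToyCounter& c, std::size_t count)
{
    assert(count > 0);
    return c.size.fetch_add(count);
}

// Second step: if the claimed range runs past capacity, give the slots back.
// Returns true if the allocation succeeded.
inline bool settle(ToyCounter& c, std::size_t start, std::size_t count)
{
    if (start + count > c.capacity)
    {
        c.size.fetch_sub(count);
        return false;
    }
    return true;
}
```

Any reader that loads `size` between `claim` and `settle` of an overflowing request sees a value larger than `capacity`. Under this assumed scheme, that is why the size is only trustworthy once all allocating threads have finished.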

A third kernel with a single thread would then be responsible for clearing the data:

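__global__ void clear_stack(const SecondaryRef ptrs)
{
    // Linear index of this thread within the kernel launch
    auto thread_idx = KernelParamCalculator::thread_id().get();
    StackAllocator<Secondary> allocate(ptrs);
    if (thread_idx == 0)
    {
        allocate.clear();
    }
}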

These separate kernel launches are needed as grid-level synchronization points.
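The allocate, cutoff, and clear sequence can be mimicked on the host to show the ordering that the kernel boundaries enforce. In this rough sketch each function stands in for one kernel launch and each loop iteration for one thread, so the function return plays the role of the synchronization point. `ToySecondary`, `ToyStack`, and the three `*_phase` functions are hypothetical and only illustrate the control flow.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical secondary record, for illustration only.
struct ToySecondary
{
    double energy;
};

// Hypothetical host-side stack: fixed storage plus a running size.
struct ToyStack
{
    std::vector<ToySecondary> storage;
    std::size_t size = 0;

    explicit ToyStack(std::size_t capacity) : storage(capacity) {}
};

// Phase 1: each emulated thread tries to push two secondaries.
// Returns the number of emulated threads whose allocation failed.
inline std::size_t
allocate_phase(ToyStack& stack, std::size_t num_threads, double energy)
{
    std::size_t failures = 0;
    for (std::size_t tid = 0; tid < num_threads; ++tid)
    {
        if (stack.size + 2 > stack.storage.size())
        {
            ++failures;
            continue;
        }
        stack.storage[stack.size++] = ToySecondary{energy};
        stack.storage[stack.size++] = ToySecondary{energy / 2};
    }
    return failures;
}

// Phase 2: with the size now fixed, zero every secondary below the cutoff.
// Returns the number of secondaries that were cut.
inline std::size_t cutoff_phase(ToyStack& stack, double cutoff)
{
    assert(cutoff >= 0);
    std::size_t num_cut = 0;
    for (std::size_t tid = 0; tid < stack.size; ++tid)
    {
        if (stack.storage[tid].energy < cutoff)
        {
            stack.storage[tid].energy = 0;
            ++num_cut;
        }
    }
    return num_cut;
}

// Phase 3: a single "thread" resets the stack for the next step.
inline void clear_phase(ToyStack& stack)
{
    stack.size = 0;
}
```

Calling the three functions in order on one `ToyStack` reproduces the pattern: no phase reads the size while another phase could still be changing it.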

Todo:
Instead of returning a pointer, return IdRange<T>. Rename StackAllocatorData to StackAllocation and have it look like a collection so that it will provide access to the data. Better yet, have a StackAllocation that can be a const_reference to the StackAllocatorData. Then the rule will be "you can't create a StackAllocator in the same kernel that you directly access a StackAllocation".

Member Function Documentation

◆ clear()

template<class T >
CELER_FUNCTION void celeritas::StackAllocator< T >::clear
inline

Clear the stack allocator.

This sets the size to zero. It should ideally be called by only a single thread (though multiple threads resetting it should also be OK), but it cannot be used in the same kernel that is allocating from or viewing the stack. This is because the access times of different threads or thread blocks are indeterminate inside a single kernel.

◆ get() [1/2]

template<class T >
CELER_FUNCTION auto celeritas::StackAllocator< T >::get
inline

View all allocated data.

This cannot be called while any running kernel could be modifying the size.

◆ get() [2/2]

template<class T >
CELER_FUNCTION auto celeritas::StackAllocator< T >::get
inline

View all allocated data (const).

This cannot be called while any running kernel could be modifying the size.

◆ operator()()

template<class T >
CELER_FUNCTION auto celeritas::StackAllocator< T >::operator() ( size_type  count)
inline

Allocate space for a given number of items.

Returns a null pointer if the allocation fails because the stack is out of memory. Ensures that the shared size reflects the amount of data actually allocated.
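The null-on-failure contract can be illustrated with a small host-only functor. The sketch assumes an atomic fetch-and-add followed by a rollback on overflow, which may differ from the actual implementation. `ToyStackAllocator` and `try_fill` are hypothetical names introduced here for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical host-only stand-in for a stack allocator view.
// Illustration only: this is not the Celeritas implementation.
template<class T>
class ToyStackAllocator
{
  public:
    ToyStackAllocator(T* storage,
                      std::size_t capacity,
                      std::atomic<std::size_t>* size)
        : storage_(storage), capacity_(capacity), size_(size)
    {
        assert(storage_ && size_);
    }

    // Returns a pointer to `count` contiguous items, or nullptr on overflow.
    T* operator()(std::size_t count)
    {
        std::size_t start = size_->fetch_add(count);
        if (start + count > capacity_)
        {
            // Roll back so the shared size counts only successful requests
            size_->fetch_sub(count);
            return nullptr;
        }
        return storage_ + start;
    }

    std::size_t size() const { return size_->load(); }

  private:
    T* storage_;
    std::size_t capacity_;
    std::atomic<std::size_t>* size_;
};

// Hypothetical caller: writes `count` copies of `value` and reports success.
inline bool
try_fill(ToyStackAllocator<int>& allocate, std::size_t count, int value)
{
    int* items = allocate(count);
    if (!items)
    {
        // Out of space: the caller must handle the failure explicitly
        return false;
    }
    for (std::size_t i = 0; i < count; ++i)
    {
        items[i] = value;
    }
    return true;
}
```

With a capacity of 4, a request for 3 items succeeds, a following request for 2 returns null and leaves the size at 3, and a final request for 1 still succeeds.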

Todo:
It might be useful to set an "out of memory" flag to make it easier for host code to detect whether a failure occurred, rather than looping through primaries and testing for failure.

◆ size()

template<class T >
CELER_FUNCTION auto celeritas::StackAllocator< T >::size
inline

Get the number of items currently present.

This value may not be meaningful (may be less than "actual" size) if called in the same kernel as other threads that are allocating.


The documentation for this class was generated from the following file: