Celeritas 0.6.0-dev.115+3b60a5fd
celeritas::StackAllocator< T > Class Template Reference

Dynamically allocate arbitrary data on a stack.

#include <StackAllocator.hh>

Public Types

Type aliases
using value_type = T
 
using result_type = value_type *
 
using Data = StackAllocatorData< T, Ownership::reference, MemSpace::native >
 

Public Member Functions

CELER_FUNCTION StackAllocator (Data const &data)
 Construct with defaults.
 
CELER_FUNCTION size_type capacity () const
 Get the maximum number of values that can be allocated.
 
CELER_FUNCTION void clear ()
 Clear the stack allocator.
 
CELER_FUNCTION result_type operator() (size_type count)
 Allocate space for a given number of items.
 
CELER_FUNCTION size_type size () const
 Get the number of items currently present.
 
CELER_FUNCTION Span< value_type > get ()
 View all allocated data.
 
CELER_FUNCTION Span< value_type const > get () const
 View all allocated data (const).
 

Detailed Description

template<class T>
class celeritas::StackAllocator< T >

Dynamically allocate arbitrary data on a stack.

The stack allocator view acts as a functor and accessor to the allocated data. It enables very fast on-device dynamic allocation of data, such as secondaries or detector hits. As an example, inside a hypothetical physics Interactor class, you could create two particles with the following code:

struct Interactor
{
    // Sample an interaction
    template<class Engine>
    Interaction operator()(Engine&)
    {
        // Create 2 secondary particles
        Secondary* allocated = this->allocate(2);
        if (!allocated)
        {
            // The stack has no room left for the request
            return Interaction::from_failure();
        }
        Interaction result;
        result.secondaries = Span<Secondary>{allocated, 2};
        return result;
    }

    // Secondary stack storage and access
    StackAllocator<Secondary> allocate;
};

A later kernel could then iterate over the secondaries to apply cutoffs:

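using SecondaryRef = StackAllocator<Secondary>::Data;

__global__ void apply_cutoff(const SecondaryRef ptrs)
{
    // Linear index of this thread within the kernel launch
    auto thread_idx = KernelParamCalculator::thread_id().get();
    StackAllocator<Secondary> allocate(ptrs);
    auto secondaries = allocate.get();
    if (thread_idx < secondaries.size())
    {
        Secondary& sec = secondaries[thread_idx];
        if (sec.energy < 100 * units::kilo_electron_volts)
        {
            // Below the cutoff: zero the energy
            sec.energy = 0;
        }
    }
}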

You cannot safely access the current size of the stack in the same kernel that's modifying it – if the stack attempts to allocate beyond the end, then the size() call will reflect that overflowed state, rather than the corrected size reflecting the failed allocation.
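The overflow behavior described above can be illustrated with a minimal host-only sketch. It assumes the allocation works by optimistically incrementing a shared atomic size and rolling the increment back on failure; the class `SketchStack` and its members are hypothetical stand-ins and are not taken from the Celeritas source.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-only stand-in for a stack allocator; NOT the Celeritas
// implementation. All names here are illustrative assumptions.
class SketchStack
{
  public:
    explicit SketchStack(std::size_t capacity) : storage_(capacity) {}

    // Reserve `count` contiguous items, or return nullptr if they do not fit
    int* allocate(std::size_t count)
    {
        assert(count > 0);
        // Optimistically bump the shared size and remember the old value
        std::size_t start = size_.fetch_add(count);
        if (start + count > storage_.size())
        {
            // Overflow: until this rollback completes, any concurrent
            // reader of size() sees a value that includes the failed request
            size_.fetch_sub(count);
            return nullptr;
        }
        return storage_.data() + start;
    }

    // Only meaningful once no thread is still inside allocate()
    std::size_t size() const { return size_.load(); }

    std::size_t capacity() const { return storage_.size(); }

  private:
    std::vector<int> storage_;
    std::atomic<std::size_t> size_{0};
};
```

With a capacity of 4, a request for 3 items succeeds, a second request for 3 fails and leaves the size at 3 once the rollback has finished, and a later request for 1 item still fits. The window between the increment and the rollback is what makes a concurrent size() query unreliable.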

A third kernel with a single thread would then be responsible for clearing the data:

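__global__ void clear_stack(const SecondaryRef ptrs)
{
    // Linear index of this thread within the kernel launch
    auto thread_idx = KernelParamCalculator::thread_id().get();
    StackAllocator<Secondary> allocate(ptrs);
    if (thread_idx == 0)
    {
        // Only one thread needs to reset the size
        allocate.clear();
    }
}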

These separate kernel launches are needed as grid-level synchronization points.
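As a rough host-only model of that three-phase pattern, the sketch below treats each function call as one kernel launch, so that returning from a call stands in for the grid-level synchronization. The names `PhaseStack`, `allocate_phase`, `cutoff_phase`, and `clear_phase` are hypothetical, the loops merely simulate threads sequentially, and energies are plain `double` values rather than Celeritas quantities.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-only model, not Celeritas code: fixed-capacity storage
// plus a dynamic size.
struct PhaseStack
{
    explicit PhaseStack(std::size_t capacity) : storage(capacity), size(0) {}

    std::vector<double> storage;  // backing store with fixed capacity
    std::size_t size;  // number of items currently allocated
};

// "Kernel" 1: each simulated thread tries to push one energy value.
// Returns the number of failed allocations.
inline std::size_t
allocate_phase(PhaseStack& stack, std::vector<double> const& energies)
{
    std::size_t failures = 0;
    for (double energy : energies)
    {
        if (stack.size < stack.storage.size())
        {
            stack.storage[stack.size++] = energy;
        }
        else
        {
            ++failures;
        }
    }
    return failures;
}

// "Kernel" 2: the size is stable here, so the allocated range can be viewed
inline void cutoff_phase(PhaseStack& stack, double cutoff)
{
    assert(stack.size <= stack.storage.size());
    for (std::size_t i = 0; i < stack.size; ++i)
    {
        if (stack.storage[i] < cutoff)
        {
            stack.storage[i] = 0;
        }
    }
}

// "Kernel" 3: reset the size once nothing else is reading or allocating
inline void clear_phase(PhaseStack& stack)
{
    stack.size = 0;
}
```

Calling the three functions in order mirrors the allocate, view, and clear launches: no phase reads the size while another phase could still be changing it.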

Todo:
Instead of returning a pointer, return IdRange<T>. Rename StackAllocatorData to StackAllocation and have it look like a collection so that it will provide access to the data. Better yet, have a StackAllocation that can be a const_reference to the StackAllocatorData. Then the rule will be "you can't create a StackAllocator in the same kernel that you directly access a StackAllocation".

Member Function Documentation

◆ clear()

template<class T >
CELER_FUNCTION void celeritas::StackAllocator< T >::clear ( )
inline

Clear the stack allocator.

This sets the size to zero. Ideally only a single thread should call it (though multiple threads resetting it should also be OK), but it cannot be used in the same kernel that is allocating from or viewing the stack, because the relative timing of accesses by different threads or thread blocks is indeterminate inside a single kernel.

◆ get() [1/2]

template<class T >
CELER_FUNCTION auto celeritas::StackAllocator< T >::get ( )
inline

View all allocated data.

This cannot be called while any running kernel could be modifying the size.

◆ get() [2/2]

template<class T >
CELER_FUNCTION auto celeritas::StackAllocator< T >::get ( ) const
inline

View all allocated data (const).

This cannot be called while any running kernel could be modifying the size.

◆ operator()()

Allocate space for a given number of items.

Returns NULL if allocation failed due to out-of-memory. Ensures that the shared size reflects the amount of data allocated.

Todo:
It might be useful to set an "out of memory" flag to make it easier for host code to detect whether a failure occurred, rather than looping through primaries and testing for failure.

◆ size()

template<class T >
CELER_FUNCTION auto celeritas::StackAllocator< T >::size ( ) const
inline

Get the number of items currently present.

This value may not be meaningful (may be less than "actual" size) if called in the same kernel as other threads that are allocating.


The documentation for this class was generated from the following file: