Data model
Data storage must be isolated from data use for any code that is to run on the device. This allows low-level physics classes to operate on references to data using the exact same device/host code. Furthermore, state data (one per track) and shared data (definitions, persistent data, model data) should be separately allocated and managed.
- Params
Provide a host-side interface to manage and provide access to constant shared GPU data, usually model parameters or the like. The Params class itself can only be accessed via host code. A params class can contain metadata (string names, etc.) suitable for host-side debug output and for helping related classes convert from user-friendly input (e.g., particle name) to device-friendly IDs (e.g., particle ID). These classes should inherit from the ParamsDataInterface class to define uniform helper methods and types and will often implement the data storage by using CollectionMirror.
- State
Thread-local data specifying the state of a single particle track with respect to a corresponding params class (FooParams). In the main Celeritas stepping loop, all state data is managed via the CoreState class.
- View
Device-friendly class that provides read and/or write access to shared and local state data. The name is in the spirit of std::string_view, which adds functionality to non-owned data. It combines the state variables and model parameters into a single class. The constructor always takes const references to ParamsData and StateData as well as the track slot ID. It encapsulates the storage/layout of the state and parameters, as well as what (if any) data is cached in the state.
Hint
Consider the following example.
All Standard Model (SM) physics particles share a common set of properties such as mass and
charge, and each instance of a particle has a particular set of
associated variables such as kinetic energy. The shared data (SM parameters)
reside in ParticleParams, and the particle track properties are stored
as part of the core state.
A separate class, the ParticleTrackView, is instantiated with a
specific thread ID so that it acts as an accessor to the
stored data for a particular track. It can calculate properties that depend
on both the state and parameters. For example, momentum depends on both the
mass of a particle (constant, set by the model) and the speed (variable,
depends on particle track state).
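The params/state/view split described in this hint can be sketched in a few lines of host-only C++. This is a minimal illustration and not the Celeritas implementation: the `Toy*` names are hypothetical, the data lives in `std::vector` instead of Collections, and momentum is computed from the kinetic energy stored in the state (rather than from the speed) using the relativistic relation with c = 1.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical shared data: one entry per particle type (immutable after setup)
struct ToyParticleParamsData
{
    std::vector<double> mass;  // [MeV/c^2]
    std::vector<double> charge;  // [e]
};

// Hypothetical state data: one entry per track slot (mutable)
struct ToyParticleStateData
{
    std::vector<std::size_t> particle_id;  // index into the params arrays
    std::vector<double> kinetic_energy;  // [MeV]
};

// View: non-owning accessor that combines params and state for one track
class ToyParticleTrackView
{
  public:
    // NOTE: the state is taken by mutable reference only because this toy
    // stores data in std::vector rather than in non-owning references.
    ToyParticleTrackView(ToyParticleParamsData const& params,
                         ToyParticleStateData& state,
                         std::size_t track_slot)
        : params_(params), state_(state), slot_(track_slot)
    {
        assert(slot_ < state_.particle_id.size());
        assert(state_.particle_id[slot_] < params_.mass.size());
    }

    // Constant property: looked up from shared params via the particle ID
    double mass() const { return params_.mass[state_.particle_id[slot_]]; }

    // Variable property: read from or written to the track's own state
    double energy() const { return state_.kinetic_energy[slot_]; }
    void energy(double value) { state_.kinetic_energy[slot_] = value; }

    // Derived property [MeV/c] that needs both: p^2 = T^2 + 2 m T (c = 1)
    double momentum() const
    {
        double const t = this->energy();
        return std::sqrt(t * t + 2 * this->mass() * t);
    }

  private:
    ToyParticleParamsData const& params_;
    ToyParticleStateData& state_;
    std::size_t slot_;
};
```

The view owns nothing, so constructing one per track inside a loop or kernel is cheap; all persistent storage stays in the two data structs.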
Storage
- page Collection: a data portability class
The Collection manages data allocation and transfer between CPU and GPU.
Its primary design goal is facilitating construction of deeply hierarchical data on host at setup time and seamlessly copying to device. The templated T must be trivially copyable and destructible: either a fundamental data type or a struct of such types. (Some classes in external libraries, such as rocrand’s state types and VecGeom’s NavTuple types, are essentially trivial, but implement null-op destructors or optimized copy constructors, so we allow specialization through the celeritas::TriviallyCopyable class.)
An individual item in a Collection<T> can be accessed with ItemId<T>, a contiguous subset of items is accessed with ItemRange<T>, and the entirety of the data is accessed with AllItems<T>. All three of these classes are trivially copyable, so they can be embedded in structs that can be managed by a Collection. A group of Collections, one for each data type, can therefore be trivially copied to the GPU to enable arbitrarily deep and complex data hierarchies.
By convention, groups of Collections comprising the data for a single class or subsystem (such as RayleighInteractor or Physics) are stored in a helper struct suffixed with Data. For cases where there is both persistent data (problem-specific parameters) and transient data (track-specific states), the collections must be grouped into two separate classes. StateData is meant to be mutable and is never directly copied between host and device; its data collections are typically accessed by thread ID. ParamsData is immutable and always “mirrored” on both host and device. Sometimes it’s sensible to partition ParamsData into discrete helper structs (stored by value), each with a group of collections, and perhaps another struct that has non-templated scalars (since the default assignment operator is less work than manually copying scalars in a templated assignment operator).
A collection group has the following requirements to be compatible with the CollectionMirror (for “params” collection groups), CollectionStateStore (for “state” collection groups), and other such helper classes:
- Be a struct templated with template<Ownership W, MemSpace M>
- Contain only Collection objects and trivially copyable structs
- Define an operator bool that is true if and only if the class data is assigned and consistent
- Define a templated assignment operator on “other” Ownership and MemSpace which assigns every member to the right-hand-side’s member
Additionally, a StateData collection group must define:
- A member function size() returning the number of entries (i.e., number of threads)
- A free function resize with one of three signatures:
    void resize(
        StateData<Ownership::value, M>* data,
        HostCRef<ParamsData> const& params,
        StreamId stream,
        size_type size);
    // or...
    void resize(
        StateData<Ownership::value, M>* data,
        HostCRef<ParamsData> const& params,
        size_type size);
    // or...
    void resize(
        StateData<Ownership::value, M>* data,
        size_type size);
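The requirements listed above can be illustrated with a small host-only mock. Everything here is an assumption made for the sketch: `ToyCollection` always stores a `std::vector` and copies on assignment (a real reference collection is non-owning), the enums are reduced stand-ins, and `ToyStateData` is a hypothetical group using the simplest of the three `resize` signatures.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reduced stand-ins for the Celeritas enums and size type (assumptions)
enum class Ownership { value, reference, const_reference };
enum class MemSpace { host, device };
using size_type = unsigned int;

// Toy collection: always a host vector, whatever W and M are
template<class T, Ownership W, MemSpace M>
struct ToyCollection
{
    std::vector<T> data;

    // Templated assignment from a collection with other ownership/memspace
    template<Ownership W2, MemSpace M2>
    ToyCollection& operator=(ToyCollection<T, W2, M2> const& other)
    {
        data = other.data;
        return *this;
    }

    size_type size() const { return static_cast<size_type>(data.size()); }
    bool empty() const { return data.empty(); }
};

// Hypothetical state collection group
template<Ownership W, MemSpace M>
struct ToyStateData
{
    ToyCollection<double, W, M> energy;
    ToyCollection<int, W, M> counter;

    // True if and only if the data is assigned and consistent
    explicit operator bool() const
    {
        return !energy.empty() && energy.size() == counter.size();
    }

    // Number of entries (track slots)
    size_type size() const { return energy.size(); }

    // Assign every member from the right-hand side's member
    template<Ownership W2, MemSpace M2>
    ToyStateData& operator=(ToyStateData<W2, M2> const& other)
    {
        assert(other);
        energy = other.energy;
        counter = other.counter;
        return *this;
    }
};

// Free function: allocate and zero-fill storage for `size` track slots
template<MemSpace M>
void resize(ToyStateData<Ownership::value, M>* data, size_type size)
{
    assert(data && size > 0);
    data->energy.data.assign(size, 0.0);
    data->counter.data.assign(size, 0);
}
```

Keeping `resize` a free function lets a generic store find it through overload resolution on the state type, without the group needing to know how it will be allocated.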
By convention, related groups of collections are stored in a header file
named *Data.hh.
See ParticleParamsData and ParticleStateData for minimal examples of
using collections. The MaterialParamsData demonstrates additional
complexity by having a multi-level data hierarchy, and MaterialStateData
has a resize function that uses params data. PhysicsParamsData is a very
complex example, and VecgeomParamsData demonstrates how to use template
specialization to adapt Collections to another codebase with a different
convention for host-device portability.
A common paradigm for managing host-device data is to have a small
fixed-size POD struct called a record that contains attributes about an
item. These often need to reference a variable-sized range of data and do so
by storing an ItemRange or ItemMap. These two types are offsets into
“backend” data stored by a collection group.
-
enum class celeritas::MemSpace
Memory location of data.
Values:
-
enumerator host
CPU memory.
-
enumerator device
GPU memory.
-
enumerator mapped
Unified virtual address space (both host and device)
-
enumerator size_
-
enumerator native
When compiling CUDA files, device; otherwise host.
-
enum class celeritas::Ownership
Data ownership flag.
Values:
-
enumerator value
The collection owns the data.
-
enumerator reference
Mutable reference to data.
-
enumerator const_reference
Immutable reference to data.
-
template<class ItemT, class SizeT = ::celeritas::size_type>
class OpaqueId
Type-safe index for accessing an array or collection of data.
It’s common for classes and functions to take multiple indices, especially for O(1) indexing for performance. By annotating these values with a type, we give them semantic meaning, and we gain compile-time type safety.
If this class is used for indexing into an array, then the ItemT argument should usually be the value type of the array:
    Foo operator[](OpaqueId<Foo>)
An OpaqueId object evaluates to true if it has a value, or false if it does not (a “null” ID, analogous to a null pointer: it does not correspond to a valid value). A “true” ID will always compare less than a “false” ID: you can use std::partition and erase to remove invalid IDs from a vector.
See also id_cast below for checked construction of OpaqueIds from generic integer values (avoiding compile-time warnings or errors from signed/truncated integers).
- Todo:
This interface will be changed to be more like std::optional: size_type will become value_type (the value of a ‘dereferenced’ ID) and operator* or value will be used to access the integer.
- Template Parameters:
ItemT – Type of an item at the index corresponding to this ID
SizeT – Unsigned integer index
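To make the type-safety and null-ID behavior concrete, here is a minimal sketch of an ID class in the same spirit. `ToyId`, `remove_invalid`, and the tag structs are hypothetical names introduced for illustration; the sketch assumes the null state is stored as the largest value of the unsigned index type, which is what makes a valid ID compare less than a null one.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <type_traits>
#include <vector>

// Hypothetical simplified ID: an unsigned index tagged with an item type
template<class ItemT, class SizeT = unsigned int>
class ToyId
{
  public:
    ToyId() = default;  // default-constructed IDs are "null"
    explicit ToyId(SizeT index) : value_(index) {}

    // True if the ID holds a valid index
    explicit operator bool() const { return value_ != null_value(); }

    // Checked access to the underlying integer
    SizeT get() const
    {
        assert(*this);
        return value_;
    }

    // Unchecked access (may return the null sentinel)
    SizeT unchecked_get() const { return value_; }

  private:
    static constexpr SizeT null_value()
    {
        return std::numeric_limits<SizeT>::max();
    }

    SizeT value_{null_value()};
};

// Valid IDs sort before null IDs because the sentinel is the largest value
template<class I, class S>
bool operator<(ToyId<I, S> lhs, ToyId<I, S> rhs)
{
    return lhs.unchecked_get() < rhs.unchecked_get();
}

// Partition valid IDs to the front, then erase the null tail
template<class IdT>
void remove_invalid(std::vector<IdT>& ids)
{
    auto first_null = std::partition(
        ids.begin(), ids.end(), [](IdT id) { return static_cast<bool>(id); });
    ids.erase(first_null, ids.end());
}

struct Material;
struct Element;
using MaterialId = ToyId<Material>;
using ElementId = ToyId<Element>;

// IDs of different item types (and raw integers) do not convert implicitly
static_assert(!std::is_convertible<ElementId, MaterialId>::value,
              "distinct ID types must not be interchangeable");
static_assert(!std::is_convertible<unsigned int, MaterialId>::value,
              "construction from an integer must be explicit");
```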
-
template<class IdT, class U>
inline IdT celeritas::id_cast(U value) noexcept(!CELERITAS_DEBUG)
Safely create an OpaqueId from an integer of any type.
This asserts that the integer is in the valid range of the target ID type, and casts to it.
Note
The value cannot be the underlying “null” value; i.e., static_cast<FooId>(FooId{}.unchecked_get()) will not work.
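The range check behind such a cast can be sketched as follows. This is not the Celeritas `id_cast`: `ToyFooId`, `is_valid_index`, and `checked_id_cast` are hypothetical, and the sketch assumes the null value is the maximum of the ID's unsigned index type. It rejects negative inputs, values too large for the index type, and the null sentinel itself.

```cpp
#include <cassert>
#include <limits>
#include <type_traits>

// Hypothetical minimal ID type; the largest index value means "null"
struct ToyFooId
{
    using size_type = unsigned int;

    static constexpr size_type null_value()
    {
        return std::numeric_limits<size_type>::max();
    }

    ToyFooId() = default;
    explicit ToyFooId(size_type v) : value(v) {}
    explicit operator bool() const { return value != null_value(); }

    size_type value{null_value()};
};

// Whether an integer of any type maps to a valid (non-null) index
template<class IdT, class U>
bool is_valid_index(U value)
{
    static_assert(std::is_integral<U>::value, "expected an integer type");
    using Wide = unsigned long long;

    if (std::is_signed<U>::value && value < 0)
    {
        // Negative values would wrap around when cast to unsigned
        return false;
    }
    // Strictly less than the sentinel: excludes both overflow and "null"
    return static_cast<Wide>(value) < static_cast<Wide>(IdT::null_value());
}

// Assert validity (debug builds only), then cast to the ID type
template<class IdT, class U>
IdT checked_id_cast(U value)
{
    assert((is_valid_index<IdT, U>(value)));
    return IdT{static_cast<typename IdT::size_type>(value)};
}
```

As with the `noexcept(!CELERITAS_DEBUG)` signature above, the check in this sketch is only active when assertions are enabled, so release builds pay nothing beyond the cast.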
-
template<class T, class U = size_type>
using celeritas::ItemId = OpaqueId<T, U>
Opaque ID representing a single element of a container.
-
template<class T, class Size = size_type>
using celeritas::ItemRange = Range<OpaqueId<T, Size>>
Reference a contiguous range of IDs corresponding to a slice of items.
An ItemRange is a range of OpaqueId<T> that reference a range of values of type T in a Collection. The ItemRange acts like a slice object in Python when used on a Collection, returning a Span<T> of the underlying data.
An ItemRange is only meaningful in connection with a particular Collection of type T. It doesn’t have any persistent connection to its associated collection and thus must be used carefully.
    struct MyMaterial
    {
        real_type number_density;
        ItemRange<ElementComponents> components;
    };

    template<Ownership W, MemSpace M>
    struct MyData
    {
        Collection<ElementComponents, W, M> components;
        Collection<MyMaterial, W, M> materials;
    };
- Template Parameters:
T – The value type of items to represent.
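The slice-like behavior can be sketched with a host-only toy container. The names `ToyItemRange`, `ToySpan`, and `ToyHostCollection` are hypothetical and the real classes differ (for example, they use typed IDs rather than raw offsets); the point is only that a range is a pair of offsets that becomes a view of data when applied to its collection.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical range: a pair of offsets with no link to any container
struct ToyItemRange
{
    std::size_t begin;
    std::size_t end;

    std::size_t size() const { return end - begin; }
};

// Hypothetical non-owning view of contiguous data
template<class T>
struct ToySpan
{
    T const* ptr;
    std::size_t count;

    std::size_t size() const { return count; }
    T const& operator[](std::size_t i) const
    {
        assert(i < count);
        return ptr[i];
    }
};

// Hypothetical host-only collection built incrementally at setup time
template<class T>
class ToyHostCollection
{
  public:
    // Append values and return the range of offsets they occupy
    ToyItemRange insert_back(std::vector<T> const& values)
    {
        ToyItemRange result{data_.size(), data_.size() + values.size()};
        data_.insert(data_.end(), values.begin(), values.end());
        return result;
    }

    // "Slice" the collection: turn a range of offsets into a view of data
    ToySpan<T> operator[](ToyItemRange r) const
    {
        assert(r.begin <= r.end && r.end <= data_.size());
        return {data_.data() + r.begin, r.size()};
    }

  private:
    std::vector<T> data_;
};
```

In this toy, a range stays usable after later insertions because it stores offsets, whereas a span obtained before an insertion may dangle if the vector reallocates. It also shows why a range must be used carefully: nothing stops it from being applied to the wrong collection.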
-
template<class T1, class T2>
class ItemMap
Access data in a Range<T2> with an index of type T1.
Here, T1 and T2 are expected to be OpaqueId types. This is simply a type-safe “offset” with range checking.
Example:
    using ElComponentId = OpaqueId<struct ElComp_>;
    using MatId = OpaqueId<struct MaterialRecord>;

    // POD struct (record) describing a material
    struct MaterialRecord
    {
        using DoubleId = ItemId<double>;  // same as OpaqueId
        ItemMap<ElComponentId, DoubleId> components;
    };

    template<Ownership W, MemSpace M>
    struct MatParamsData
    {
        Collection<MaterialRecord, W, M> materials;
        Collection<double, W, M> doubles;  // Backend storage
        // ...
    };
Here, components semantically refers to a contiguous range of real values in the doubles collection, where ElComponentId{0} is the first value in that range. Dereferencing the value requires using the map alongside the backend storage:
    template<Ownership W, MemSpace M>
    double get_value(MatParamsData<W, M> const& params, MatId m, ElComponentId ec)
    {
        MaterialRecord const& mat = params.materials[m];
        ItemId<double> dbl_id = mat.components[ec];
        return params.doubles[dbl_id];
    }
Note that this access requires only two indirections, as ItemMap is merely performing integer arithmetic.
-
template<class T, Ownership W, MemSpace M, class I = ItemId<T>>
class Collection
Manage generic array-like data ownership and transfer from host to device.
Data are constructed incrementally on the host, then copied (along with their associated ItemRange) to device. A Collection can act as a std::vector<T>, DeviceVector<T>, Span<T>, or Span<const T>. The Spans can point to host or device memory, but the MemSpace template argument protects against accidental accesses from the wrong memory space.
Each Collection object is usually accessed with an ItemRange, which references a contiguous set of elements in the Collection. For example, setup code on the host would extend the Collection with a series of vectors, the addition of which returns an ItemRange that returns the equivalent data on host or device. This methodology allows complex nested data structures to be built up quickly at setup time without knowing the size requirements beforehand.
Host-device functions and classes should use Collection with a reference or const_reference Ownership, and the MemSpace::native type, which expects device memory when compiled inside a CUDA file and host memory when used inside a C++ source or test. (This design choice prevents a single CUDA file from compiling separate host-compatible and device-compatible compute kernels, but in the case of Celeritas this situation won’t arise, because we always want to build host code in C++ files for development ease and to allow testing when CUDA is disabled.)
A MemSpace::mapped collection will be accessible on both host and device. Unified addressing must be supported by the current device, or an exception will be thrown when initializing the collection. Memory pages will reside in “pinned” memory on host, and each access from device code to a changed page will require a slow memory transfer. Allocating pinned memory is slow and reduces the memory available to the system, so only allocate the smallest amount needed with the longest possible lifetime. Frequently accessing data from device code will result in low performance. Use cases for mapped memory are:
- as a source or destination memory space for asynchronous operations,
- on integrated GPU architecture, or
Accessing a const_reference collection in device memory will return a wrapper container that accesses the low-level data through the celeritas::ldg wrapper function, which can accelerate random access on GPU by telling the compiler the memory will not be changed during the lifetime of the kernel. Therefore it is important to only use const Collections for shared, immutable-after-creation “params” data.
-
template<template<Ownership, MemSpace> class P>
class CollectionMirror : public celeritas::ParamsDataInterface<P>
Store and reference persistent collection groups on host and device.
This should generally be an implementation detail of Params classes, which are constructed on host and must have the same data both on host and device. The template P must be a FooData class that:
- Is templated on ownership and memory space
- Has a templated assignment operator to copy from one space to another
- Has a boolean operator returning whether it’s in a valid state.
On assignment, it will copy the data to the device if the GPU is enabled.
- Todo:
Rename ParamsDataStore
- Example:
    class FooParams
    {
      public:
        using CollectionDeviceRef
            = FooData<Ownership::const_reference, MemSpace::device>;

        const CollectionDeviceRef& device_ref() const
        {
            return data_.device_ref();
        }

      private:
        CollectionMirror<FooData> data_;
    };
- Template Parameters:
P – Params data collection group
-
template<template<Ownership, MemSpace> class S, MemSpace M>
class CollectionStateStore
Store and reference stateful collection groups on host and device.
This can be used for unit tests (MemSpace is host) as well as production code. States generally shouldn’t be copied between host and device, so the only “production use case” construction argument is the size. Other constructors are implemented for convenience in unit tests.
The State class must be templated on ownership and memory space, and additionally must have an operator bool(), a templated operator=, and a size() accessor. It must also define a free function “resize” that takes:
- REQUIRED: a pointer to the state with Ownership::value semantics
- OPTIONAL: an Ownership::const_reference instance of MemSpace::host params data
- OPTIONAL: a StreamId for setting up thread/task-local data
- REQUIRED: a size_type for specifying the size of the new state.
- Todo:
Rename StateDataStore
- Example:
    CollectionStateStore<ParticleStateData, MemSpace::device> pstates(
        *particle_params, num_tracks);
    state_data.particle = pstates.ref();
- Template Parameters:
S – State data collection group
Containers
These are containers and container-like objects used throughout Celeritas.
-
template<class T, ::celeritas::size_type N>
class Array
Three-dimensional cartesian coordinates.
Fixed-size simple array for storage.
The Array class is primarily used for point coordinates (e.g., Real3) but is also used for other fixed-size data structures.
This isn’t fully standards-compliant with std::array: there’s no support for N=0, for example. Additionally it uses the native celeritas size_type, even though this has no effect on generated code for values of N inside the range of size_type. Arrays are also zero-initialized by default.
Note
For supplementary functionality, include:
- corecel/math/ArrayUtils.hh for real-number vector/matrix applications
- corecel/math/ArrayOperators.hh for mathematical operators
- ArrayIO.hh for streaming and string conversion
- ArrayIO.json.hh for JSON input and output
-
template<class E, class T>
class EnumArray
Thin wrapper for an array of enums for accessing by enum instead of int.
The enum must be a zero-indexed contiguous enumeration with a size_ enumeration as its last value.
- Todo:
The template parameters are reversed!!!
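A minimal sketch of the idea, assuming nothing beyond the description above: `ToyEnumArray` and `Axis` are hypothetical names, and the array length is taken from the trailing `size_` enumerator so that the enum and the storage cannot drift apart.

```cpp
#include <cassert>
#include <cstddef>

// Zero-indexed contiguous enumeration with size_ as its last value
enum class Axis
{
    x,
    y,
    z,
    size_
};

// Hypothetical thin wrapper: a fixed-size array indexed by enum
template<class E, class T>
struct ToyEnumArray
{
    T data[static_cast<std::size_t>(E::size_)];

    // Number of elements, deduced from the enum
    static constexpr std::size_t size()
    {
        return static_cast<std::size_t>(E::size_);
    }

    T& operator[](E e)
    {
        assert(e < E::size_);
        return data[static_cast<std::size_t>(e)];
    }

    T const& operator[](E e) const
    {
        assert(e < E::size_);
        return data[static_cast<std::size_t>(e)];
    }
};
```

Because `operator[]` only accepts the enum type, indexing with a raw integer or with a different enumeration fails to compile.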
-
template<class T>
class Range
Proxy container for iterating over a range of integral values.
Here, T can be any of:
- an integer,
- an enum that has contiguous zero-indexed values and a “size_” enumeration value indicating how many, or
- an OpaqueId.
It is OK to dereference the end iterator! The result should just be the off-the-end value for the range, e.g., FooEnum::size_ or bar.size().
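The reason the end iterator is safe to dereference is that the iterator stores a value rather than pointing into storage. The sketch below shows this for plain integers only; `ToyIntRange` is a hypothetical name and omits the enum and OpaqueId support described above.

```cpp
#include <cassert>

// Hypothetical proxy container over the half-open range [begin, end)
template<class T>
class ToyIntRange
{
  public:
    class iterator
    {
      public:
        explicit iterator(T value) : value_(value) {}

        // The iterator holds the value itself, so there is no memory access
        T operator*() const { return value_; }

        iterator& operator++()
        {
            ++value_;
            return *this;
        }

        bool operator==(iterator const& other) const
        {
            return value_ == other.value_;
        }
        bool operator!=(iterator const& other) const
        {
            return value_ != other.value_;
        }

      private:
        T value_;
    };

    ToyIntRange(T begin, T end) : begin_(begin), end_(end)
    {
        assert(!(end < begin));
    }

    iterator begin() const { return iterator{begin_}; }
    iterator end() const { return iterator{end_}; }
    T size() const { return end_ - begin_; }

  private:
    T begin_;
    T end_;
};
```

No storage is allocated for the values, so a range-based `for` loop over this type has the same cost as a hand-written counting loop.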
-
template<class T, std::size_t Extent = dynamic_extent>
class Span
Non-owning reference to a contiguous span of data.
This Span class is a modified backport of the C++20 std::span. In Celeritas, it is often used as a return value from accessing elements in a Collection.
Like celeritas::Array, this class is not 100% compatible with the std::span class. The hope is that it will be complete and correct for the use cases needed by Celeritas (and, as a bonus, it will be device-compatible).
Span can be instantiated with the special marker type LdgValue<T> to optimize constant data access in global device memory. In that case, data returned by front, back, operator[], and begin/end iterators use value semantics instead of reference. The data accessor still returns a pointer to the underlying memory and can be used to bypass using LdgIterator.
- Template Parameters:
T – value type
Extent – fixed size; defaults to dynamic.
Auxiliary storage
Users and other parts of the code can add their own shared and stream-local
(i.e., thread-local) data to Celeritas using the
celeritas::AuxParamsInterface and
celeritas::AuxStateInterface classes, accessed through the
celeritas::AuxParamsRegistry and
celeritas::AuxStateVec classes, respectively.
-
class AuxParamsInterface
Base class for extensible shared data that has associated state.
Auxiliary data can be added to an AuxParamsInterface at runtime to be passed among multiple classes, and then dynamic_cast to the expected type. It needs to supply a factory function for creating a state instance for multithreaded data on a particular stream and a given memory space. Classes can inherit both from AuxParamsInterface and other ActionInterface classes.
Subclassed by celeritas::AuxParams< StatusCheckParamsData, StatusCheckStateData >, celeritas::ActionTimes, celeritas::AuxParams< P, S >, celeritas::ExtendFromPrimariesAction, celeritas::OffloadGatherAction, celeritas::SlotDiagnostic, celeritas::optical::GeneratorBase
-
class AuxStateInterface
Auxiliary state data owned by a single stream.
This interface class is strictly to allow polymorphism and dynamic casting. It does not include attributes like size or memspace, because not all use cases require it.
Subclassed by celeritas::ActionTimesState, celeritas::AuxState< S, M >, celeritas::GeneratorStateBase, celeritas::PrimaryStateData< M >, celeritas::optical::CoreStateInterface
-
class AuxParamsRegistry
Manage auxiliary parameter classes.
This class keeps track of AuxParamsInterface classes.
-
class AuxStateVec
Manage single-stream auxiliary state data.
This class is constructed from an AuxParamsRegistry after the params are completely added and while the state is being constructed (with its size, etc.). The AuxId for an element of this class corresponds to the AuxParamsRegistry.
This class can be empty either by default or if the given auxiliary registry doesn’t have any entries.
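The interplay of the four classes above (params with a state factory, a registry, a per-stream state vector, and a `dynamic_cast` back to the concrete type) can be sketched with standard C++ only. All `Toy*` and `Counter*` names are hypothetical, plain `std::size_t` indices stand in for AuxId, and memory spaces and stream IDs are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical state interface: exists only for polymorphism
struct ToyAuxState
{
    virtual ~ToyAuxState() = default;
};

// Hypothetical params interface with a factory for per-stream state
struct ToyAuxParams
{
    virtual ~ToyAuxParams() = default;
    virtual std::unique_ptr<ToyAuxState>
    create_state(std::size_t num_tracks) const = 0;
};

// Keeps track of params; the returned index plays the role of an aux ID
class ToyAuxRegistry
{
  public:
    std::size_t insert(std::shared_ptr<ToyAuxParams const> params)
    {
        assert(params);
        params_.push_back(std::move(params));
        return params_.size() - 1;
    }

    std::size_t size() const { return params_.size(); }
    ToyAuxParams const& at(std::size_t id) const { return *params_.at(id); }

  private:
    std::vector<std::shared_ptr<ToyAuxParams const>> params_;
};

// One state per registered params, built once the registry is complete
class ToyAuxStateVec
{
  public:
    ToyAuxStateVec(ToyAuxRegistry const& registry, std::size_t num_tracks)
    {
        for (std::size_t id = 0; id < registry.size(); ++id)
        {
            states_.push_back(registry.at(id).create_state(num_tracks));
        }
    }

    std::size_t size() const { return states_.size(); }
    ToyAuxState& at(std::size_t id) { return *states_.at(id); }

  private:
    std::vector<std::unique_ptr<ToyAuxState>> states_;
};

// Example user extension: a per-track counter
struct CounterState : ToyAuxState
{
    std::vector<int> counts;
};

struct CounterParams : ToyAuxParams
{
    std::unique_ptr<ToyAuxState>
    create_state(std::size_t num_tracks) const override
    {
        auto state = std::make_unique<CounterState>();
        state->counts.assign(num_tracks, 0);
        return std::unique_ptr<ToyAuxState>(std::move(state));
    }
};
```

The same index addresses the params in the registry and the matching state in the state vector, which is what lets user code recover its own state from the common container.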
Auxiliary collection groups
-
template<template<Ownership, MemSpace> class P, template<Ownership, MemSpace> class S>
class AuxParams : public celeritas::AuxParamsInterface, public celeritas::ParamsDataInterface<P>
Construct and manage portable dynamic data.
This generalization of the Celeritas data model manages some of the boilerplate code for the common use case of having portable “params” data (e.g., model data) and “state” data (e.g., temporary values used across multiple kernels or processed into user space). Each state/stream will have an instance of AuxState accessible by this class. An instance of this class can be shared among multiple actions, or an action could inherit from it.
- Example:
The StepParams inherits from this class to provide access to host and state data. The execution inside StepGatherAction provides views to both the params and state data classes:
    // Extract the local step state data
    auto const& step_params = params_->ref<MemSpace::native>();
    auto& step_state = params_->ref<MemSpace::native>(state.aux());

    // Run the action
    auto execute = TrackExecutor{
        params.ptr<MemSpace::native>(),
        state.ptr(),
        detail::StepGatherExecutor<P>{step_params, step_state}};
Note
For the case where the aux state data contains host-side classes and data (e.g., an open file handle) you must manually set up the params/state data using AuxStateInterface and AuxParamsInterface.
- Template Parameters:
P – Params collection group
S – State collection group