Celeritas 0.6.0-dev.115+3b60a5fd
Set up per-process state/buffer capacities.
#include <Control.hh>
Public Attributes
size_type primaries {}
    Maximum number of primaries that can be buffered before stepping.
size_type initializers {}
    Maximum number of queued primaries+secondaries.
size_type tracks {}
    Maximum number of track slots to be simultaneously stepped.
std::optional< size_type > secondaries
    Maximum number of secondaries created per step.
size_type events {0}
    Maximum number of simultaneous events (zero for Geant4 integration).
Increasing these values increases resource requirements with the trade-off of (usually!) improving performance. A larger number of tracks in flight means improved performance on GPU because the standard kernel size increases, but it also means higher memory usage because of the larger number of full states. More initializers are necessary for more (and higher-energy) tracks when lots of particles are in flight and producing new child particles. More secondaries may be necessary if physical processes that produce many daughters (e.g., atomic relaxation or Bertini cascade) are active. The number of events in flight primarily increases the number of active tracks, possible initializers, and produced secondaries (NOTE: see #1233). Finally, the number of primaries is the maximum number of pending tracks from an external application before running a kernel to construct initializers and execute the stepping loop.
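As a rough illustration of how the attributes described above relate to each other, the sketch below fills in a local stand-in struct. It does not include the real header: `StateCapacitySketch` and `make_example_capacity` are hypothetical names, `size_type` is assumed to be an unsigned integer, and the numeric values are arbitrary placeholders rather than recommended settings.

```cpp
#include <optional>

// Assumption: size_type is an unsigned integer type.
using size_type = unsigned int;

// Local stand-in mirroring the documented public attributes.
// This is NOT the real Celeritas struct, only an illustration.
struct StateCapacitySketch
{
    size_type primaries{};     // buffered primaries before stepping
    size_type initializers{};  // queued primaries+secondaries
    size_type tracks{};        // track slots stepped simultaneously
    std::optional<size_type> secondaries;  // secondaries created per step
    size_type events{0};       // zero for Geant4 integration
};

// Hypothetical helper: the values are illustrative only.
inline StateCapacitySketch make_example_capacity()
{
    StateCapacitySketch cap;
    cap.tracks = 1u << 16;              // 65536 track slots
    cap.initializers = 8 * cap.tracks;  // room for queued child particles
    cap.primaries = cap.tracks;         // flush once this many are pending
    cap.events = 0;                     // documented Geant4-integration value
    // cap.secondaries is deliberately left unset so a default can apply
    return cap;
}
```

Raising `tracks` in a configuration like this one would be expected to increase both the kernel size and the memory footprint, as the description above explains.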
Capacities are defined as the number per application process (task): this means that in a multithreaded context it implies "strong scaling" (i.e., the allocations are divided among threads), and in a multiprocess context it implies "weak scaling" (the problem size grows with the number of processes). In other words, if used in a multithread "event-parallel" context, each state gets the specified tracks divided by the number of threads. When used in MPI parallel (e.g., one process per GPU), each process rank has tracks total threads.
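The arithmetic behind the two scaling modes can be sketched as follows. The helper names are hypothetical, and truncating integer division is an assumption: the text above does not say how a remainder is handled when the capacity does not divide evenly among threads.

```cpp
// Assumption: size_type is an unsigned integer type.
using size_type = unsigned int;

// Hypothetical helper for the multithreaded ("strong scaling") case:
// the per-process capacity is divided among the threads.
// Assumption: remainders are truncated.
constexpr size_type tracks_per_thread(size_type tracks, size_type num_threads)
{
    return num_threads == 0 ? 0 : tracks / num_threads;
}

// Hypothetical helper for the multiprocess ("weak scaling") case:
// every rank gets the full capacity, so the total grows with the rank count.
constexpr size_type total_tracks_all_ranks(size_type tracks,
                                           size_type num_ranks)
{
    return tracks * num_ranks;
}

// Compile-time checks of the two cases
static_assert(tracks_per_thread(4096, 8) == 512,
              "each of 8 threads gets 1/8 of the process capacity");
static_assert(total_tracks_all_ranks(4096, 4) == 16384,
              "4 ranks each hold the full capacity");
```

With `tracks = 4096`, eight event-parallel threads would each step 512 track slots, whereas four MPI ranks would step 4096 each.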
primaries was previously named auto_flush.
SetupOptions and celer-g4 treated these quantities as "per stream" whereas celer-sim used "per process".
Defaults:
secondaries: twice the number of track slots.
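A minimal sketch of how the documented default could be resolved from the optional attribute is shown below. `resolve_secondaries` is a hypothetical helper, not part of the library, and `size_type` is assumed to be an unsigned integer.

```cpp
#include <optional>

// Assumption: size_type is an unsigned integer type.
using size_type = unsigned int;

// Hypothetical helper: use the explicit value when one is set, otherwise
// fall back to the documented default of twice the number of track slots.
inline size_type
resolve_secondaries(std::optional<size_type> secondaries, size_type tracks)
{
    return secondaries.value_or(2 * tracks);
}
```

For example, `resolve_secondaries(std::nullopt, 1024)` yields 2048, while an explicitly set value is returned unchanged.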
Split this into "core" state capacity and "optical" state capacity? Core contains events and secondaries.
Instead of a special value events=0, make a variant or something more descriptive?
Some of these parameters will be more automated in the future.