Core package#
The corecel directory contains functionality shared by Celeritas and ORANGE, primarily pertaining to GPU abstractions.
Configuration#
The celeritas_config.h configure file contains all-caps definitions of the CMake configuration options as 0/1 defines so they can be used with if constexpr and other C++ expressions. The celeritas_cmake_strings.h file defines static C strings with configuration options such as key dependent library versions. Finally, celeritas_version.h defines version numbers as a preprocessor definition, a set of integers, and a descriptive string.
Defines
-
CELERITAS_VERSION#
Celeritas version.
Encoded as a big-endian hexadecimal with one byte per component: (major * 256 + minor) * 256 + patch.
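As a worked example of the encoding, version 0.4.2 maps to 0x000402. The helper name encode_version below is hypothetical and not part of Celeritas; it only illustrates the arithmetic.

```cpp
#include <cassert>

// Hypothetical helper (not a Celeritas function): pack a version into one
// byte per component, following (major * 256 + minor) * 256 + patch.
constexpr unsigned int encode_version(unsigned int major,
                                      unsigned int minor,
                                      unsigned int patch)
{
    return (major * 256u + minor) * 256u + patch;
}

// Version 0.4.2 encodes to 0x000402
static_assert(encode_version(0, 4, 2) == 0x000402u, "0.4.2 -> 0x000402");

// Encoded versions compare in release order, so a numeric comparison
// against 0x000400 selects version 0.4.0 and newer.
static_assert(encode_version(0, 4, 2) >= 0x000400u, "0.4.2 is at least 0.4.0");
```

Because the components occupy separate bytes, a single integer comparison is enough to order two versions as long as each component is below 256.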
Variables
-
static const char celeritas_version[] = "0.4.2"#
Celeritas version string with git metadata.
-
static const int celeritas_version_major = 0#
Celeritas major version.
-
static const int celeritas_version_minor = 4#
Celeritas minor version.
-
static const int celeritas_version_patch = 2#
Celeritas patch version.
Fundamentals#
Language and compiler abstraction macro definitions.
The Macros file defines cross-platform (CUDA, C++, HIP) macros that expand to attributes depending on the compiler and build configuration.
Defines
-
CELER_FUNCTION#
Decorate a function that works on both host and device, with and without NVCC.
The name of this macro and its siblings is based on the Kokkos naming scheme.
-
CELER_FORCEINLINE#
-
CELER_FORCEINLINE_FUNCTION#
Like CELER_FUNCTION but forces inlining.
Compiler optimizers usually can tell what needs optimizing, but this macro can provide speedups (and smaller sampling profiles) when inlining optimizations are not enabled. It should be used sparingly.
-
CELER_CONSTEXPR_FUNCTION#
Decorate a function that works on both host and device, with and without NVCC, can be evaluated at compile time, and should be forcibly inlined.
-
CELER_UNLIKELY(COND)#
Mark the result of this condition to be “unlikely”.
This asks the compiler to move the section of code to a “cold” part of the instructions, improving instruction locality. It should be used primarily for error checking conditions.
-
CELER_UNREACHABLE#
Mark a point in code as being impossible to reach in normal execution.
See https://clang.llvm.org/docs/LanguageExtensions.html#builtin-unreachable or https://msdn.microsoft.com/en-us/library/1b3fsfxw.aspx or https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#__builtin_unreachable
(The “unreachable” and “assume” compiler optimizations for CUDA are only available in API version 11.3 or higher, which is encoded as major*1000 + minor*10.)
Note
This macro should not generally be used; instead, use the macro CELER_ASSERT_UNREACHABLE() defined in base/Assert.hh, which provides a more detailed error message in case the point is reached.
-
CELER_USE_DEVICE#
True if HIP or CUDA are enabled, false otherwise.
-
CELER_DEVICE_SOURCE#
Defined and true if building a HIP or CUDA source file.
This is a generic replacement for
__CUDACC__
.
-
CELER_DEVICE_COMPILE#
Defined and true if building device code in HIP or CUDA.
This is a generic replacement for
__CUDA_ARCH__
.
-
CELER_DEVICE_PREFIX(TOK)#
Add a prefix “hip” or “cuda” to a code token.
-
CELER_TRY_HANDLE(STATEMENT, HANDLE_EXCEPTION)#
“Try” to execute the statement, and “handle” all thrown errors by calling the given function-like error handler with a std::exception_ptr object.
Note
A file that uses this macro must include the <exception> header (but since the HANDLE_EXCEPTION needs to take an exception pointer, it’s got to be included anyway).
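The try/handle pattern described for CELER_TRY_HANDLE can be sketched with a stand-in macro. SKETCH_TRY_HANDLE, StoreFirstException, and demo_try_handle are hypothetical names introduced for illustration; the real macro may differ in detail.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <string>

// Stand-in for illustration only (not the Celeritas macro): run STATEMENT
// and forward any thrown exception to HANDLE_EXCEPTION as an exception_ptr.
#define SKETCH_TRY_HANDLE(STATEMENT, HANDLE_EXCEPTION)  \
    do                                                  \
    {                                                   \
        try                                             \
        {                                               \
            STATEMENT;                                  \
        }                                               \
        catch (...)                                     \
        {                                               \
            HANDLE_EXCEPTION(std::current_exception()); \
        }                                               \
    } while (0)

// Hypothetical handler: keeps only the first exception it receives.
struct StoreFirstException
{
    std::exception_ptr stored;

    void operator()(std::exception_ptr p)
    {
        if (!stored)
        {
            stored = p;
        }
    }
};

// Run two failing statements and return the message kept by the handler.
inline std::string demo_try_handle()
{
    StoreFirstException handle;
    SKETCH_TRY_HANDLE(throw std::runtime_error("track failed"), handle);
    SKETCH_TRY_HANDLE(throw std::runtime_error("second failure"), handle);

    std::string message;
    try
    {
        std::rethrow_exception(handle.stored);
    }
    catch (std::exception const& e)
    {
        message = e.what();
    }
    return message;
}
```

Deferring the rethrow like this lets a loop or parallel region finish its remaining work before the stored error is reported.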
-
CELER_TRY_HANDLE_CONTEXT(STATEMENT, HANDLE_EXCEPTION, CONTEXT_EXCEPTION)#
Try the given statement, and if it fails, chain it into the given exception.
The given CONTEXT_EXCEPTION must be an expression that yields an rvalue to a std::exception subclass that isn’t final. The resulting chained exception will be passed into HANDLE_EXCEPTION for processing.
-
CELER_DEFAULT_COPY_MOVE(CLS)#
Explicitly declare defaulted copy and move constructors and assignment operators.
Use this if the destructor is declared explicitly.
-
CELER_DELETE_COPY_MOVE(CLS)#
Explicitly declare deleted copy and move constructors and assignment operators.
Use this for scoped RAII classes.
-
CELER_DISCARD(CODE)#
The argument is an unevaluated operand which will generate no code but force the expression to be used.
This is used in place of the [[maybe_unused]] attribute, which actually generates warnings in older versions of GCC.
Celeritas assertions are only enabled when the CELERITAS_DEBUG configuration option is set. The macros CELER_EXPECT, CELER_ASSERT, and CELER_ENSURE correspond to “precondition contract”, “internal assertion”, and “postcondition contract”, respectively.
Macros, exceptions, and helpers for assertions and error handling.
This defines host- and device-compatible assertion macros that are toggled on the CELERITAS_DEBUG
configure macro.
Defines
-
CELER_DEBUG_FAIL(MSG, WHICH)#
Throw a debug assertion regardless of the CELERITAS_DEBUG setting.
This is used internally but is also useful for catching subtle programming errors in downstream code.
-
CELER_EXPECT(COND)#
Precondition debug assertion macro.
We “expect” that the input values or initial state satisfy a precondition, and we throw an exception in debug mode if they do not.
-
CELER_ASSERT(COND)#
Internal debug assertion macro.
This replaces standard
assert
usage.
-
CELER_ENSURE(COND)#
Postcondition debug assertion macro.
Use to “ensure” that return values or side effects are as expected when leaving a function.
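The precondition and postcondition roles can be sketched with stand-in macros. SKETCH_EXPECT, SKETCH_ENSURE, checked_sqrt, and violates_contract are hypothetical; unlike the real macros, these stand-ins are always on and throw std::logic_error rather than DebugError.

```cpp
#include <cassert>
#include <cmath>
#include <stdexcept>

// Stand-ins for illustration only: always enabled, throwing std::logic_error.
#define SKETCH_EXPECT(COND) \
    do { if (!(COND)) throw std::logic_error("precondition failed: " #COND); } while (0)
#define SKETCH_ENSURE(COND) \
    do { if (!(COND)) throw std::logic_error("postcondition failed: " #COND); } while (0)

// Hypothetical function: the input must be nonnegative (precondition) and
// the result must be nonnegative as well (postcondition).
inline double checked_sqrt(double x)
{
    SKETCH_EXPECT(x >= 0);
    double result = std::sqrt(x);
    SKETCH_ENSURE(result >= 0);
    return result;
}

// Return true if calling checked_sqrt with the given value throws.
inline bool violates_contract(double x)
{
    try
    {
        checked_sqrt(x);
    }
    catch (std::logic_error const&)
    {
        return true;
    }
    return false;
}
```

Placing the precondition first and the postcondition immediately before the return keeps the contract visible at the function boundaries.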
-
CELER_ASSUME(COND)#
Always-on compiler assumption.
This should be used very rarely: you should make sure the resulting assembly is simplified in optimize mode from using the assumption. For example, sometimes informing the compiler of an assumption can reduce code bloat by skipping standard library exception handling code (e.g. in std::visit by assuming !var_obj.valueless_by_exception()).
-
CELER_ASSERT_UNREACHABLE()#
Throw an assertion if the code point is reached.
When debug assertions are turned off, this changes to a compiler hint that improves optimization (and may force the code to exit unceremoniously if the point is encountered, rather than continuing on with undefined behavior).
-
CELER_VALIDATE(COND, MSG)#
Always-on runtime assertion macro.
This can check user input and input data consistency, and will raise RuntimeError on failure with a descriptive error message that is streamed as the second argument. This macro cannot be used in __device__-annotated code.
The error message should read:
"<PROBLEM> (<WHY IT'S A PROBLEM>) <SUGGESTION>?"
Examples with correct casing and punctuation:
"failed to open '{filename}' (should contain relaxation data)"
"unexpected end of file '{filename}' (data is inconsistent with boundaries)"
"MPI was not initialized (needed to construct a communicator). Maybe set the environment variable CELER_DISABLE_PARALLEL=1 to disable externally?"
"invalid min_range={opts.min_range} (must be positive)"
In practice this looks like:
CELER_VALIDATE(file_stream, << "failed to open '" << filename << "' (should contain relaxation data)");
An always-on debug-type assertion without a detailed message can be constructed by omitting the stream (but leaving the comma):
CELER_VALIDATE(file_stream,);
-
CELER_NOT_CONFIGURED(WHAT)#
Assert if the code point is reached because an optional feature is disabled.
This generally should be used for the constructors of dummy class definitions in, e.g., Foo.nocuda.cc:
Foo::Foo() { CELER_NOT_CONFIGURED("CUDA"); }
-
CELER_NOT_IMPLEMENTED(WHAT)#
Assert if the code point is reached because a feature has yet to be fully implemented.
This placeholder is so that code paths can be “declared but not defined” and implementations safely postponed in a greppable manner. This should not be used to define “unused” overrides for virtual classes. A correct use case would be:
if (z > AtomicNumber{26}) { CELER_NOT_IMPLEMENTED("physics for heavy nuclides"); }
-
CELER_CUDA_CALL(STATEMENT)#
When CUDA support is enabled, execute the wrapped statement and throw a RuntimeError if it fails.
If CUDA is disabled, throw an unconfigured assertion.
If it fails, we call cudaGetLastError to clear the error code. Note that this will not clear the code in a few fatal error cases (kernel assertion failure, invalid memory access) and all subsequent CUDA calls will fail.
CELER_CUDA_CALL(cudaMalloc(&ptr_gpu, 100 * sizeof(float)));
CELER_CUDA_CALL(cudaDeviceSynchronize());
Note
A file that uses this macro must include
corecel/device_runtime_api.h
.
-
CELER_HIP_CALL(STATEMENT)#
When HIP support is enabled, execute the wrapped statement and throw a RuntimeError if it fails.
If HIP is disabled, throw an unconfigured assertion.
If it fails, we call hipGetLastError to clear the error code.
CELER_HIP_CALL(hipMalloc(&ptr_gpu, 100 * sizeof(float)));
CELER_HIP_CALL(hipDeviceSynchronize());
Note
A file that uses this macro must include corecel/device_runtime_api.h. The celeritas_device_runtime_api_h_ declaration enforces this when HIP is disabled.
-
CELER_DEVICE_CALL_PREFIX(STMT)#
Prepend the argument with “cuda” or “hip” and call with the appropriate checking statement as above.
Example:
CELER_DEVICE_CALL_PREFIX(Malloc(&ptr_gpu, 100 * sizeof(float)));
CELER_DEVICE_CALL_PREFIX(DeviceSynchronize());
Note
A file that uses this macro must include corecel/device_runtime_api.h. The celeritas_device_runtime_api_h_ declaration enforces this when CUDA/HIP are disabled.
-
CELER_DEVICE_CHECK_ERROR()#
After a kernel launch or other call, check that no CUDA errors have occurred.
This is also useful for checking success after external CUDA libraries have been called.
-
CELER_MPI_CALL(STATEMENT)#
When MPI support is enabled, execute the wrapped statement and throw a RuntimeError if it fails.
If MPI is disabled, throw an unconfigured assertion.
Note
A file that uses this macro must include
mpi.h
.
-
namespace celeritas
Enums
-
enum class DebugErrorType#
Values:
-
enumerator precondition#
Precondition contract violation.
-
enumerator internal#
Internal assertion check failure.
-
enumerator unreachable#
Internal assertion: unreachable code path.
-
enumerator unconfigured#
Internal assertion: required feature not enabled.
-
enumerator unimplemented#
Internal assertion: not yet implemented.
-
enumerator postcondition#
Postcondition contract violation.
-
enumerator assumption#
“Assume” violation
Functions
-
inline void unreachable()#
Invoke undefined behavior.
-
class DebugError : public std::logic_error#
- #include <Assert.hh>
Error thrown by Celeritas assertions.
Public Functions
-
explicit DebugError(DebugErrorDetails)#
Construct a debug exception from detailed attributes.
-
inline DebugErrorDetails const &details() const#
Access the debug data.
Private Members
-
DebugErrorDetails details_#
-
struct DebugErrorDetails#
- #include <Assert.hh>
Detailed properties of a debug assertion failure.
-
class RichContextException : public std::exception#
- #include <Assert.hh>
Base class for writing arbitrary exception context to JSON.
This can be overridden in higher-level parts of the code for specific needs (e.g., writing thread, event, and track contexts in Celeritas solver kernels). Note that in order for derived classes to work with std::throw_with_nested, they MUST NOT be final.
Subclassed by celeritas::IPAContextException, celeritas::KernelContextException
-
class RuntimeError : public std::runtime_error#
- #include <Assert.hh>
Error thrown by working code from unexpected runtime conditions.
Public Functions
-
explicit RuntimeError(RuntimeErrorDetails)#
Construct a runtime error from detailed descriptions.
-
inline RuntimeErrorDetails const &details() const#
Access detailed information.
Public Static Functions
-
static RuntimeError from_validate(std::string msg, char const *code, char const *file, int line)#
Construct a runtime exception from a validation failure.
-
static RuntimeError from_device_call(char const *error_string, char const *code, char const *file, int line)#
Construct a runtime exception from a CUDA/HIP runtime failure.
-
static RuntimeError from_mpi_call(int errorcode, char const *code, char const *file, int line)#
Construct a message and throw an error from a runtime MPI failure.
-
static RuntimeError from_geant_exception(char const *origin, char const *code, char const *desc)#
Construct an error message from a Geant4 exception.
- Parameters:
origin – Usually the function that throws
code – A computery error code
desc – Description of the failure
-
static RuntimeError from_root_error(char const *origin, char const *msg)#
Construct an error message from a ROOT error.
- Parameters:
origin – Usually the function that throws
msg – Description of the failure
Private Members
-
RuntimeErrorDetails details_#
-
struct RuntimeErrorDetails#
- #include <Assert.hh>
Detailed properties of a runtime error.
Public Members
-
RuntimeErrorType which = {RuntimeErrorType::validate}#
-
std::string what = {}#
-
std::string condition = {}#
-
std::string file = {}#
-
int line = {0}#
Type definitions for common Celeritas functionality.
-
namespace celeritas
Convenience typedefs for params and states.
-
template<template<Ownership, MemSpace> class P>
using HostVal = P<Ownership::value, MemSpace::host># Managed host memory.
-
template<template<Ownership, MemSpace> class P>
using HostCRef = P<Ownership::const_reference, MemSpace::host># Immutable reference to host memory.
-
template<template<Ownership, MemSpace> class S>
using HostRef = S<Ownership::reference, MemSpace::host># Mutable reference to host memory.
-
template<template<Ownership, MemSpace> class P>
using DeviceCRef = P<Ownership::const_reference, MemSpace::device># Immutable reference to device memory.
-
template<template<Ownership, MemSpace> class S>
using DeviceRef = S<Ownership::reference, MemSpace::device># Mutable reference to device memory.
-
template<template<Ownership, MemSpace> class P>
using NativeCRef = P<Ownership::const_reference, MemSpace::native># Immutable reference to native memory.
-
template<template<Ownership, MemSpace> class S>
using NativeRef = S<Ownership::reference, MemSpace::native># Mutable reference to native memory.
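The aliases above take a class template parameterized on ownership and memory space. A minimal sketch of how such an alias expands, using simplified stand-in enumerations and a hypothetical FooParamsData class (none of these definitions are copied from Celeritas):

```cpp
#include <cassert>
#include <type_traits>

// Simplified stand-ins for the enumerations used by the aliases.
enum class Ownership { value, reference, const_reference };
enum class MemSpace { host, device };

// Hypothetical data class templated on ownership and memory space.
template<Ownership W, MemSpace M>
struct FooParamsData
{
    static constexpr Ownership ownership = W;
    static constexpr MemSpace memspace = M;
};

// Aliases written in the same form as the listing above
template<template<Ownership, MemSpace> class P>
using HostVal = P<Ownership::value, MemSpace::host>;
template<template<Ownership, MemSpace> class P>
using DeviceCRef = P<Ownership::const_reference, MemSpace::device>;

// The alias is shorthand for the fully spelled-out instantiation
static_assert(std::is_same<HostVal<FooParamsData>,
                           FooParamsData<Ownership::value, MemSpace::host>>::value,
              "HostVal selects owning host data");
static_assert(DeviceCRef<FooParamsData>::memspace == MemSpace::device,
              "DeviceCRef selects device memory");
```

The alias names the role of the data (owned, mutable view, or read-only view) without repeating both enumerators at every use site.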
Typedefs
-
using size_type = unsigned int#
Standard type for container sizes, optimized for GPU use.
-
using real_type = double#
Numerical type for real numbers.
-
using ull_int = unsigned long long int#
Equivalent to std::size_t but compatible with CUDA atomics.
-
template<class ValueT, class SizeT = ::celeritas::size_type>
class OpaqueId# Type-safe index for accessing an array.
This allows type-safe, read-only indexing/access for a class. The value is ‘true’ if it’s assigned, ‘false’ if invalid.
- Template Parameters:
ValueT – Type of each item in the array.
SizeT – Integer index
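The idea of a type-safe index can be sketched as follows. SketchId, the tag types, and the density function are hypothetical, and the choice of the largest integer as the invalid sentinel is an assumption for this sketch rather than a statement about OpaqueId.

```cpp
#include <cassert>
#include <vector>

// Minimal stand-in for illustration; the real class has more functionality.
template<class ValueT, class SizeT = unsigned int>
class SketchId
{
  public:
    // Default-constructed IDs are invalid (sentinel value is an assumption)
    SketchId() : value_(static_cast<SizeT>(-1)) {}
    explicit SketchId(SizeT index) : value_(index) {}

    // True if assigned, false if invalid
    explicit operator bool() const { return value_ != static_cast<SizeT>(-1); }

    // Index into the array
    SizeT get() const { return value_; }

  private:
    SizeT value_;
};

// Hypothetical tag types: the resulting IDs are distinct, unrelated types
struct Material;
struct Particle;
using MaterialId = SketchId<Material>;
using ParticleId = SketchId<Particle>;

// Look up a density by material ID; passing a ParticleId would not compile.
inline double density(std::vector<double> const& densities, MaterialId id)
{
    assert(static_cast<bool>(id));
    return densities[id.get()];
}
```

Because the constructor is explicit and the tag differs per ID type, a raw integer or an ID for another array cannot be passed by accident.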
System#
-
namespace celeritas
-
class Device#
- #include <Device.hh>
Manage attributes of the GPU.
CUDA/HIP translation table:
CUDA/NVIDIA | HIP/AMD | Description
thread | work item | individual local work element
warp | wavefront | “vectorized thread” operating in lockstep
block | workgroup | group of threads able to sync
multiprocessor | compute unit | hardware executing one or more blocks
multiprocessor | execution unit | hardware executing one or more warps
Each block/workgroup operates on the same hardware (compute unit) until completion. Similarly, a warp/wavefront is tied to a single execution unit. Each compute unit can execute one or more blocks: the higher the number of blocks resident, the more latency can be hidden.
Warning
The current multithreading/multiprocess model is intended to have one GPU serving multiple CPU threads simultaneously, and one MPI process per GPU. The active CUDA device is a static thread-local property but global_device is global. CUDA needs to be activated using activate_device or activate_device_local on every thread, using the same device ID.
Type aliases
-
using MapStrInt = std::map<std::string, int>#
Public Functions
-
Device() = default#
-
explicit Device(int id)#
Construct from a device ID.
-
inline int device_id() const#
Get the CUDA device ID, if active.
-
inline explicit operator bool() const#
True if device is initialized.
-
inline std::size_t total_global_mem() const#
Total memory capacity (bytes)
-
inline int max_threads_per_block() const#
Maximum number of threads per block (for launch limits)
-
inline int max_blocks_per_grid() const#
Maximum number of blocks per grid (for launch limits)
-
inline int max_threads_per_cu() const#
Maximum number of concurrent threads per compute unit (for occupancy)
-
inline unsigned int threads_per_warp() const#
Number of threads per warp.
-
inline bool can_map_host_memory() const#
Whether the device supports mapped pinned memory.
-
inline unsigned int eu_per_cu() const#
Number of execution units per compute unit (1 for NVIDIA, 4 for AMD)
-
StreamId::size_type num_streams() const#
Number of streams allocated.
-
void create_streams(unsigned int num_streams) const#
Allocate the given number of streams.
If no streams have been created, the default stream will be used.
-
Stream &stream(StreamId) const#
Access a stream.
This returns the default stream if no streams were allocated.
Public Static Functions
-
static int num_devices()#
Get the number of available devices.
This is nonzero if and only if CUDA support is built-in, if at least one CUDA-capable device is present, and if the
CELER_DISABLE_DEVICE
environment variable is not set.
-
static bool debug()#
Whether verbose messages and error checking are enabled.
This is true if CELERITAS_DEBUG is set or if the CELER_DEBUG_DEVICE environment variable exists and is not empty.
Private Types
-
using UPStreamStorage = std::unique_ptr<detail::StreamStorage, StreamStorageDeleter>#
Private Members
-
int id_ = {-1}#
-
std::string name_ = {"<DISABLED>"}#
-
std::size_t total_global_mem_ = {}#
-
int max_threads_per_block_ = {}#
-
int max_blocks_per_grid_ = {}#
-
int max_threads_per_cu_ = {}#
-
unsigned int threads_per_warp_ = {}#
-
bool can_map_host_memory_ = {}#
-
unsigned int eu_per_cu_ = {}#
-
UPStreamStorage streams_#
-
struct StreamStorageDeleter#
-
namespace detail#
Containers#
-
template<class T, ::celeritas::size_type N>
struct Array# Fixed-size simple array for storage.
This isn’t fully standards-compliant with std::array: there’s no support for N=0 for example. Additionally it uses the native celeritas size_type, even though this has no effect on generated code for values of N inside the range of size_type.
Note
For supplementary functionality, include:
corecel/math/ArrayUtils.hh for real-number vector/matrix applications
corecel/math/ArrayOperators.hh for mathematical operators
ArrayIO.hh for streaming and string conversion
ArrayIO.json.hh for JSON input and output
Type aliases
-
using pointer = value_type*#
-
using const_pointer = value_type const*#
-
using reference = value_type&#
-
using const_reference = value_type const&#
-
using const_iterator = const_pointer#
Element access
-
inline const_reference operator[](size_type i) const#
-
inline const_reference front() const#
-
inline const_reference back() const#
-
inline const_pointer data() const#
Iterators
-
inline const_iterator begin() const#
-
inline const_iterator end() const#
-
inline const_iterator cbegin() const#
-
inline const_iterator cend() const#
-
template<class T, std::size_t Extent = dynamic_extent>
class Span# Modified backport of C++20 span.
Like the celeritas::Array, this class isn’t 100% compatible with the std::span class (partly of course because language features are missing from C++14). The hope is that it will be complete and correct for the use cases needed by Celeritas (and, as a bonus, it will be device-compatible).
Notably, only a subset of the functions (those having to do with size) are constexpr. This is to allow debug assertions.
Span can be instantiated with the special marker type LdgValue<T> to optimize reading constant data on device memory. In that case, data returned by front, back, operator[] and begin/end iterator use value semantics instead of reference. data still returns a pointer to the data and can be used to bypass using LdgIterator.
- Template Parameters:
T – value type
Extent – fixed size; defaults to dynamic.
Math, numerics, and algorithms#
Defines
-
CELER_WRAP_MATH_FLOAT_DBL_1(PREFIX, FUNC)#
Generate overloads for a single-argument math function.
-
CELER_WRAP_MATH_FLOAT_DBL_PTR_2(PREFIX, FUNC)#
-
namespace celeritas
Unnamed Group
-
void sincos(float a, float *s, float *c)#
-
void sincos(double a, double *s, double *c)#
-
void sincospi(float a, float *s, float *c)#
-
void sincospi(double a, double *s, double *c)#
Functions
-
template<class T>
T &&forward(typename std::remove_reference<T>::type &v) noexcept# Implement perfect forwarding with device-friendly functions.
-
template<class T>
auto move(T &&v) noexcept -> typename std::remove_reference<T>::type&&# Cast a value as an rvalue reference to allow move construction.
-
template<class InputIt, class Predicate>
inline bool all_of(InputIt iter, InputIt last, Predicate p)# Whether the predicate is true for all items.
-
template<class InputIt, class Predicate>
inline bool any_of(InputIt iter, InputIt last, Predicate p)# Whether the predicate is true for any item.
-
template<class T>
inline T const &clamp(T const &v, T const &lo, T const &hi)# Clamp the value between lo and hi values.
If the value is between lo and hi, return the value. Otherwise, return lo if it’s below it, or hi above it.
This replaces:
min(hi, max(lo, v))
or
max(lo, min(hi, v))
assuming that the relationship between lo and hi holds.
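A host-only sketch of the clamping logic, written without the CELER_* decorations. The name sketch_clamp is hypothetical, and the body is one straightforward way to satisfy the description, not necessarily the Celeritas implementation.

```cpp
#include <cassert>

// Sketch of clamping a value to the closed interval [lo, hi].
template<class T>
inline T const& sketch_clamp(T const& v, T const& lo, T const& hi)
{
    assert(!(hi < lo));  // The relationship between lo and hi must hold
    return v < lo ? lo : (hi < v ? hi : v);
}
```

A single call states the intent directly, where the nested min/max form requires the reader to check the argument order.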
-
template<class T>
T clamp_to_nonneg(T v) noexcept# Return the value or (if it’s negative) then zero.
This is constructed to correctly propagate
NaN
.
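One way to obtain the NaN-propagating behavior is to write the comparison so that a NaN input falls through to the unmodified branch. The name sketch_clamp_to_nonneg is hypothetical and the body is an assumption about how this could be done.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Sketch: (NaN < 0) evaluates to false, so a NaN input is returned
// unchanged instead of being replaced by zero.
template<class T>
inline T sketch_clamp_to_nonneg(T v) noexcept
{
    return (v < T(0)) ? T(0) : v;
}
```

By contrast, std::max(0.0, v) returns 0.0 when v is NaN, because its comparison (0.0 < v) is false and it then returns the first argument.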
-
template<class ForwardIt, class T, class Compare>
ForwardIt lower_bound(ForwardIt first, ForwardIt last, T const &value, Compare comp)# Find the insertion point for a value in a sorted list using a binary search.
-
template<class ForwardIt, class T>
ForwardIt lower_bound(ForwardIt first, ForwardIt last, T const &value)# Find the insertion point for a value in a sorted list using a binary search.
-
template<class ForwardIt, class T, class Compare>
ForwardIt lower_bound_linear(ForwardIt first, ForwardIt last, T const &value, Compare comp)# Find the insertion point for a value in a sorted list using a linear search.
-
template<class ForwardIt, class T>
ForwardIt lower_bound_linear(ForwardIt first, ForwardIt last, T const &value)# Find the insertion point for a value in a sorted list using a linear search.
-
template<class ForwardIt, class T, class Compare>
ForwardIt upper_bound(ForwardIt first, ForwardIt last, T const &value, Compare comp)# Find the first element which is greater than the given value.
-
template<class ForwardIt, class T>
ForwardIt upper_bound(ForwardIt first, ForwardIt last, T const &value)# Find the first element which is greater than the given value.
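The binary and linear lower-bound searches return the same insertion point and differ only in cost. A host-only sketch, assuming semantics that mirror std::lower_bound; the sketch_* functions and find_bin are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Sketch of a binary-search lower bound on random-access iterators
template<class It, class T>
inline It sketch_lower_bound(It first, It last, T const& value)
{
    auto count = last - first;
    while (count > 0)
    {
        auto half = count / 2;
        It mid = first + half;
        if (*mid < value)
        {
            // Insertion point is to the right of mid
            first = mid + 1;
            count -= half + 1;
        }
        else
        {
            count = half;
        }
    }
    return first;
}

// Linear-search variant returning the same position
template<class It, class T>
inline It sketch_lower_bound_linear(It first, It last, T const& value)
{
    while (first != last && *first < value)
    {
        ++first;
    }
    return first;
}

// Index of the insertion point for a value in a sorted grid
inline int find_bin(std::vector<double> const& grid, double value)
{
    auto iter = sketch_lower_bound(grid.begin(), grid.end(), value);
    // Both variants agree with each other and with the standard library
    assert(iter == sketch_lower_bound_linear(grid.begin(), grid.end(), value));
    assert(iter == std::lower_bound(grid.begin(), grid.end(), value));
    return static_cast<int>(iter - grid.begin());
}
```

The linear variant can be the better choice for very short ranges, where the branch-heavy binary search has little to gain.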
-
template<class ForwardIt, class Predicate>
ForwardIt partition(ForwardIt first, ForwardIt last, Predicate pred)# Partition elements in the given range, “true” before “false”.
This is done by swapping elements until the range is partitioned.
-
template<class RandomAccessIt, class Compare>
void sort(RandomAccessIt first, RandomAccessIt last, Compare comp)# Sort an array on a single thread.
This implementation is not thread-safe nor cooperative, but it can be called from CUDA code.
-
template<class RandomAccessIt>
void sort(RandomAccessIt first, RandomAccessIt last)# Sort an array on a single thread.
-
template<class T>
T const &max(T const &a, T const &b) noexcept# Return the higher of two values.
This function is specialized when building CUDA device code, which has special intrinsics for max.
-
template<class T>
T const &min(T const &a, T const &b) noexcept# Return the lower of two values.
This function is specialized when building CUDA device code, which has special intrinsics for min.
-
template<class ForwardIt, class Compare>
inline ForwardIt min_element(ForwardIt iter, ForwardIt last, Compare comp)# Return an iterator to the lowest value in the range as defined by Compare.
-
template<class ForwardIt>
ForwardIt min_element(ForwardIt first, ForwardIt last)# Return an iterator to the lowest value in the range.
-
template<unsigned int N, class T>
T ipow(T v) noexcept# Return an integer power of the input value.
Example:
assert(9.0 == ipow<2>(3.0));
assert(256 == ipow<8>(2));
static_assert(256 == ipow<8>(2));
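A compile-time integer power can be sketched with exponentiation by squaring. The name sketch_ipow is hypothetical, C++14 constexpr rules are assumed, and the actual implementation may be structured differently.

```cpp
#include <cassert>

// Sketch using exponentiation by squaring: O(log N) multiplications.
template<unsigned int N, class T>
constexpr T sketch_ipow(T v) noexcept
{
    T result = 1;
    unsigned int n = N;
    while (n > 0)
    {
        if (n % 2 == 1)
        {
            // Current bit of the exponent is set: multiply in this power
            result *= v;
        }
        v *= v;
        n /= 2;
    }
    return result;
}

static_assert(sketch_ipow<8>(2) == 256, "2^8");
static_assert(sketch_ipow<0>(5) == 1, "x^0");
```

Making the exponent a template parameter lets the compiler unroll the loop completely for each power that is used.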
-
template<class T, typename = std::enable_if_t<std::is_floating_point<T>::value>>
inline T fastpow(T a, T b)# Raise a number to a power with simplifying assumptions.
This should be faster than std::pow because we don’t worry about exceptions for zeros, infinities, or negative values for a.
Example:
assert(9.0 == fastpow(3.0, 2.0));
-
inline double rsqrt(double value)#
Calculate an inverse square root.
-
inline float rsqrt(float value)#
Calculate an inverse square root.
Variables
-
template<class T = void>
struct Less# - #include <Algorithms.hh>
Evaluator for the first argument being less than the second.
-
template<>
struct Less<void># - #include <Algorithms.hh>
Specialization of less with template deduction.
Math functions using celeritas::Array.
-
namespace celeritas
Functions
-
template<class T, size_type N>
inline void axpy(T a, Array<T, N> const &x, Array<T, N> *y)# Increment a vector by another vector multiplied by a scalar.
-
template<class T, size_type N>
inline T dot_product(Array<T, N> const &x, Array<T, N> const &y)# Dot product of two vectors.
-
template<class T>
inline Array<T, 3> cross_product(Array<T, 3> const &x, Array<T, 3> const &y)# Cross product of two space vectors.
-
template<class T, size_type N>
inline T norm(Array<T, N> const &v)# Calculate the Euclidean (2) norm of a vector.
-
template<class T, size_type N>
inline Array<T, N> make_unit_vector(Array<T, N> const &v)# Construct a unit vector.
Unit vectors have a Euclidean norm magnitude of 1.
-
template<class T, size_type N>
inline T distance(Array<T, N> const &x, Array<T, N> const &y)# Calculate the Euclidean (2) distance between two points.
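The vector helpers above can be sketched on the host with std::array standing in for celeritas::Array. All sketch_* names and demo_move are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

// Increment y by a * x (std::array stands in for celeritas::Array)
template<class T, std::size_t N>
inline void sketch_axpy(T a, std::array<T, N> const& x, std::array<T, N>* y)
{
    for (std::size_t i = 0; i != N; ++i)
    {
        (*y)[i] += a * x[i];
    }
}

// Sum of the elementwise products
template<class T, std::size_t N>
inline T sketch_dot(std::array<T, N> const& x, std::array<T, N> const& y)
{
    T result = 0;
    for (std::size_t i = 0; i != N; ++i)
    {
        result += x[i] * y[i];
    }
    return result;
}

// Euclidean norm: square root of the vector dotted with itself
template<class T, std::size_t N>
inline T sketch_norm(std::array<T, N> const& v)
{
    return std::sqrt(sketch_dot(v, v));
}

// Move a point along a direction: pos <- pos + step * dir
inline std::array<double, 3> demo_move()
{
    std::array<double, 3> pos{{1, 2, 3}};
    std::array<double, 3> const dir{{0, 0, 1}};
    sketch_axpy(2.0, dir, &pos);
    return pos;
}
```

Passing the output by pointer, as in the axpy signature above, makes the modified argument visible at the call site.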
-
template<class T>
inline Array<T, 3> from_spherical(T costheta, T phi)# Calculate a Cartesian vector from spherical coordinates.
Theta is the angle between the Z axis and the outgoing vector, and phi is the angle between the x axis and the projection of the vector onto the x-y plane.
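Given the cosine of the polar angle, the Cartesian components follow from sin(theta) = sqrt(1 - cos^2(theta)). A host-only sketch with std::array standing in for celeritas::Array; the name sketch_from_spherical is hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Sketch: build a unit direction from cos(theta) and phi.
inline std::array<double, 3> sketch_from_spherical(double costheta, double phi)
{
    assert(costheta >= -1 && costheta <= 1);
    // sin(theta) is nonnegative for theta in [0, pi]
    double const sintheta = std::sqrt(1 - costheta * costheta);
    return {{sintheta * std::cos(phi), sintheta * std::sin(phi), costheta}};
}
```

Taking the cosine as input avoids a trigonometric call when the polar angle is sampled directly as a cosine.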
-
template<class T>
inline Array<T, 3> rotate(Array<T, 3> const &dir, Array<T, 3> const &rot)# Rotate the direction about the given Z-based scatter direction.
The equivalent to calling the Shift transport code’s
void cartesian_vector_transform(double costheta, double phi, Vector_View vector);
is the call
vector = rotate(from_spherical(costheta, phi), vector);
This code effectively decomposes the given rotation vector rot into two sequential transform matrices, one with an angle theta about the y axis and one about phi rotating around the z axis. These two angles are the spherical coordinate transform of the given rot cartesian direction vector.
There is some extra code in here to deal with loss of precision when the incident direction is along the z axis. As rot approaches z, the azimuthal angle phi must be calculated carefully from both the x and y components of the vector, not independently. If rot actually equals z then the azimuthal angle is completely indeterminate so we arbitrarily choose phi = 0.
This function is often used for calculating exiting scattering angles. In that case, dir is the exiting angle from the scattering calculation, and rot is the original direction of the particle. The direction vectors are defined as
\[ \Omega = \sin\theta\cos\phi\mathbf{i} + \sin\theta\sin\phi\mathbf{j} + \cos\theta\mathbf{k} \,. \]
-
template<class T, size_type N>
inline bool is_soft_unit_vector(Array<T, N> const &v)# Test for being approximately a unit vector.
Consider a unit vector v with a small perturbation along a unit vector e:
\[ \vec v + \epsilon \vec e \]
The squared magnitude of the perturbed vector is
\[ m^2 = (v + \epsilon e) \cdot (v + \epsilon e) = v \cdot v + 2 \epsilon v \cdot e + \epsilon^2 e \cdot e = 1 + 2 \epsilon v \cdot e + \epsilon^2 \]
Since
\[ |v \cdot e| \le |v||e| = 1 \]
the squared magnitude is bounded by
\[ m^2 = 1 \pm 2 \epsilon + \epsilon^2 \]
Instead of calculating the square of the tolerance we loosely bound with another epsilon.
Example:
CELER_EXPECT(is_soft_unit_vector(v));
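One reading of the loose bound is that 2 * eps + eps^2 <= 3 * eps whenever eps <= 1, so the squared magnitude can be compared against 1 with a tolerance of 3 * eps. The sketch below follows that reading; the function name, the default tolerance, and the factor of 3 are assumptions, not the documented implementation.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Sketch only: tolerance value and factor of 3 are assumptions.
// For eps <= 1, 2 * eps + eps^2 <= 3 * eps, which bounds |m^2 - 1|.
inline bool sketch_is_soft_unit_vector(std::array<double, 3> const& v,
                                       double eps = 1e-12)
{
    double const mag_sq = v[0] * v[0] + v[1] * v[1] + v[2] * v[2];
    return std::fabs(mag_sq - 1) <= 3 * eps;
}
```

Testing the squared magnitude avoids a square root, which matters when the check runs inside an assertion on every call.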
Atomics for use in kernel code (CUDA/HIP/OpenMP).
-
namespace celeritas
-
template<class Numeric>
struct numeric_limits# Subset of numeric limits compatible with both host and device.
Note
CUDART_NAN and CUDART_INF are not constexpr in CUDA 10 at least, so we have replaced those with compiler built-ins that work in GCC, Clang, and MSVC.
Unnamed Group
-
CELER_DEFINE_QACCESS(FUNC, QUAL)#
Access the underlying numeric value, discarding units
-
namespace celeritas
Functions
-
auto zero_quantity() noexcept#
Get a zero quantity (analogous to nullptr).
-
auto max_quantity() noexcept#
Get a quantity greater than any other numeric quantity.
-
auto neg_max_quantity() noexcept#
Get a quantity less than any other numeric quantity.
-
template<class U, class V>
void swap(Quantity<U, V> &a, Quantity<U, V> &b) noexcept# Swap two Quantities.
-
template<class UnitT, class ValueT>
auto native_value_from(Quantity<UnitT, ValueT> quant) noexcept -> decltype(auto)# Convert the given quantity into the native Celeritas unit system.
assert(native_value_from(Quantity<CLight>{1}) == 2.998e10 * centimeter/second);
-
template<class Q>
Q native_value_to(typename Q::value_type value) noexcept# Create a quantity from a value in the Celeritas unit system.
This function can be used for defining a constant for use in another unit system (typically a “natural” unit system for use in physics kernels).
constexpr LightSpeed c = native_value_to<LightSpeed>(constants::c_light);
assert(c.value() == 1);
-
template<class Q, class SrcUnitT, class ValueT>
auto value_as(Quantity<SrcUnitT, ValueT> quant) noexcept -> ValueT# Use the value of a Quantity.
The redundant unit type in the function signature is to make coupling safer across different parts of the code and to make the user code more readable.
assert(value_as<LightSpeed>(LightSpeed{1}) == 1);
-
template<class T>
inline char const *accessor_unit_label()# Get the label for a unit returned from a class accessor.
Example:
cout << accessor_unit_label<&ParticleView::mass>() << endl;
-
template<class UnitT, class ValueT = decltype(UnitT::value())>
class Quantity# - #include <Quantity.hh>
A numerical value tagged with a unit.
A quantity is a value expressed in terms of the given unit. Storing values in a different unit system can help with some calculations (e.g. operating in natural unit systems) by avoiding numerical multiplications and divisions by large constants. It can also make debugging easier (numeric values are obvious).
Example usage by physics class, where charge is in units of q_e+, and mass and momentum are expressed in atomic natural units (where m_e = 1 and c = 1).
using MevEnergy = Quantity<Mev>; using MevMass = Quantity<UnitDivide<Mev, CLightSq>>; using MevMomentum = Quantity<UnitDivide<Mev, CLight>>;
A relativistic equation that operates on these quantities can do so without unnecessary floating point operations involving the speed of light:
The resulting quantity can be converted to the native Celeritas unit system withreal_type eval = value_as<MevEnergy>(energy); // Natural units MevMomentum momentum{std::sqrt(eval * eval + 2 * value_as<MevMass>(mass) * eval)};
native_value_from
, which multiplies in the constant value of ElMomentumUnit:real_type mom = native_value_from(momentum);
When using a Quantity from another part of the code, e.g. an imported unit system, use the quantity free function rather than .value() in order to guarantee consistency of units between source and destination.
An example unit class would be:
struct DozenUnit
{
    static constexpr int value() { return 12; }
    static constexpr char const* label() { return "dozen"; }
};
The label is used solely for outputting to JSON.
Note
The Quantity is designed to be a simple “strong type” class, not a complex mathematical class. To operate on quantities, you must use value_as (to operate within the Quantity’s unit system) or native_value_from (to operate in the Celeritas native unit system), use the resulting numeric values in your mathematical expressions, then return a new Quantity class with the resulting value and correct type.
- Template Parameters:
UnitT – unit tag class
ValueT – value type
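The strong-type pattern described above can be sketched in a few lines. This is a simplified stand-in, not the Celeritas implementation: the class and free functions below only mirror the documented names, they omit host/device decoration, and the `count_items` helper is hypothetical. The sketch assumes that `native_value_from` multiplies by the unit's constant and that `native_value_to` divides by it, as the description suggests.

```cpp
#include <cassert>
#include <type_traits>

// Simplified stand-in for the documented Quantity class (assumption: the
// stored number is expressed in the unit given by UnitT).
template<class UnitT, class ValueT = decltype(UnitT::value())>
class Quantity
{
  public:
    using unit_type = UnitT;
    using value_type = ValueT;

    constexpr Quantity() = default;
    constexpr explicit Quantity(value_type value) noexcept : value_(value) {}
    constexpr value_type const& value() const noexcept { return value_; }

  private:
    value_type value_{};
};

// Return the stored number; naming Q makes a unit mismatch a compile error
template<class Q, class SrcUnitT, class ValueT>
constexpr ValueT value_as(Quantity<SrcUnitT, ValueT> quant) noexcept
{
    static_assert(std::is_same_v<typename Q::unit_type, SrcUnitT>,
                  "quantity units do not match");
    return quant.value();
}

// Convert to the native system by multiplying in the unit's constant
template<class UnitT, class ValueT>
constexpr auto native_value_from(Quantity<UnitT, ValueT> quant) noexcept
{
    return quant.value() * UnitT::value();
}

// Convert from the native system by dividing out the unit's constant
template<class Q>
constexpr Q native_value_to(typename Q::value_type native) noexcept
{
    return Q{native / Q::unit_type::value()};
}

// Unit class taken from the example in the text
struct DozenUnit
{
    static constexpr int value() { return 12; }
    static constexpr char const* label() { return "dozen"; }
};
using Dozens = Quantity<DozenUnit>;

// Hypothetical helper: number of individual items in a count of dozens
constexpr int count_items(Dozens d)
{
    return native_value_from(d);
}
```

In this sketch `value_as<Dozens>(Dozens{3})` yields the stored number 3, while `native_value_from(Dozens{3})` yields the item count 36. Arithmetic is done on the plain numbers and the result is wrapped in a new Quantity afterwards.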
Unnamed Group
-
inline value_type &value() & noexcept#
-
inline value_type const &value() const & noexcept#
Public Functions
-
constexpr Quantity() = default#
Construct with default (zero)
-
inline explicit Quantity(value_type value) noexcept#
Construct with a value expressed in the Quantity’s own unit (the value is stored as given, not converted from Celeritas native units).
Private Members
-
value_type value_ = {}#
-
namespace detail
Variables
-
template<class T>
constexpr bool is_quantity_v = IsQuantity<T>::value#
-
template<class T>
-
auto zero_quantity() noexcept#
-
namespace celeritas
Functions
-
real_type sqrt_tol()#
Square root of the soft equivalence tolerance for Celeritas.
This tolerance is needed for operations where the accuracy is limited by the square root of machine precision.
- Todo:
Move orange tolerance and related operations into corecel/math alongside this, revisit ArrayUtils soft comparisons.
-
template<class RealType>
inline bool soft_equal(RealType expected, RealType actual)#
Soft equivalence with default tolerance.
-
template<class RealType>
inline bool soft_near(RealType expected, RealType actual, RealType rel_error)#
Soft equivalence with relative error.
-
template<class F>
class EqualOr : public F#
- #include <SoftEqual.hh>
Compare for equality before checking with the given functor.
This CRTP class allows SoftEqual to work for infinities.
-
template<class RealType = ::celeritas::real_type>
class SoftEqual#
- #include <SoftEqual.hh>
Functor for noninfinite floating point equality.
This function-like class considers an absolute tolerance for values near zero, and a relative tolerance for values far from zero. It correctly returns “false” if either value being compared is NaN. The call operator is commutative: eq(a,b) should always give the same as eq(b,a).
The actual comparison is:
\[ |a - b| < \max(\epsilon_r \max(|a|, |b|), \epsilon_a) \]
Note
The edge case where both values are infinite (with the same sign) returns false for equality, which could be considered reasonable because relative error is meaningless. To explicitly allow infinities to compare equal, you must test separately, e.g., a == b || soft_eq(a, b).
Public Functions
-
SoftEqual()#
Construct with default relative/absolute precision.
-
explicit SoftEqual(value_type rel)#
Construct with scaled absolute precision.
-
SoftEqual(value_type rel, value_type abs)#
Construct with both relative and absolute precision.
- Parameters:
rel – tolerance of relative error (default 1.0e-12 for doubles)
abs – threshold for absolute error when comparing small quantities (default 1.0e-14 for doubles)
-
bool operator()(value_type a, value_type b) const#
Compare two values, implicitly casting arguments.
-
inline value_type rel() const#
Relative allowable error.
-
inline value_type abs() const#
Absolute tolerance.
Private Types
-
using SETraits = detail::SoftEqualTraits<value_type>#
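The comparison rule and the infinity edge case can be illustrated with a minimal sketch. The class names `SoftEqualSketch` and `EqualOrSketch` are hypothetical stand-ins, restricted to `double` and using the default tolerances quoted above. They are not the library's implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical stand-in implementing
//   |a - b| < max(rel * max(|a|, |b|), abs)
class SoftEqualSketch
{
  public:
    SoftEqualSketch() = default;
    SoftEqualSketch(double rel, double abs) : rel_(rel), abs_(abs) {}

    bool operator()(double a, double b) const
    {
        double const tol
            = std::max(rel_ * std::max(std::fabs(a), std::fabs(b)), abs_);
        // Any NaN operand, and inf - inf, makes this comparison false
        return std::fabs(a - b) < tol;
    }

  private:
    double rel_{1.0e-12};  // documented default for doubles
    double abs_{1.0e-14};  // documented default for doubles
};

// Hypothetical wrapper: try exact equality first, then defer to F
template<class F>
class EqualOrSketch : public F
{
  public:
    EqualOrSketch() = default;
    explicit EqualOrSketch(F f) : F(f) {}

    bool operator()(double a, double b) const
    {
        return a == b || F::operator()(a, b);
    }
};
```

With these definitions, two same-signed infinities compare unequal under `SoftEqualSketch` but equal under `EqualOrSketch<SoftEqualSketch>`, which matches the `a == b || soft_eq(a, b)` idiom in the note.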
-
template<class RealType = ::celeritas::real_type>
class SoftZero#
- #include <SoftEqual.hh>
Functor for soft comparison of a floating point value against zero.
Public Functions
-
SoftZero()#
Construct with default absolute precision.
-
explicit SoftZero(value_type abs)#
Construct with specified precision.
- Parameters:
abs – threshold for absolute error (default 1.0e-14 for doubles)
-
inline bool operator()(value_type actual) const#
See if the value is within absolute tolerance of zero.
- Parameters:
actual – scalar floating point value
-
inline value_type abs() const#
Absolute tolerance.
Private Types
-
using SETraits = detail::SoftEqualTraits<value_type>#
Private Members
-
value_type abs_#
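A minimal sketch of the zero check follows. `SoftZeroSketch` is a hypothetical stand-in restricted to `double`. The strict `<` comparison is an assumption chosen to match the SoftEqual formula above, since the text only says "within absolute tolerance".

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for the documented functor
class SoftZeroSketch
{
  public:
    SoftZeroSketch() = default;
    explicit SoftZeroSketch(double abs) : abs_(abs) {}

    // True if the magnitude is below the absolute tolerance (false for NaN)
    bool operator()(double actual) const { return std::fabs(actual) < abs_; }

    double abs() const { return abs_; }

  private:
    double abs_{1.0e-14};  // documented default for doubles
};
```

A looser tolerance can be supplied at construction, e.g. `SoftZeroSketch{1e-6}`, when the value being tested comes from a calculation with larger accumulated error.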
I/O#
Defines
-
CELER_CODE_PROVENANCE#
Inject the source code provenance (current file and line)
-
CELER_LOG(LEVEL)#
Return a LogMessage object for streaming into at the given level.
The regular CELER_LOG call is for code paths that happen uniformly in parallel.
The logger will only format and print messages. It is not responsible for cleaning up the state or exiting an app.
CELER_LOG(debug) << "Don't print this in general";
CELER_LOG(warning) << "You may want to reconsider your life choices";
CELER_LOG(critical) << "Caught a fatal exception: " << e.what();
-
CELER_LOG_LOCAL(LEVEL)#
Like CELER_LOG but for code paths that may only happen on a single process.
Use sparingly.
-
namespace celeritas
-
class Logger#
- #include <Logger.hh>
Manage logging in serial and parallel.
This should generally be called by the world_logger and self_logger functions below. The call operator() returns an object that should be streamed into in order to create a log message.
This object is assignable, so to replace the default log handler with a different one, you can call
world_logger = Logger(MpiCommunicator::comm_world(), my_handler);
Public Functions
-
explicit Logger(LogHandler handle)#
Construct with default communicator and handler.
-
Logger(MpiCommunicator const &comm, LogHandler handle)#
Construct with communicator (only rank zero is active) and handler.
-
inline Message operator()(LogProvenance &&prov, LogLevel lev)#
Create a logger that flushes its contents when it destructs.
It’s assumed that log messages will be relatively unlikely (and expensive anyway), so we mark as CELER_UNLIKELY to optimize for the no-logging case.
-
inline void level(LogLevel lev)#
Set the minimum logging verbosity.
-
inline LogLevel level() const#
Get the current logging verbosity.
Public Static Functions
-
static inline constexpr LogLevel default_level()#
Get the default log level.
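The "flush on destruction" behavior of the call operator can be sketched as follows. Every name ending in `Sketch` is hypothetical, the handler signature is an assumption, and provenance, MPI rank filtering, and the `CELER_UNLIKELY` hint are left out. The message object buffers streamed values and hands the finished string to the handler when it goes out of scope.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <sstream>
#include <string>
#include <utility>

enum class LevelSketch { debug, info, warning, error };
using HandlerSketch = std::function<void(LevelSketch, std::string)>;

// Buffers streamed values; flushes to the handler in its destructor
class MessageSketch
{
  public:
    MessageSketch(HandlerSketch const* handle, LevelSketch lev)
        : handle_(handle), lev_(lev)
    {
        // Only allocate a buffer if the message will actually be printed
        if (handle_) { os_ = std::make_unique<std::ostringstream>(); }
    }
    // A moved-from message has a null buffer, so it never flushes
    MessageSketch(MessageSketch&&) = default;
    ~MessageSketch()
    {
        if (os_) { (*handle_)(lev_, os_->str()); }
    }

    template<class T>
    MessageSketch& operator<<(T const& value)
    {
        if (os_) { *os_ << value; }
        return *this;
    }

  private:
    HandlerSketch const* handle_;
    LevelSketch lev_;
    std::unique_ptr<std::ostringstream> os_;
};

class LoggerSketch
{
  public:
    explicit LoggerSketch(HandlerSketch handle) : handle_(std::move(handle)) {}

    // Messages below the minimum level get no handler and are discarded
    MessageSketch operator()(LevelSketch lev)
    {
        bool const active = handle_ && lev >= min_level_;
        return MessageSketch{active ? &handle_ : nullptr, lev};
    }

    void level(LevelSketch lev) { min_level_ = lev; }
    LevelSketch level() const { return min_level_; }

  private:
    HandlerSketch handle_;
    LevelSketch min_level_{LevelSketch::info};
};
```

Because the temporary message is destroyed at the end of the full expression, a statement such as `log(LevelSketch::warning) << "x = " << 3;` delivers the complete string to the handler before the next statement runs.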
-
class OutputInterface#
Pure abstract interface for writing metadata output to JSON.
At the end of the program/run, the OutputRegistry will call the “output” method on all interfaces.
- Todo:
Perhaps another output method for saving a schema?
Subclassed by celeritas::ActionDiagnostic, celeritas::ActionRegistryOutput, celeritas::BuildOutput, celeritas::ExceptionOutput, celeritas::GeantSimpleCalo, celeritas::GeantStepDiagnostic, celeritas::GeoParamsOutput, celeritas::MaterialParamsOutput, celeritas::OrangeParamsOutput, celeritas::OutputInterfaceAdapter< T >, celeritas::ParticleParamsOutput, celeritas::PhysicsParamsOutput, celeritas::SimpleCalo, celeritas::StepDiagnostic, celeritas::VecgeomParamsOutput
Public Types