Numerics¶
These functions replace and extend those in the C++ standard library
<cmath>
, <numeric>
, and <atomic>
headers.
Math¶
Direct replacements for <cmath>
and <limits>
:
-
template<class T, std::enable_if_t<std::is_floating_point<T>::value, bool> = true>
T celeritas::fma(T a, T b, T y)¶ Use fused multiply-add for generic calculations.
This provides a floating point specialization so that
fma
can be used in code that is accelerated for floating point calculations but still works correctly with integer arithmetic.Because of the single template parameter, it may be easier to use
std::fma
directly in most cases.
-
template<class Numeric>
struct numeric_limits¶ Subset of numeric limits compatible with both host and device.
- Deprecated:
The name of this class will change to NumericLimits to conform to the Celeritas naming system, since it is not a complete replacement for the
std
implementation.
Note
CUDART_NAN
andCUDART_INF
are notconstexpr
in CUDA 10 at least, so we have replaced those with compiler built-ins that work in GCC, Clang, and MSVC.
Portable replacements for math functions provided by the CUDA/HIP libraries:
-
double celeritas::rsqrt(double value)¶
Calculate an inverse square root.
-
void celeritas::sincos(double a, double *s, double *c)¶
Simultaneously evaluate the sine and cosine of a value.
-
void celeritas::sincospi(double a, double *s, double *c)¶
Simultaneously evaluate the sine and cosine of a value factored by pi.
Additional utility functions:
-
template<class T, std::enable_if_t<std::is_floating_point<T>::value, bool> = true>
inline T celeritas::fastpow(T a, T b)¶ Raise a number to a power with simplifying assumptions.
This should be faster than
std::pow
because we don’t worry about special cases for zeros, infinities, or negative values fora
.Example:
assert(9.0 == fastpow(3.0, 2.0));
-
template<class T>
T celeritas::ceil_div(T top, T bottom)¶ Integer division, rounding up, for positive numbers.
-
template<class T>
T celeritas::clamp_to_nonneg(T v) noexcept¶ Return the value or (if it’s negative) then zero.
This is constructed to propagate
NaN
.
-
template<class T, std::enable_if_t<std::is_floating_point<T>::value, bool> = true>
T celeritas::eumod(T num, T denom)¶ Calculate the Euclidean modulus of two numbers.
num numerator
denom denominator
This function is useful for normalizing user-provided angles. Examples:
eumod(3, 2) == 1 eumod(-0.5, 2) == 1.5 eumod(-2, 2) == 0
-
template<unsigned int N, class T>
T celeritas::ipow(T v) noexcept¶ Return a nonnegative integer power of the input value.
Example:
assert(9.0 == ipow<2>(3.0)); assert(256 == ipow<8>(2)); static_assert(256 == ipow<8>(2));
-
template<class T>
int celeritas::signum(T x)¶ Calculate the sign of a number.
- Returns:
-1 if negative, 0 if exactly zero (or NaN), 1 if positive
Several standalone classes can be used to evaluate, integrate, and solver expressions and functions.
-
template<class T, size_type N>
class PolyEvaluator¶ Functor class to evaluate a polynomial.
This is an efficient and foolproof way of storing and evaluating a polynomial expansion:
\[ f(x) = a_0 + x * (a_1 + x * (a_2 + ...)) \]It replaces opaque expressions such as:
withcorr = (zeff * (real_type(1.84035e-4) * zeff - real_type(1.86427e-2)) + real_type(1.41125));
or, to use an explicit type without having to cast each coefficient:corr = PolyEvaluator{1.41125, -1.86427e-2, 1.84035e-4}(zeff);
using PolyQuad = PolyEvaluator<real_type, N>; corr = PolyQuad{1.41125, -1.86427e-2, 1.84035e-4)(zeff);
-
template<class F>
class Integrator¶ Perform numerical integration of a generic 1-D function.
Currently this is a simple nonrecursive Newton-Coates integrator extracted from NuclearZoneBuilder. It should be improved for robustness, accuracy, and efficiency, probably by using a Gauss-Legendre quadrature.
This class is to be used only during setup.
-
template<class F>
class BisectionRootFinder¶ Find a local root using the bisection algorithm.
Using a left and right bound a Bisection approximates the root as:
\[ root = 0.5 * (left + right) \]Then value of func at the root is calculated compared to values of func at the bounds. The root is then used update the bounds based on the sign of func(root) and whether it matches the sign of func(left) or func(right) . Performing this update of the bounds allows for the iteration on root, using the convergence criteria based on func(root) proximity to 0.
-
template<class F>
class IllinoisRootFinder¶ Solve for a single local root using the Illinois method.
Perform Regula Falsi (see celeritas::RegulaFalsiSolver for more details) iterations given a root function func and tolerance tol using the Illinois method.
Illinois method modifies the standard approach by comparing the sign of func(root) approximation in the current iteration with the previous approximation. If both iterations are on the same side then the func at the bound on the other side is halved.
-
class TridiagonalSolver¶
Solve a tridiagonal system of equations using the Thomas algorithm.
This is a simplified form of Gaussian elimination that can solve a tridiagonal system \( \mathbf{T} \mathbf{x} = \mathbf{b} \) in O(n) time.
The class is meant for use during setup (originally for the calculation of spline coefficients) and cannot be used on device.
Atomics¶
These atomic functions are for use in kernel code (CUDA/HIP/OpenMP) that use track-level parallelism.
-
template<class T>
T celeritas::atomic_add(T *address, T value)¶ Add to a value, returning the original value.
Array utilities¶
These operate on fixed-size arrays of data (see Containers), usually
Real3
as a Cartesian spatial coordinate.
-
template<class T, size_type N>
inline void celeritas::axpy(T a, Array<T, N> const &x, Array<T, N> *y)¶ Increment a vector by another vector multiplied by a scalar.
\[ y \gets \alpha x + y \]Note that this uses
celeritas::fma
which supports types other than floating point.
-
template<class T, size_type N>
inline T celeritas::dot_product(Array<T, N> const &x, Array<T, N> const &y)¶ Dot product of two vectors.
Note that this uses
celeritas::fma
which supports types other than floating point.
-
template<class T>
inline Array<T, 3> celeritas::cross_product(Array<T, 3> const &x, Array<T, 3> const &y)¶ Cross product of two space vectors.
-
template<class T, size_type N>
inline T celeritas::norm(Array<T, N> const &v)¶ Calculate the Euclidean (2) norm of a vector.
-
template<class T, size_type N>
inline Array<T, N> celeritas::make_orthogonal(Array<T, N> const &x, Array<T, N> const &y)¶ Return the component of x that is orthogonal to the unit vector y.
In this implementation, y must be normalized, and the result is not normalized.
\[ \mathbf{x}' \gets \mathbf{x} - (\mathbf{x} \cdot \mathbf{y}) \mathbf{y} \, , \quad \|\mathbf{y}\| = 1 \]
-
template<class T, size_type N>
inline Array<T, N> celeritas::make_unit_vector(Array<T, N> const &v)¶ Construct a unit vector.
Unit vectors have an Euclidean norm magnitude of 1.
-
template<class T, size_type N>
inline T celeritas::distance(Array<T, N> const &x, Array<T, N> const &y)¶ Calculate the Euclidean (2) distance between two points.
-
template<class T>
inline Array<T, 3> celeritas::from_spherical(T costheta, T phi)¶ Calculate a Cartesian vector from spherical coordinates.
Theta is the angle between the z axis and the outgoing vector, and
phi
is the angle between the x axis and the projection of the vector onto the x-y plane.
-
template<class T>
inline Array<T, 3> celeritas::rotate(Array<T, 3> const &dir, Array<T, 3> const &rot)¶ Rotate a direction about the given scattering direction.
The equivalent to calling the Shift transport code’s
is the callvoid cartesian_vector_transform( double costheta, double phi, Vector_View vector);
vector = rotate(from_spherical(costheta, phi), vector);
This code effectively decomposes the given rotation vector
rot
into two sequential transform matrices, one with an angle theta about the y axis and one about phi rotating around the z axis. These two angles are the spherical coordinate transform of the givenrot
cartesian direction vector.There is some extra code in here to deal with loss of precision when the incident direction is along the z axis. As
rot
approaches z, the azimuthal angle \( \phi \) must be calculated carefully from both the x and y components of the vector, not independently. Ifrot
actually equals z then the azimuthal angle is completely indeterminate so we arbitrarily choose \( \phi = 0 \).This function is often used for calculating exiting scattering angles. In that case,
dir
is the exiting angle from the scattering calculation, androt
is the original direction of the particle. The direction vectors are defined as\[ \vec{\Omega} = \sin\theta\cos\phi\vec{i} + \sin\theta\sin\phi\vec{j} + \cos\theta\vec{k} \,. \]
Soft equivalence¶
These utilities are used for comparing real-valued numbers to a given tolerance.
-
template<class RealType = ::celeritas::real_type>
class SoftEqual¶ Functor for noninfinite floating point equality.
This function-like class considers an absolute tolerance for values near zero, and a relative tolerance for values far from zero. It correctly returns “false” if either value being compared is NaN. The call operator is commutative:
eq(a,b)
should always give the same aseq(b,a)
.The actual comparison is:
\[ |a - b| < \max(\epsilon_r \max(|a|, |b|), \epsilon_a) \]Note
The edge case where both values are infinite (with the same sign) returns false for equality, which could be considered reasonable because relative error is meaningless. To explicitly allow infinities to compare equal, you must test separately, e.g.,
a == b || soft_eq(a, b)
usingceleritas::EqualOr
.
-
template<class RealType = ::celeritas::real_type>
class SoftZero¶ Functor for comparing values to zero with an absolute precision.
-
template<class F>
class EqualOr : public F¶ Compare for equality before checking with the given functor.
This CRTP class allows
SoftEqual
to work for infinities.
-
template<class T = ::celeritas::real_type>
class ArraySoftUnit¶ Test for being approximately a unit vector.
Consider a unit vector v with a small perturbation along a unit vector e :
\[ \vec v + \epsilon \vec e \]\[ m^2 = (v + \epsilon e) \cdot (v + \epsilon e) = 1 + 2 (v \cdot e) \epsilon + \epsilon^2 \]Since by the triangle inequality
\[ |v \cdot e| <= |v||e| = 1 \]\[ m^2 = 1 \pm 2 \epsilon + \epsilon^2 \]Instead of calculating the square of the tolerance we use \( \epsilon^2 < \epsilon \) to make the “soft unit vector” condition
\[ | \vec v \vd \vec v - 1 | < 3 \epsilon . \]