Numerics

These functions replace and extend those in the C++ standard library <cmath>, <numeric>, and <atomic> headers.

Math

Direct replacements for <cmath> and <limits>:

template<class T, std::enable_if_t<std::is_floating_point<T>::value, bool> = true>
T celeritas::fma(T a, T b, T y)

Use fused multiply-add for generic calculations.

This provides a floating point specialization so that fma can be used in code that is accelerated for floating point calculations but still works correctly with integer arithmetic.

Because of the single template parameter, it may be easier to use std::fma directly in most cases.

template<class Numeric>
struct numeric_limits

Subset of numeric limits compatible with both host and device.

Deprecated:

The name of this class will change to NumericLimits to conform to the Celeritas naming system, since it is not a complete replacement for the std implementation.

Note

CUDART_NAN and CUDART_INF are not constexpr in CUDA 10 at least, so we have replaced those with compiler built-ins that work in GCC, Clang, and MSVC.

Portable replacements for math functions provided by the CUDA/HIP libraries:

template<class T>
int celeritas::popcount(T x) noexcept

Count the number of set bits in an integer.

double celeritas::rsqrt(double value)

Calculate an inverse square root.

void celeritas::sincos(double a, double *s, double *c)

Simultaneously evaluate the sine and cosine of a value.

void celeritas::sincospi(double a, double *s, double *c)

Simultaneously evaluate the sine and cosine of a value factored by pi.

Additional utility functions:

template<class T, std::enable_if_t<std::is_floating_point<T>::value, bool> = true>
inline T celeritas::fastpow(T a, T b)

Raise a number to a power with simplifying assumptions.

This should be faster than std::pow because we don’t worry about special cases for zeros, infinities, or negative values for a.

Example:

assert(9.0 == fastpow(3.0, 2.0));

template<class T>
T celeritas::ceil_div(T top, T bottom)

Integer division, rounding up, for positive numbers.

template<class T>
T celeritas::clamp_to_nonneg(T v) noexcept

Return the value or (if it’s negative) then zero.

This is constructed to propagate NaN.

template<class T, std::enable_if_t<std::is_floating_point<T>::value, bool> = true>
T celeritas::eumod(T num, T denom)

Calculate the Euclidean modulus of two numbers.

  • num numerator

  • denom denominator

If both numbers are positive, this should be the same as fmod. If the sign of the remainder and denominator don’t match, the remainder will be remapped so that it is between zero and the denominator.

This function is useful for normalizing user-provided angles. Examples:

eumod(3, 2) == 1
eumod(-0.5, 2) == 1.5
eumod(-2, 2) == 0

template<unsigned int N, class T>
T celeritas::ipow(T v) noexcept

Return a nonnegative integer power of the input value.

Example:

assert(9.0 == ipow<2>(3.0));
assert(256 == ipow<8>(2));
static_assert(256 == ipow<8>(2));

template<class T>
T celeritas::negate(T value)

Negation that won’t return signed zeros.

template<class T>
int celeritas::signum(T x)

Calculate the sign of a number.

Returns:

-1 if negative, 0 if exactly zero (or NaN), 1 if positive

Several standalone classes can be used to evaluate, integrate, and solver expressions and functions.

template<class T, size_type N>
class PolyEvaluator

Functor class to evaluate a polynomial.

This is an efficient and foolproof way of storing and evaluating a polynomial expansion:

\[ f(x) = a_0 + x * (a_1 + x * (a_2 + ...)) \]

It replaces opaque expressions such as:

corr = (zeff * (real_type(1.84035e-4) * zeff - real_type(1.86427e-2))
         + real_type(1.41125));
with
corr = PolyEvaluator{1.41125, -1.86427e-2, 1.84035e-4}(zeff);
or, to use an explicit type without having to cast each coefficient:
using PolyQuad = PolyEvaluator<real_type, N>;
corr = PolyQuad{1.41125, -1.86427e-2, 1.84035e-4)(zeff);

template<class F>
class Integrator

Perform numerical integration of a generic 1-D function.

Currently this is a simple nonrecursive Newton-Coates integrator extracted from NuclearZoneBuilder. It should be improved for robustness, accuracy, and efficiency, probably by using a Gauss-Legendre quadrature.

This class is to be used only during setup.

template<class F>
class BisectionRootFinder

Find a local root using the bisection algorithm.

Using a left and right bound a Bisection approximates the root as:

\[ root = 0.5 * (left + right) \]

Then value of func at the root is calculated compared to values of func at the bounds. The root is then used update the bounds based on the sign of func(root) and whether it matches the sign of func(left) or func(right) . Performing this update of the bounds allows for the iteration on root, using the convergence criteria based on func(root) proximity to 0.

template<class F>
class IllinoisRootFinder

Solve for a single local root using the Illinois method.

Perform Regula Falsi (see celeritas::RegulaFalsiSolver for more details) iterations given a root function func and tolerance tol using the Illinois method.

Illinois method modifies the standard approach by comparing the sign of func(root) approximation in the current iteration with the previous approximation. If both iterations are on the same side then the func at the bound on the other side is halved.

class TridiagonalSolver

Solve a tridiagonal system of equations using the Thomas algorithm.

This is a simplified form of Gaussian elimination that can solve a tridiagonal system \( \mathbf{T} \mathbf{x} = \mathbf{b} \) in O(n) time.

The class is meant for use during setup (originally for the calculation of spline coefficients) and cannot be used on device.

Atomics

These atomic functions are for use in kernel code (CUDA/HIP/OpenMP) that use track-level parallelism.

template<class T>
T celeritas::atomic_add(T *address, T value)

Add to a value, returning the original value.

template<class T>
T celeritas::atomic_min(T *address, T value)

Set the value to the minimum of the actual and given, returning old.

template<class T>
T celeritas::atomic_max(T *address, T value)

Set the value to the maximum of the actual and given, returning old.

Array utilities

These operate on fixed-size arrays of data (see Containers), usually Real3 as a Cartesian spatial coordinate.

using celeritas::Real3 = Array<real_type, 3>

Three-dimensional cartesian coordinates.

template<class T, size_type N>
inline void celeritas::axpy(T a, Array<T, N> const &x, Array<T, N> *y)

Increment a vector by another vector multiplied by a scalar.

\[ y \gets \alpha x + y \]

Note that this uses celeritas::fma which supports types other than floating point.

template<class T, size_type N>
inline T celeritas::dot_product(Array<T, N> const &x, Array<T, N> const &y)

Dot product of two vectors.

Note that this uses celeritas::fma which supports types other than floating point.

template<class T>
inline Array<T, 3> celeritas::cross_product(Array<T, 3> const &x, Array<T, 3> const &y)

Cross product of two space vectors.

template<class T, size_type N>
inline T celeritas::norm(Array<T, N> const &v)

Calculate the Euclidean (2) norm of a vector.

template<class T, size_type N>
inline Array<T, N> celeritas::make_orthogonal(Array<T, N> const &x, Array<T, N> const &y)

Return the component of x that is orthogonal to the unit vector y.

In this implementation, y must be normalized, and the result is not normalized.

\[ \mathbf{x}' \gets \mathbf{x} - (\mathbf{x} \cdot \mathbf{y}) \mathbf{y} \, , \quad \|\mathbf{y}\| = 1 \]

template<class T, size_type N>
inline Array<T, N> celeritas::make_unit_vector(Array<T, N> const &v)

Construct a unit vector.

Unit vectors have an Euclidean norm magnitude of 1.

template<class T, size_type N>
inline T celeritas::distance(Array<T, N> const &x, Array<T, N> const &y)

Calculate the Euclidean (2) distance between two points.

template<class T>
inline Array<T, 3> celeritas::from_spherical(T costheta, T phi)

Calculate a Cartesian vector from spherical coordinates.

Theta is the angle between the z axis and the outgoing vector, and phi is the angle between the x axis and the projection of the vector onto the x-y plane.

template<class T>
inline Array<T, 3> celeritas::rotate(Array<T, 3> const &dir, Array<T, 3> const &rot)

Rotate a direction about the given scattering direction.

The equivalent to calling the Shift transport code’s

void cartesian_vector_transform(
    double      costheta,
    double      phi,
    Vector_View vector);
is the call
vector = rotate(from_spherical(costheta, phi), vector);

This code effectively decomposes the given rotation vector rot into two sequential transform matrices, one with an angle theta about the y axis and one about phi rotating around the z axis. These two angles are the spherical coordinate transform of the given rot cartesian direction vector.

There is some extra code in here to deal with loss of precision when the incident direction is along the z axis. As rot approaches z, the azimuthal angle \( \phi \) must be calculated carefully from both the x and y components of the vector, not independently. If rot actually equals z then the azimuthal angle is completely indeterminate so we arbitrarily choose \( \phi = 0 \).

This function is often used for calculating exiting scattering angles. In that case, dir is the exiting angle from the scattering calculation, and rot is the original direction of the particle. The direction vectors are defined as

\[ \vec{\Omega} = \sin\theta\cos\phi\vec{i} + \sin\theta\sin\phi\vec{j} + \cos\theta\vec{k} \,. \]

Soft equivalence

These utilities are used for comparing real-valued numbers to a given tolerance.

template<class RealType = ::celeritas::real_type>
class SoftEqual

Functor for noninfinite floating point equality.

This function-like class considers an absolute tolerance for values near zero, and a relative tolerance for values far from zero. It correctly returns “false” if either value being compared is NaN. The call operator is commutative: eq(a,b) should always give the same as eq(b,a).

The actual comparison is:

\[ |a - b| < \max(\epsilon_r \max(|a|, |b|), \epsilon_a) \]

Note

The edge case where both values are infinite (with the same sign) returns false for equality, which could be considered reasonable because relative error is meaningless. To explicitly allow infinities to compare equal, you must test separately, e.g., a == b || soft_eq(a, b) using celeritas::EqualOr.

template<class RealType = ::celeritas::real_type>
class SoftZero

Functor for comparing values to zero with an absolute precision.

template<class F>
class EqualOr : public F

Compare for equality before checking with the given functor.

This CRTP class allows SoftEqual to work for infinities.

template<class T = ::celeritas::real_type>
class ArraySoftUnit

Test for being approximately a unit vector.

Consider a unit vector v with a small perturbation along a unit vector e :

\[ \vec v + \epsilon \vec e \]
The magnitude of this “nearly unit” is
\[ m^2 = (v + \epsilon e) \cdot (v + \epsilon e) = 1 + 2 (v \cdot e) \epsilon + \epsilon^2 \]

Since by the triangle inequality

\[ |v \cdot e| <= |v||e| = 1 \]
, then the magnitude squared of a perturbed unit vector is bounded
\[ m^2 = 1 \pm 2 \epsilon + \epsilon^2 \]

Instead of calculating the square of the tolerance we use \( \epsilon^2 < \epsilon \) to make the “soft unit vector” condition

\[ | \vec v \vd \vec v - 1 | < 3 \epsilon . \]

template<class T, size_type N>
inline bool celeritas::is_soft_orthogonal(Array<T, N> const &x, Array<T, N> const &y)

Check whether two vectors are approximately orthogonal.

Note that the test for orthogonality should use relative tolerance, not absolute.