Standard library replacements
These functions replace or extend those in the C++ standard library but work in
GPU code without the special --expt-relaxed-constexpr flag. None of these
algorithms are thread-cooperative, so all can be used in standard Celeritas
kernels.
Utilities
A few utilities replace and supplement the standard <utility> header.
-
template<class T>
T &&celeritas::forward(typename std::remove_reference<T>::type &v) noexcept
Implement perfect forwarding with device-friendly functions.
-
template<class T>
auto celeritas::move(T &&v) noexcept -> typename std::remove_reference<T>::type&&
Cast a value as an rvalue reference to allow move construction.
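As a rough illustration of what such replacements can look like, the sketch below reimplements both casts in a hypothetical sketch namespace. The SKETCH_FUNCTION macro is an assumption made for this example (it is not a name taken from this page): it adds host and device qualifiers only when a CUDA compiler is detected, so the same code also builds as plain C++. The category and relay helpers are likewise hypothetical and exist only to show which overload is selected.

```cpp
#include <type_traits>

// Hypothetical decorator for this sketch: mark functions as callable from
// both host and device code when compiling with a CUDA compiler.
#if defined(__CUDACC__)
#    define SKETCH_FUNCTION __host__ __device__
#else
#    define SKETCH_FUNCTION
#endif

namespace sketch
{
// Cast an lvalue back to the value category encoded in T
template<class T>
SKETCH_FUNCTION constexpr T&&
forward(typename std::remove_reference<T>::type& v) noexcept
{
    return static_cast<T&&>(v);
}

// Unconditionally cast to an rvalue reference
template<class T>
SKETCH_FUNCTION constexpr auto move(T&& v) noexcept ->
    typename std::remove_reference<T>::type&&
{
    return static_cast<typename std::remove_reference<T>::type&&>(v);
}

// Overloads that report which value category they received
inline int category(int&) { return 0; }   // lvalue
inline int category(int&&) { return 1; }  // rvalue

// Relay an argument without changing its value category
template<class T>
int relay(T&& arg)
{
    return category(sketch::forward<T>(arg));
}
}  // namespace sketch
```

With these definitions, relaying a named int selects the lvalue overload, while relaying a literal or the result of sketch::move selects the rvalue overload.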
Algorithms
These device-compatible functions replace or extend those in the C++ standard
library <algorithm> header. The implementations of sort and other
partitioning elements are derived from LLVM’s libc++.
-
template<class InputIt, class Predicate>
inline bool celeritas::all_of(InputIt iter, InputIt last, Predicate p)
Whether the predicate is true for all items.
-
template<class InputIt, class Predicate>
inline bool celeritas::any_of(InputIt iter, InputIt last, Predicate p)
Whether the predicate is true for any item.
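The two predicates above can be sketched as plain early-exit loops. The code below is a minimal stand-in written in a hypothetical sketch namespace, assuming the same signatures as listed here; the IsPositive functor is made up for the example.

```cpp
namespace sketch
{
// True if p holds for every element (and for an empty range)
template<class InputIt, class Predicate>
inline bool all_of(InputIt iter, InputIt last, Predicate p)
{
    for (; iter != last; ++iter)
    {
        if (!p(*iter))
            return false;
    }
    return true;
}

// True if p holds for at least one element (false for an empty range)
template<class InputIt, class Predicate>
inline bool any_of(InputIt iter, InputIt last, Predicate p)
{
    for (; iter != last; ++iter)
    {
        if (p(*iter))
            return true;
    }
    return false;
}

// Hypothetical predicate used only for this example
struct IsPositive
{
    bool operator()(double x) const { return x > 0; }
};
}  // namespace sketch
```

Both loops touch only the caller's range and use no shared state, which is consistent with running them independently inside each thread.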
-
template<class T>
inline T const &celeritas::clamp(T const &v, T const &lo, T const &hi)
Clamp the value between lo and hi values.
If the value is between lo and hi, return the value. Otherwise, return lo if the value is below it, or hi if the value is above it.
This replaces:
min(hi, max(lo, v))
or
max(lo, min(hi, v))
assuming that the relationship lo <= hi holds.
This is constructed to propagate NaN.
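The NaN behavior is easiest to see side by side. The sketch below is one way to write a comparison-based clamp that returns v itself whenever both comparisons are false, which is the case when v is NaN. It is an illustrative stand-in in a hypothetical sketch namespace, not a copy of the library source. The clamp_via_fminmax helper is also hypothetical and shows the composed form for contrast.

```cpp
#include <cmath>

namespace sketch
{
// Comparison-based clamp: if v is NaN, both comparisons are false and v is
// returned unchanged, so the NaN reaches the caller.
template<class T>
inline T const& clamp(T const& v, T const& lo, T const& hi)
{
    return v < lo ? lo : hi < v ? hi : v;
}

// Composed form for contrast: std::fmax and std::fmin return the non-NaN
// argument when exactly one argument is NaN, so a NaN input is replaced by lo.
inline double clamp_via_fminmax(double v, double lo, double hi)
{
    return std::fmin(hi, std::fmax(lo, v));
}
}  // namespace sketch
```

For ordinary inputs with lo <= hi the two forms agree. They differ only for a NaN input, where the comparison-based version keeps the NaN visible instead of silently producing a value inside the interval.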
-
template<class ForwardIt, class T, class Compare>
ForwardIt celeritas::lower_bound(ForwardIt first, ForwardIt last, T const &value, Compare comp)
Find the insertion point for a value in a sorted list using a binary search.
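A minimal sketch of such a binary search follows, written in a hypothetical sketch namespace with the same signature as above. For brevity it assumes random-access iterators (it uses iterator arithmetic directly), which is a simplification made for this example.

```cpp
namespace sketch
{
// Return the first position whose element is NOT ordered before value,
// i.e. the earliest point where value could be inserted while keeping
// the range sorted. Assumes random-access iterators for simplicity.
template<class ForwardIt, class T, class Compare>
ForwardIt
lower_bound(ForwardIt first, ForwardIt last, T const& value, Compare comp)
{
    auto count = last - first;
    while (count > 0)
    {
        auto half = count / 2;
        ForwardIt mid = first + half;
        if (comp(*mid, value))
        {
            // Everything up to and including mid is before value
            first = mid + 1;
            count -= half + 1;
        }
        else
        {
            // The answer is at mid or earlier
            count = half;
        }
    }
    return first;
}
}  // namespace sketch
```

On the sorted range {1, 3, 3, 7}, searching for 3 with a less-than comparator yields the position of the first 3, and searching for a value larger than every element yields the end of the range.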
-
template<class T, std::enable_if_t<!std::is_floating_point<T>::value, bool> = true>
T const &celeritas::max(T const &a, T const &b) noexcept
Return the higher of two values.
This function is specialized so that floating point types use std::fmax for better performance on GPU and ARM.
-
template<class T, std::enable_if_t<!std::is_floating_point<T>::value, bool> = true>
T const &celeritas::min(T const &a, T const &b) noexcept
Return the lower of two values.
This function is specialized so that floating point types use std::fmin for better performance on GPU and ARM.
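The split between floating point and other types can be sketched with a pair of overloads selected by std::enable_if_t. The code below lives in a hypothetical sketch namespace. Returning the floating point result by value is an assumption of this example, since the floating point signatures are not shown on this page.

```cpp
#include <cmath>
#include <type_traits>

namespace sketch
{
// Generic overload: comparison-based, returns a reference to an argument
template<class T,
         std::enable_if_t<!std::is_floating_point<T>::value, bool> = true>
T const& max(T const& a, T const& b) noexcept
{
    return (a < b) ? b : a;
}

// Floating point overload: defer to std::fmax (by-value return is assumed)
template<class T,
         std::enable_if_t<std::is_floating_point<T>::value, bool> = true>
T max(T a, T b) noexcept
{
    return std::fmax(a, b);
}

// Generic overload: comparison-based, returns a reference to an argument
template<class T,
         std::enable_if_t<!std::is_floating_point<T>::value, bool> = true>
T const& min(T const& a, T const& b) noexcept
{
    return (b < a) ? b : a;
}

// Floating point overload: defer to std::fmin (by-value return is assumed)
template<class T,
         std::enable_if_t<std::is_floating_point<T>::value, bool> = true>
T min(T a, T b) noexcept
{
    return std::fmin(a, b);
}
}  // namespace sketch
```

One consequence worth keeping in mind is that std::fmax and std::fmin return the other argument when exactly one input is NaN, so in this sketch the floating point overloads do not propagate a single NaN.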
-
template<class ForwardIt, class Compare>
inline ForwardIt celeritas::min_element(ForwardIt iter, ForwardIt last, Compare comp)
Return an iterator to the lowest value in the range as defined by Compare.
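Because the ordering comes from the comparator, "lowest" need not mean numerically smallest. The sketch below, in a hypothetical sketch namespace and assuming the signature above, is a single linear pass that keeps the best candidate seen so far. Returning last for an empty range is an assumption that mirrors the standard library.

```cpp
namespace sketch
{
// Linear scan that remembers the element ordered before all others so far.
// Returns last if the range is empty (assumed, as in the standard library).
template<class ForwardIt, class Compare>
inline ForwardIt min_element(ForwardIt iter, ForwardIt last, Compare comp)
{
    if (iter == last)
        return last;

    ForwardIt result = iter++;
    for (; iter != last; ++iter)
    {
        if (comp(*iter, *result))
            result = iter;
    }
    return result;
}
}  // namespace sketch
```

For the values {3, -4, 2}, a plain less-than comparator selects -4, whereas a comparator on squared magnitude selects 2.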
-
template<class ForwardIt, class Predicate>
ForwardIt celeritas::partition(ForwardIt first, ForwardIt last, Predicate pred)
Partition elements in the given range, “true” before “false”.
This is done by swapping elements until the range is partitioned.
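The swapping strategy can be sketched as follows. This is a simple forward-iterator variant written in a hypothetical sketch namespace under the signature above. It is offered as an illustration of the idea and should not be read as the libc++-derived implementation.

```cpp
namespace sketch
{
// Move every element satisfying pred in front of those that do not.
// Returns an iterator to the first "false" element.
template<class ForwardIt, class Predicate>
ForwardIt partition(ForwardIt first, ForwardIt last, Predicate pred)
{
    // Skip the leading run that is already in place
    while (first != last && pred(*first))
        ++first;
    if (first == last)
        return first;

    // 'first' now marks the earliest "false" element; swap later "true"
    // elements into that slot and advance it
    ForwardIt it = first;
    for (++it; it != last; ++it)
    {
        if (pred(*it))
        {
            auto tmp = *it;
            *it = *first;
            *first = tmp;
            ++first;
        }
    }
    return first;
}
}  // namespace sketch
```

The relative order within each group is not preserved: partitioning {1, 2, 3, 4, 5, 6} by evenness with this sketch leaves the three even values first, followed by the odd values in a shuffled order.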
-
template<class RandomAccessIt, class Compare>
void celeritas::sort(RandomAccessIt first, RandomAccessIt last, Compare comp)
Sort an array on a single thread.
This implementation is neither thread-safe nor cooperative, but it can be called from CUDA code.
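To make the calling convention concrete, here is a single-threaded stand-in with the same signature in a hypothetical sketch namespace. It uses insertion sort, a deliberately simpler algorithm chosen for brevity. It is not the libc++-derived implementation this page describes, and it is only reasonable for short ranges.

```cpp
namespace sketch
{
// Insertion sort stand-in: sorts [first, last) in place on the calling
// thread, using only comparisons and assignments on the caller's data.
template<class RandomAccessIt, class Compare>
void sort(RandomAccessIt first, RandomAccessIt last, Compare comp)
{
    if (first == last)
        return;

    for (RandomAccessIt i = first + 1; i != last; ++i)
    {
        auto key = *i;
        RandomAccessIt j = i;
        // Shift larger elements right until key's slot is found
        while (j != first && comp(key, *(j - 1)))
        {
            *j = *(j - 1);
            --j;
        }
        *j = key;
    }
}
}  // namespace sketch
```

Since nothing here synchronizes between threads, two threads must not sort overlapping ranges at the same time; each call should own the range it is given.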
-
template<class ForwardIt, class T, class Compare>
ForwardIt celeritas::upper_bound(ForwardIt first, ForwardIt last, T const &value, Compare comp)
Find the first element which is greater than value.
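The search differs from a lower-bound search only in which side of the comparison decides the step. A minimal sketch in a hypothetical sketch namespace follows, again assuming random-access iterators for brevity.

```cpp
namespace sketch
{
// Return the first position whose element is ordered strictly after value.
// Assumes random-access iterators for simplicity.
template<class ForwardIt, class T, class Compare>
ForwardIt
upper_bound(ForwardIt first, ForwardIt last, T const& value, Compare comp)
{
    auto count = last - first;
    while (count > 0)
    {
        auto half = count / 2;
        ForwardIt mid = first + half;
        if (!comp(value, *mid))
        {
            // *mid is not greater than value: the answer is after mid
            first = mid + 1;
            count -= half + 1;
        }
        else
        {
            // *mid is greater than value: the answer is at mid or earlier
            count = half;
        }
    }
    return first;
}
}  // namespace sketch
```

On the sorted range {1, 3, 3, 7}, searching for 3 skips past both equal elements and lands on 7, and searching for 7 returns the end of the range.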
A few convenience algorithms are built on top of these replacements:
-
template<class InputIt, class Predicate>
inline bool celeritas::all_adjacent(InputIt iter, InputIt last, Predicate p)
Whether the predicate is true for pairs of consecutive items.
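A typical use for this kind of check is validating that a grid is monotonic. The sketch below is a stand-in in a hypothetical sketch namespace with the signature above. Treating empty and single-element ranges as true is an assumption of this example, since there are no pairs to test.

```cpp
namespace sketch
{
// Apply the binary predicate to each (previous, current) pair in order.
// Ranges with fewer than two elements have no pairs and return true
// (assumed behavior for this sketch).
template<class InputIt, class Predicate>
inline bool all_adjacent(InputIt iter, InputIt last, Predicate p)
{
    if (iter == last)
        return true;

    InputIt prev = iter++;
    while (iter != last)
    {
        if (!p(*prev, *iter))
            return false;
        prev = iter++;
    }
    return true;
}
}  // namespace sketch
```

With a less-than predicate this tests for a strictly increasing sequence, so {1, 1, 2} fails; with less-than-or-equal the same sequence passes.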