Type Casting
The saturate_cast family of function templates provides safe, saturating type casting on both GPU and CPU.
Any combination of augpy dtypes can be cast.
Always instantiate the template with both types to ensure that the correct implementation is used.
CUDA type cast intrinsics are used where possible; otherwise, the min and max math functions are used.
Some type combinations use if-then-else branches on CPU.
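As a rough illustration of the two CPU strategies mentioned above, the following host-only sketch clamps a 32-bit integer into the range of an 8-bit unsigned integer, once with min and max and once with branches. The helper names clamp_cast_i32_to_u8, branch_cast_i32_to_u8 and example_clamp are hypothetical and not part of augpy; the library's actual implementation may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper for illustration only; NOT part of augpy.
// Saturates by clamping with max (lower bound) and min (upper bound).
inline void clamp_cast_i32_to_u8(std::int32_t vin, std::uint8_t *vout) {
    const std::int32_t lo = 0;    // smallest uint8 value
    const std::int32_t hi = 255;  // largest uint8 value
    *vout = static_cast<std::uint8_t>(std::min(std::max(vin, lo), hi));
}

// Hypothetical branching variant that produces the same result.
inline void branch_cast_i32_to_u8(std::int32_t vin, std::uint8_t *vout) {
    if (vin < 0) {
        *vout = 0;
    } else if (vin > 255) {
        *vout = 255;
    } else {
        *vout = static_cast<std::uint8_t>(vin);
    }
}

inline void example_clamp() {
    std::uint8_t out = 0;
    clamp_cast_i32_to_u8(-5, &out);   assert(out == 0);    // below range
    clamp_cast_i32_to_u8(300, &out);  assert(out == 255);  // above range
    clamp_cast_i32_to_u8(42, &out);   assert(out == 42);   // in range
}
```

Both variants return the same value for every input; which one is faster depends on the type pair and the target hardware.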
- template<typename input_t, typename output_t> __device__ __host__ __forceinline__ void saturate_cast(input_t vin, output_t *vout)
Cast an input_t value to output_t and write the result to the vout pointer. The cast saturates: values outside the range of output_t are clamped to its minimum or maximum, so no underflow or overflow occurs.
The cast is monotonic: for any cast value pairs \((v_{in}, v_{out})\) and \((v_{in}', v_{out}')\), if \(v_{in} \le v_{in}'\), then \(v_{out} \le v_{out}'\). Similarly, if \(v_{in} \ge v_{in}'\), then \(v_{out} \ge v_{out}'\).
When casting from integral to float types, \(v_{out}\) may differ from \(v_{in}\) if the float type does not have enough precision to represent the value exactly.
When casting from float to integral types, the input is rounded to the nearest integer, with ties rounded to even.
- Template Parameters
input_t – input dtype
output_t – output dtype
- Parameters
vin – input value to cast
vout – pointer to output value
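The documented behaviour for float-to-integral casts (round to nearest even, saturation, monotonicity) can be sketched on the host as follows. saturate_cast_sketch and example_usage are hypothetical stand-ins that only mirror the signature above; they are not augpy's implementation, they cover only floating-point inputs and integral outputs of at most 32 bits, and NaN handling is not addressed because the source does not specify it.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>
#include <type_traits>

// Hypothetical host-only stand-in for illustration; NOT augpy's implementation.
template <typename input_t, typename output_t>
inline void saturate_cast_sketch(input_t vin, output_t *vout) {
    static_assert(std::is_floating_point<input_t>::value,
                  "sketch covers floating-point inputs only");
    static_assert(std::is_integral<output_t>::value && sizeof(output_t) <= 4,
                  "sketch covers integral outputs of at most 32 bits");
    const double lo = static_cast<double>(std::numeric_limits<output_t>::lowest());
    const double hi = static_cast<double>(std::numeric_limits<output_t>::max());
    // std::nearbyint honours the current rounding mode, which defaults to
    // round-to-nearest with ties to even.
    double v = std::nearbyint(static_cast<double>(vin));
    // Clamp into the representable range before the final conversion.
    v = std::fmin(std::fmax(v, lo), hi);
    *vout = static_cast<output_t>(v);
}

inline void example_usage() {
    std::int8_t out = 0;
    // Both template types are given explicitly, as the text recommends.
    saturate_cast_sketch<float, std::int8_t>(2.5f, &out);     assert(out == 2);     // tie to even
    saturate_cast_sketch<float, std::int8_t>(3.5f, &out);     assert(out == 4);     // tie to even
    saturate_cast_sketch<float, std::int8_t>(300.7f, &out);   assert(out == 127);   // saturates high
    saturate_cast_sketch<float, std::int8_t>(-1000.0f, &out); assert(out == -128);  // saturates low

    // Monotonicity: increasing inputs never produce decreasing outputs.
    std::int8_t prev = std::numeric_limits<std::int8_t>::lowest();
    for (float x = -200.0f; x <= 200.0f; x += 0.25f) {
        saturate_cast_sketch<float, std::int8_t>(x, &out);
        assert(prev <= out);
        prev = out;
    }

    // Integral-to-float precision loss: 2^24 + 1 has no exact 32-bit float
    // representation (assuming IEEE 754 floats), so the value changes.
    assert(static_cast<float>(16777217) == 16777216.0f);
}
```

With the real template, the call shape would be the same: pass the input by value and a pointer to the output, with both dtypes spelled out in the template arguments.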