Index Translation
-
template<typename scalar_t>
bool augpy::translate_idx_strided(scalar_t *&t, const ndim_array strides, const ndim_array contiguous_strides, const int ndim, const size_t count, const unsigned int values_per_thread, size_t &shape0)
Translate from contiguous index to a single strided tensor.
- Parameters
t: pointer to tensor data
strides: strides in number of elements of the strided tensor
contiguous_strides: strides of the contiguous tensor in bytes
ndim: number of dimensions
count: number of elements in the tensor without the first dimension, i.e., \(\frac{numel(t)}{s_0}\)
values_per_thread: number of values calculated by each thread
shape0: size of the first dimension; looped over at most values_per_thread times
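The idea of translating a contiguous index into a strided position can be illustrated with a minimal host-side sketch. This is not augpy's implementation: the helper name `contiguous_to_strided_offset` is hypothetical, both stride arrays are assumed to be given in elements for simplicity, and the per-thread handling of `count`, `values_per_thread`, and `shape0` is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of augpy.
// Maps a flat index into a contiguous tensor to the element offset of the
// same logical coordinate in a strided tensor of identical shape.
// Assumption: both stride arrays are in elements, outermost dimension first.
std::size_t contiguous_to_strided_offset(
    std::size_t index,
    const std::vector<std::size_t>& contiguous_strides,
    const std::vector<std::size_t>& strides)
{
    assert(contiguous_strides.size() == strides.size());
    std::size_t offset = 0;
    for (std::size_t d = 0; d < strides.size(); ++d) {
        // Coordinate along dimension d, recovered from the flat index.
        const std::size_t coord = index / contiguous_strides[d];
        index -= coord * contiguous_strides[d];
        // Same coordinate, but stepped with the strided tensor's stride.
        offset += coord * strides[d];
    }
    return offset;
}
```

For a tensor of shape (2, 3), the contiguous strides are {3, 1}. If the strided tensor is a transposed view with strides {1, 2}, flat index 4 corresponds to coordinate (1, 1) and therefore to element offset 1·1 + 1·2 = 3.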
-
template<typename scalar1_t, typename scalar2_t>
bool augpy::translate_idx_contiguous_strided(scalar1_t *&t1, const ndim_array t1_strides, scalar2_t *&t2, const ndim_array t2_strides, const int ndim, const size_t count, const unsigned int values_per_thread, size_t &shape0)
Translate from one contiguous to one strided tensor.
t1 must be contiguous, t2 may be strided.
- Parameters
t1: pointer to contiguous tensor data
t1_strides: strides in number of elements of t1
t2: pointer to strided tensor data
t2_strides: strides in number of elements of t2
ndim: number of dimensions
count: number of elements in the tensor without the first dimension, i.e., \(\frac{numel(t)}{s_0}\)
values_per_thread: number of values calculated by each thread
shape0: size of the first dimension; looped over at most values_per_thread times
-
template<typename scalar1_t, typename scalar2_t>
bool augpy::translate_idx_strided_strided(scalar1_t *&t1, const ndim_array t1_strides, scalar2_t *&t2, const ndim_array t2_strides, const ndim_array contiguous_strides, const int ndim, const size_t count, const unsigned int values_per_thread, size_t &shape0)
Translate contiguous index to two strided tensors.
t1 and t2 may be strided.
- Parameters
t1: pointer to first tensor data
t1_strides: strides in number of elements of t1
t2: pointer to second tensor data
t2_strides: strides in number of elements of t2
contiguous_strides: strides of the contiguous tensor in bytes
ndim: number of dimensions
count: number of elements in the tensor without the first dimension, i.e., \(\frac{numel(t)}{s_0}\)
values_per_thread: number of values calculated by each thread
shape0: size of the first dimension; looped over at most values_per_thread times
-
template<typename scalar1_t, typename scalar2_t, typename scalar3_t>
bool augpy::translate_idx_strided_strided_strided(scalar1_t *&t1, const ndim_array t1_strides, scalar2_t *&t2, const ndim_array t2_strides, scalar3_t *&t3, const ndim_array t3_strides, const ndim_array contiguous_strides, const int ndim, const size_t count, const unsigned int values_per_thread, size_t &shape0)
Translate contiguous index to three strided tensors.
t1, t2, and t3 may be strided.
- Parameters
t1: pointer to first tensor data
t1_strides: strides in number of elements of t1
t2: pointer to second tensor data
t2_strides: strides in number of elements of t2
t3: pointer to third tensor data
t3_strides: strides in number of elements of t3
contiguous_strides: strides of the contiguous tensor in bytes
ndim: number of dimensions
count: number of elements in the tensor without the first dimension, i.e., \(\frac{numel(t)}{s_0}\)
values_per_thread: number of values calculated by each thread
shape0: size of the first dimension; looped over at most values_per_thread times
-
THREAD_LOOP_1(FUN, COUNTER, P1, STRIDE1)
Loop one strided tensor over first dimension.
- Parameters
FUN: code to execute
COUNTER: counter variable, initially set to number of iterations
P1: pointer variable
STRIDE1: stride in number of elements in the first dimension
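A loop macro of this kind could look roughly like the sketch below. The macro `EXAMPLE_THREAD_LOOP_1` and the function `sum_first_dimension` are hypothetical names introduced for illustration only; the actual definition of `THREAD_LOOP_1` in augpy may differ. The sketch assumes only what the parameter list states: the counter starts at the number of iterations, and the pointer advances by the first-dimension stride after each execution of the body.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a one-tensor loop macro; NOT augpy's definition.
// Runs FUN while COUNTER is positive, then decrements COUNTER and advances
// the pointer P1 by STRIDE1 elements.
#define EXAMPLE_THREAD_LOOP_1(FUN, COUNTER, P1, STRIDE1) \
    for (; (COUNTER) > 0; --(COUNTER), (P1) += (STRIDE1)) { FUN }

// Sums n values of `data`, stepping `stride` elements between reads.
// Assumption: n * stride does not exceed data.size().
int sum_first_dimension(const std::vector<int>& data,
                        std::size_t n, std::ptrdiff_t stride)
{
    assert(n * static_cast<std::size_t>(stride) <= data.size());
    int total = 0;
    const int* p = data.data();
    std::size_t counter = n;  // initially the number of iterations
    EXAMPLE_THREAD_LOOP_1(total += *p;, counter, p, stride)
    return total;
}
```

With the data {1, 2, 3, 4, 5, 6}, three iterations, and a stride of 2, the loop reads the elements 1, 3, and 5.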
-
THREAD_LOOP_2(FUN, COUNTER, P1, STRIDE1, P2, STRIDE2)
Loop two strided tensors over first dimension.
- Parameters
FUN: code to execute
COUNTER: counter variable, initially set to number of iterations
P1: first pointer variable
STRIDE1: first tensor stride in number of elements in the first dimension
P2: second pointer variable
STRIDE2: second tensor stride in number of elements in the first dimension
-
THREAD_LOOP_3(FUN, COUNTER, P1, STRIDE1, P2, STRIDE2, P3, STRIDE3)
Loop three strided tensors over first dimension.
- Parameters
FUN: code to execute
COUNTER: counter variable, initially set to number of iterations
P1: first pointer variable
STRIDE1: first tensor stride in number of elements in the first dimension
P2: second pointer variable
STRIDE2: second tensor stride in number of elements in the first dimension
P3: third pointer variable
STRIDE3: third tensor stride in number of elements in the first dimension