Intrepid2
F_Integrate Class Template Reference
Implementation of a general sum factorization algorithm for integration, abstracted from the algorithm described by Mora and Demkowicz. Uses hierarchical parallelism.
#include <Intrepid2_IntegrationToolsDef.hpp>
Public Member Functions | |
| F_Integrate (Data< Scalar, DeviceType > integralData, TensorData< Scalar, DeviceType > leftComponent, Data< Scalar, DeviceType > composedTransform, TensorData< Scalar, DeviceType > rightComponent, TensorData< Scalar, DeviceType > cellMeasures, int a_offset, int b_offset, int leftFieldOrdinalOffset, int rightFieldOrdinalOffset, bool forceNonSpecialized) | |
| template<size_t maxComponents, size_t numComponents = maxComponents> | |
| KOKKOS_INLINE_FUNCTION int | incrementArgument (Kokkos::Array< int, maxComponents > &arguments, const Kokkos::Array< int, maxComponents > &bounds) const |
| KOKKOS_INLINE_FUNCTION int | incrementArgument (Kokkos::Array< int, Parameters::MaxTensorComponents > &arguments, const Kokkos::Array< int, Parameters::MaxTensorComponents > &bounds, const int &numComponents) const |
| Runtime-sized variant of incrementArgument; used by the approximate flop count. | |
| template<size_t maxComponents, size_t numComponents = maxComponents> | |
| KOKKOS_INLINE_FUNCTION int | nextIncrementResult (const Kokkos::Array< int, maxComponents > &arguments, const Kokkos::Array< int, maxComponents > &bounds) const |
| KOKKOS_INLINE_FUNCTION int | nextIncrementResult (const Kokkos::Array< int, Parameters::MaxTensorComponents > &arguments, const Kokkos::Array< int, Parameters::MaxTensorComponents > &bounds, const int &numComponents) const |
| Runtime-sized variant of nextIncrementResult; used by the approximate flop count. | |
| template<size_t maxComponents, size_t numComponents = maxComponents> | |
| KOKKOS_INLINE_FUNCTION int | relativeEnumerationIndex (const Kokkos::Array< int, maxComponents > &arguments, const Kokkos::Array< int, maxComponents > &bounds, const int startIndex) const |
| KOKKOS_INLINE_FUNCTION void | runSpecialized3 (const TeamMember &teamMember) const |
| runSpecialized implementations are hand-coded variants of run() for a particular number of components. To allow comparison with the generic implementation (for both performance and verification), the member variable forceNonSpecialized_ determines whether runSpecialized is selected when a specialized implementation is available. | |
| template<size_t numTensorComponents> | |
| KOKKOS_INLINE_FUNCTION void | run (const TeamMember &teamMember) const |
| KOKKOS_INLINE_FUNCTION void | operator() (const TeamMember &teamMember) const |
| long | approximateFlopCountPerCell () const |
| Returns an estimate of the number of floating-point operations per cell. Additions, subtractions, multiplications, and divisions each count as one operation. | |
| int | teamSize (const int &maxTeamSizeFromKokkos) const |
| Returns the team size that should be provided to the policy constructor, based on the Kokkos maximum and the amount of thread parallelism available. | |
| size_t | team_shmem_size (int team_size) const |
| Provide the shared memory capacity. | |
Private Types | |
| using | ExecutionSpace = typename DeviceType::execution_space |
| using | TeamPolicy = Kokkos::TeamPolicy< ExecutionSpace > |
| using | TeamMember = typename TeamPolicy::member_type |
| using | IntegralViewType = Kokkos::View< typename RankExpander< Scalar, integralViewRank >::value_type, DeviceType > |
Private Attributes | |
| IntegralViewType | integralView_ |
| TensorData< Scalar, DeviceType > | leftComponent_ |
| Data< Scalar, DeviceType > | composedTransform_ |
| TensorData< Scalar, DeviceType > | rightComponent_ |
| TensorData< Scalar, DeviceType > | cellMeasures_ |
| int | a_offset_ |
| int | b_offset_ |
| int | leftComponentSpan_ |
| int | rightComponentSpan_ |
| int | numTensorComponents_ |
| int | leftFieldOrdinalOffset_ |
| int | rightFieldOrdinalOffset_ |
| bool | forceNonSpecialized_ |
| size_t | fad_size_output_ = 0 |
| Kokkos::Array< int, 7 > | offsetsForComponentOrdinal_ |
| Kokkos::Array< int, Parameters::MaxTensorComponents > | leftFieldBounds_ |
| Kokkos::Array< int, Parameters::MaxTensorComponents > | rightFieldBounds_ |
| Kokkos::Array< int, Parameters::MaxTensorComponents > | pointBounds_ |
| Kokkos::Array< int, Parameters::MaxTensorComponents > | leftFieldRelativeEnumerationSpans_ |
| Kokkos::Array< int, Parameters::MaxTensorComponents > | rightFieldRelativeEnumerationSpans_ |
| int | maxFieldsLeft_ |
| int | maxFieldsRight_ |
| int | maxPointCount_ |
Implementation of a general sum factorization algorithm for integration, abstracted from the algorithm described by Mora and Demkowicz. Uses hierarchical parallelism.
Definition at line 32 of file Intrepid2_IntegrationToolsDef.hpp.
ExecutionSpace
private
Definition at line 34 of file Intrepid2_IntegrationToolsDef.hpp.
IntegralViewType
private
Definition at line 38 of file Intrepid2_IntegrationToolsDef.hpp.
TeamMember
private
Definition at line 36 of file Intrepid2_IntegrationToolsDef.hpp.
TeamPolicy
private
Definition at line 35 of file Intrepid2_IntegrationToolsDef.hpp.
F_Integrate()
inline
Definition at line 70 of file Intrepid2_IntegrationToolsDef.hpp.
approximateFlopCountPerCell()
inline
Returns an estimate of the number of floating-point operations per cell. Additions, subtractions, multiplications, and divisions each count as one operation.
Definition at line 871 of file Intrepid2_IntegrationToolsDef.hpp.
References Intrepid2::Data< DataScalar, DeviceType >::extent_int(), Intrepid2::TensorData< Scalar, DeviceType >::extent_int(), Intrepid2::TensorData< Scalar, DeviceType >::getTensorComponent(), and Intrepid2::TensorData< Scalar, DeviceType >::numTensorComponents().
incrementArgument() [1/2]
inline
Definition at line 155 of file Intrepid2_IntegrationToolsDef.hpp.
incrementArgument() [2/2]
inline
Runtime-sized variant of incrementArgument; used by the approximate flop count.
Definition at line 178 of file Intrepid2_IntegrationToolsDef.hpp.
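As a rough illustration of what a runtime-sized multi-index increment can look like, the sketch below advances a tuple of arguments through its bounds like an odometer. It is written against `std::array` rather than `Kokkos::Array` so that it stands alone, and the function name `incrementMultiIndex` is hypothetical. Two details are assumptions, not taken from this page: that the last component varies fastest, and that the return value is the index of the component that was incremented, with -1 signalling that the whole tuple wrapped back to zero.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for a runtime-sized increment.
// Assumption: the last active component varies fastest.
// Assumption: returns the index of the component that was incremented,
// or -1 if every component wrapped around to zero.
template <std::size_t N>
int incrementMultiIndex(std::array<int, N> &arguments,
                        const std::array<int, N> &bounds,
                        int numComponents)
{
  int r = numComponents - 1;
  // Reset every trailing component that is already at its last valid value.
  while (r >= 0 && arguments[static_cast<std::size_t>(r)] + 1 >= bounds[static_cast<std::size_t>(r)])
  {
    arguments[static_cast<std::size_t>(r)] = 0;
    --r;
  }
  // Advance the first component (from the right) that still has room.
  if (r >= 0) ++arguments[static_cast<std::size_t>(r)];
  return r;
}
```

Under these assumptions, calling the function repeatedly on a zero-initialized tuple until it returns -1 visits every combination within the bounds exactly once.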
nextIncrementResult() [1/2]
inline
Definition at line 196 of file Intrepid2_IntegrationToolsDef.hpp.
nextIncrementResult() [2/2]
inline
Runtime-sized variant of nextIncrementResult; used by the approximate flop count.
Definition at line 217 of file Intrepid2_IntegrationToolsDef.hpp.
operator()()
inline
Definition at line 847 of file Intrepid2_IntegrationToolsDef.hpp.
relativeEnumerationIndex()
inline
Definition at line 233 of file Intrepid2_IntegrationToolsDef.hpp.
run()
inline
Definition at line 570 of file Intrepid2_IntegrationToolsDef.hpp.
runSpecialized3()
inline
runSpecialized implementations are hand-coded variants of run() for a particular number of components. To allow comparison with the generic implementation (for both performance and verification), the member variable forceNonSpecialized_ determines whether runSpecialized is selected when a specialized implementation is available.
Definition at line 266 of file Intrepid2_IntegrationToolsDef.hpp.
References Intrepid2::Data< DataScalar, DeviceType >::extent_int(), Intrepid2::TensorData< Scalar, DeviceType >::getTensorComponent(), Intrepid2::Data< DataScalar, DeviceType >::getUnderlyingView2(), Intrepid2::Data< DataScalar, DeviceType >::getUnderlyingView4(), Intrepid2::Data< DataScalar, DeviceType >::rank(), and Intrepid2::Data< DataScalar, DeviceType >::underlyingMatchesLogical().
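The selection rule described here can be sketched as a small decision function. This is only an illustration in plain C++: the names `Kernel` and `selectKernel` are hypothetical, and the sketch assumes that three tensor components is the only case with a specialized variant, since runSpecialized3 is the only one listed on this page.

```cpp
// Hypothetical labels for the two code paths.
enum class Kernel { Generic, Specialized3 };

// Picks the specialized path only when one exists and it has not been disabled.
// Assumption: a specialized variant exists only for three tensor components.
Kernel selectKernel(int numTensorComponents, bool forceNonSpecialized)
{
  const bool specializedAvailable = (numTensorComponents == 3);
  if (specializedAvailable && !forceNonSpecialized)
  {
    return Kernel::Specialized3;
  }
  return Kernel::Generic;
}
```

Running the same inputs once with the flag set and once without gives a direct way to compare the two paths for timing and for agreement of results.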
team_shmem_size()
inline
Provides the shared memory capacity required for the given team size.
Definition at line 958 of file Intrepid2_IntegrationToolsDef.hpp.
References Intrepid2::Data< DataScalar, DeviceType >::extent_int().
teamSize()
inline
Returns the team size that should be provided to the policy constructor, based on the Kokkos maximum and the amount of thread parallelism available.
Definition at line 950 of file Intrepid2_IntegrationToolsDef.hpp.
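One plausible reading of this description is that the team size is capped both by what Kokkos allows and by how much thread-level work there is to share. The sketch below expresses that as a plain C++ function. The name `chooseTeamSize` and the parameter `availableThreadParallelism` are hypothetical, and how the class actually measures its available parallelism is not stated on this page, so it is taken as an input here.

```cpp
#include <algorithm>

// Hypothetical helper: never request more threads per team than Kokkos permits,
// and never more than there is thread-level work to distribute.
// Assumption: availableThreadParallelism is computed elsewhere by the caller.
int chooseTeamSize(int maxTeamSizeFromKokkos, int availableThreadParallelism)
{
  return std::min(maxTeamSizeFromKokkos, availableThreadParallelism);
}
```

For example, with a Kokkos maximum of 256 and 16 units of thread-level work, this sketch would request teams of 16.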
a_offset_
private
Definition at line 44 of file Intrepid2_IntegrationToolsDef.hpp.
b_offset_
private
Definition at line 45 of file Intrepid2_IntegrationToolsDef.hpp.
cellMeasures_
private
Definition at line 43 of file Intrepid2_IntegrationToolsDef.hpp.
composedTransform_
private
Definition at line 41 of file Intrepid2_IntegrationToolsDef.hpp.
fad_size_output_
private
Definition at line 53 of file Intrepid2_IntegrationToolsDef.hpp.
forceNonSpecialized_
private
Definition at line 51 of file Intrepid2_IntegrationToolsDef.hpp.
integralView_
private
Definition at line 39 of file Intrepid2_IntegrationToolsDef.hpp.
leftComponent_
private
Definition at line 40 of file Intrepid2_IntegrationToolsDef.hpp.
leftComponentSpan_
private
Definition at line 46 of file Intrepid2_IntegrationToolsDef.hpp.
leftFieldBounds_
private
Definition at line 59 of file Intrepid2_IntegrationToolsDef.hpp.
leftFieldOrdinalOffset_
private
Definition at line 49 of file Intrepid2_IntegrationToolsDef.hpp.
leftFieldRelativeEnumerationSpans_
private
Definition at line 63 of file Intrepid2_IntegrationToolsDef.hpp.
maxFieldsLeft_
private
Definition at line 66 of file Intrepid2_IntegrationToolsDef.hpp.
maxFieldsRight_
private
Definition at line 67 of file Intrepid2_IntegrationToolsDef.hpp.
maxPointCount_
private
Definition at line 68 of file Intrepid2_IntegrationToolsDef.hpp.
numTensorComponents_
private
Definition at line 48 of file Intrepid2_IntegrationToolsDef.hpp.
offsetsForComponentOrdinal_
private
Definition at line 55 of file Intrepid2_IntegrationToolsDef.hpp.
pointBounds_
private
Definition at line 61 of file Intrepid2_IntegrationToolsDef.hpp.
rightComponent_
private
Definition at line 42 of file Intrepid2_IntegrationToolsDef.hpp.
rightComponentSpan_
private
Definition at line 47 of file Intrepid2_IntegrationToolsDef.hpp.
rightFieldBounds_
private
Definition at line 60 of file Intrepid2_IntegrationToolsDef.hpp.
rightFieldOrdinalOffset_
private
Definition at line 50 of file Intrepid2_IntegrationToolsDef.hpp.
rightFieldRelativeEnumerationSpans_
private
Definition at line 64 of file Intrepid2_IntegrationToolsDef.hpp.