See this page for a list of current compilers (including CUDA): https://github.com/trilinos/Trilinos/wiki/Pull-Request-Testing-Interface

Building Trilinos with CUDA support requires the nvcc_wrapper script, which is distributed with Kokkos inside Trilinos. To enable both CUDA and MPI with OpenMPI, set the following environment variable:

export OMPI_CXX=/<Tpath>/Trilinos/Trilinos/packages/kokkos/config/nvcc_wrapper

where <Tpath> is the directory that contains your copy of Trilinos.

This variable tells mpicxx to use nvcc_wrapper as the underlying compiler. Note that nvcc_wrapper uses g++ as the default C++ host compiler.
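As a minimal sketch of the step above, the variable can be set and sanity-checked before configuring. TRILINOS_DIR is a hypothetical shell variable introduced only for this example (it stands for the root of your Trilinos source tree), and the check with mpicxx --showme:command assumes the mpicxx on your PATH comes from OpenMPI:

```shell
# Hypothetical variable: root of the Trilinos source tree. Adjust to your system.
TRILINOS_DIR="${TRILINOS_DIR:-$HOME/Trilinos}"

# Point OpenMPI's C++ wrapper at nvcc_wrapper.
export OMPI_CXX="$TRILINOS_DIR/packages/kokkos/config/nvcc_wrapper"

# Warn early if the wrapper is missing or not executable.
if [ ! -x "$OMPI_CXX" ]; then
  echo "warning: nvcc_wrapper not found at $OMPI_CXX" >&2
fi

# If an mpicxx is available, ask it which underlying compiler it will invoke.
# (--showme:command is an OpenMPI wrapper option; other MPI wrappers may reject it.)
if command -v mpicxx >/dev/null 2>&1; then
  mpicxx --showme:command || true
fi
```

If the last command prints the nvcc_wrapper path, mpicxx has picked up the setting.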

With OMPI_CXX set, the following CMake configure script fragment configures Trilinos:

 -DCMAKE_CXX_COMPILER=/<Mpath>/bin/mpicxx \
 -DCMAKE_C_COMPILER=/<Mpath>/bin/mpicc \
 -DCMAKE_Fortran_COMPILER=/<Mpath>/bin/mpif77 \
 -DCMAKE_CXX_FLAGS="-g -lineinfo -Xcudafe \
--diag_suppress=conversion_function_not_usable -Xcudafe \
--diag_suppress=cc_clobber_ignored -Xcudafe \
--diag_suppress=code_is_unreachable" \
 -DTPL_ENABLE_MPI=ON \
 -DTPL_ENABLE_CUDA=ON \
 -DKokkos_ENABLE_CUDA=ON \

where <Mpath> is the base directory of the OpenMPI installation to use for the build.

The CMAKE_CXX_FLAGS line passes nvcc_wrapper the -Xcudafe --diag_suppress=<name> arguments, each of which suppresses one superfluous warning generated by nvcc.
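If the list of suppressed warnings grows, the flag string can be assembled in the configure script rather than written out by hand. The sketch below is only one possible way to do this; SUPPRESS and CXX_FLAGS are hypothetical variable names, and the diagnostic names are the three from the fragment above:

```shell
# Diagnostics to silence, taken from the configure fragment above.
SUPPRESS="conversion_function_not_usable cc_clobber_ignored code_is_unreachable"

# Start from the non-warning flags used in the fragment.
CXX_FLAGS="-g -lineinfo"

for diag in $SUPPRESS; do
  # -Xcudafe forwards the option that follows it to the CUDA front end.
  CXX_FLAGS="$CXX_FLAGS -Xcudafe --diag_suppress=$diag"
done

# Print the resulting CMake argument so it can be inspected or pasted.
echo "-DCMAKE_CXX_FLAGS=\"$CXX_FLAGS\""
```

The printed argument is equivalent to the CMAKE_CXX_FLAGS setting in the fragment, written on a single line.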