Commits


Edward Chen authored and GitHub committed 9810b9e02b5
Reduce amount of compiled CUDA device code (#6118)

Move CudaKernel from cuda_common.h to a new separate header, cuda_kernel.h. Update include sites to use cuda_kernel.h instead if they need CudaKernel. Inclusions of cuda_common.h are now more lightweight.

Make corresponding changes for ROCM execution provider code.

Other minor cleanup.