Public / onnxruntime / 9810b9e02b5



Edward Chen authored and GitHub committed 9810b9e02b5 on 15 Dec 2020

Reduce amount of compiled CUDA device code (#6118)

Move CudaKernel from cuda_common.h to a new, separate header, cuda_kernel.h. Include sites that need CudaKernel now include cuda_kernel.h instead, which makes inclusions of cuda_common.h more lightweight.
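The shape of this split can be sketched roughly as follows. This is only an illustration in a single file, with comment banners standing in for the separate headers. Apart from the names `CudaKernel`, `cuda_common.h`, and `cuda_kernel.h`, everything here (the `sketch` namespace, `CeilDiv`, `BlocksFor`, `ReluKernel`, `RunSketchChecks`) is a hypothetical stand-in and is not taken from the ONNX Runtime sources.

```cpp
// Hypothetical single-file illustration of the header split.
// In a real tree each banner below would be its own file.
#include <cassert>
#include <cstddef>
#include <string>

// ---- stand-in for a lightweight cuda_common.h ----
// Small helpers only; the kernel base class is no longer declared here.
namespace sketch {
constexpr std::size_t CeilDiv(std::size_t a, std::size_t b) {
  return (a + b - 1) / b;
}
}  // namespace sketch

// ---- stand-in for cuda_kernel.h ----
// Would include cuda_common.h and add the heavier base class on top.
namespace sketch {
class CudaKernel {
 public:
  virtual ~CudaKernel() = default;
  virtual std::string Name() const = 0;
};
}  // namespace sketch

// ---- include site that needs only helpers: includes cuda_common.h ----
namespace sketch {
inline std::size_t BlocksFor(std::size_t n, std::size_t threads_per_block) {
  return CeilDiv(n, threads_per_block);
}
}  // namespace sketch

// ---- include site that defines a kernel: includes cuda_kernel.h ----
namespace sketch {
class ReluKernel final : public CudaKernel {
 public:
  std::string Name() const override { return "Relu"; }
};
}  // namespace sketch

// Small usage check, callable from any main().
inline void RunSketchChecks() {
  assert(sketch::BlocksFor(1000, 256) == 4);
  assert(sketch::ReluKernel().Name() == "Relu");
}
```

The point of the sketch is that a file like the `BlocksFor` one never sees the base class declaration, so changes to the kernel header do not touch it and it compiles less code.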

Make corresponding changes for ROCM execution provider code.

Other minor cleanup.