Commits


Abhishek Udupa authored and GitHub committed 7d684d12555
Include algorithm selection exposed by ROCBLAS extensions API in GEMM autotuning (#13831)

### Description
Extend GEMM autotuning by including algorithms exposed by a ROCBLAS extension API.

### Motivation and Context
Based on our request, the ROCm team has implemented extension APIs in ROCBLAS that provide a list of applicable GEMM algorithms/implementations for a given input size, along with an API that performs the GEMM using the specified implementation/algorithm. We have observed that the ROCBLAS algorithm/implementation selection logic does not always pick the optimal one. This PR uses the extension APIs to integrate the exposed ROCBLAS algorithms/implementations into the autotuning framework.

The feature is disabled by default (the ROCBLAS extension APIs are slated to be released with ROCm 5.5 and are not yet generally available). To enable it, build with `--cmake-extra-defines USE_ROCBLAS_EXTENSION_API=1 CMAKE_HIP_FLAGS=-DUSE_ROCBLAS_EXTENSION_API`, then enable tuning in the provider options.

Co-authored-by: Abhishek Udupa <abhishek.udupa@microsoft.com>
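
For readers unfamiliar with autotuning, the general idea of measuring every candidate GEMM implementation and keeping the fastest can be sketched as follows. This is only an illustrative Python sketch, not the code in this PR: `time_call`, `autotune`, `matmul_rows`, and `matmul_transposed` are hypothetical names, and the two pure-Python matrix multiplications merely stand in for the implementations that the ROCBLAS extension API would enumerate.

```python
import time
from typing import Callable, Dict, Tuple


def time_call(fn: Callable, args: Tuple, warmup: int = 1, repeats: int = 5) -> float:
    """Return the best wall-clock time (in seconds) of fn(*args) over several runs."""
    for _ in range(warmup):
        fn(*args)  # warm-up runs are not measured
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    return best


def autotune(candidates: Dict[str, Callable], args: Tuple, measure: Callable = time_call):
    """Measure every candidate on the given inputs and return (best_name, all_costs)."""
    if not candidates:
        raise ValueError("at least one candidate implementation is required")
    costs = {name: measure(fn, args) for name, fn in candidates.items()}
    best_name = min(costs, key=costs.get)  # lowest measured cost wins
    return best_name, costs


# Two hypothetical stand-ins for "different algorithms for the same GEMM".
def matmul_rows(a, b):
    """Row-by-column product, indexing b column-wise."""
    k, m = len(b), len(b[0])
    return [[sum(row[p] * b[p][j] for p in range(k)) for j in range(m)] for row in a]


def matmul_transposed(a, b):
    """Same product, but transposes b first so both operands are read row-wise."""
    bt = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


if __name__ == "__main__":
    a = [[float(i + j) for j in range(32)] for i in range(32)]
    b = [[float(i - j) for j in range(32)] for i in range(32)]
    name, costs = autotune({"rows": matmul_rows, "transposed": matmul_transposed}, (a, b))
    print("selected:", name)
```

In a real tuner the winning choice would typically be cached per input shape so that the measurement cost is paid only once; the `measure` parameter is exposed here so the selection logic can be exercised with a deterministic cost function.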