Commits


Changming Sun authored and GitHub committed d19e5c0abbd
Fix a misaligned error in CUDA GEMM (#16130)

### Description

Fix an issue that FusedMatMulOpTest.FloatTypeTransposeBatch fails to run on GPUs with TF32 support.

Authored-by: Tianlei Wu <tlwu@microsoft.com>