Commits


PeixuanZuo authored and GitHub committed 28f470c26c2
[ROCm] Use SkipLayerNorm original implementation in kernel explorer (#13382)

### Description

Wrap the original SkipLayerNorm implementation as a function. Use it as part of SkipLayerNormTunableOp, and use it in Kernel explorer to compare the gap between the TunableOp and the original implementation.

The profile output looks like this:

`float16 8 512 768 <class '_kernel_explorer.SkipLayerNorm_half_Original'> 23.48 us 804.04 GB/s`
`float16 8 512 768 <class '_kernel_explorer.SkipLayerNorm_half_Tunable'> 20.41 us 925.00 GB/s`
`...`

Co-authored-by: peixuanzuo <peixuanzuo@linmif39a000004.zvflicr54joexhdgnhvmxrxygg.phxx.internal.cloudapp.net>
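The idea of keeping the original implementation as one candidate of a tunable op can be illustrated with a minimal pure-Python sketch. This is not the onnxruntime or kernel explorer API: the `TunableOp` class here, both `skip_layer_norm_*` functions, and the timing loop are hypothetical stand-ins, and the sketch tunes once rather than per input shape.

```python
import math
import time


def skip_layer_norm_original(x, skip, gamma, beta, eps=1e-5):
    """Reference version: layer-normalize (x + skip) over one row, then scale and shift."""
    h = [a + b for a, b in zip(x, skip)]
    mean = sum(h) / len(h)
    var = sum((v - mean) ** 2 for v in h) / len(h)
    inv_std = 1.0 / math.sqrt(var + eps)
    return [(v - mean) * inv_std * g + b for v, g, b in zip(h, gamma, beta)]


def skip_layer_norm_variant(x, skip, gamma, beta, eps=1e-5):
    """Hypothetical alternative: same math, variance computed as E[h^2] - mean^2."""
    n = len(x)
    h = [a + b for a, b in zip(x, skip)]
    mean = sum(h) / n
    var = sum(v * v for v in h) / n - mean * mean
    inv_std = 1.0 / math.sqrt(var + eps)
    return [(v - mean) * inv_std * g + b for v, g, b in zip(h, gamma, beta)]


class TunableOp:
    """Hypothetical dispatcher: times every candidate once, then reuses the fastest."""

    def __init__(self, candidates):
        self.candidates = list(candidates)
        self.best = None

    @staticmethod
    def _average_seconds(fn, args, repeats=20):
        start = time.perf_counter()
        for _ in range(repeats):
            fn(*args)
        return (time.perf_counter() - start) / repeats

    def __call__(self, *args):
        if self.best is None:
            # The original implementation competes like any other candidate.
            self.best = min(
                self.candidates,
                key=lambda fn: self._average_seconds(fn, args),
            )
        return self.best(*args)


x = [0.5, -1.0, 2.0, 0.25]
skip = [1.0, 1.0, -1.0, 0.0]
gamma = [1.0] * 4
beta = [0.0] * 4

op = TunableOp([skip_layer_norm_original, skip_layer_norm_variant])
tuned = op(x, skip, gamma, beta)
reference = skip_layer_norm_original(x, skip, gamma, beta)
print(op.best.__name__)
```

Because the original implementation is callable on its own, the same function can serve both as a tuning candidate and as the baseline that the tuned result is compared against, which is the comparison the profile output above reports.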