Commits


Vincent Wang authored and GitHub committed 25e537770fc
[CUDA] Fix Alignment of SkipLayerNorm Vectorized Kernel (#15054)

Some of our vectorized kernels (including SkipLayerNorm) don't check the alignment of the data pointer. ORT's allocator may guarantee the alignment, but training uses PyTorch's allocator, which cannot guarantee it, so we need to check the data pointer before calling any vectorized kernel. This PR fixes the data pointer alignment issue for SkipLayerNorm's vectorized kernel. We found the issue when running Hugging Face's swinv2 model. The PR also refactors the SkipLayerNorm kernel code to make it simpler.
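
For illustration only, a minimal host-side sketch of the kind of check described above. This is not the code from the PR: `IsAligned` and `CanUseVectorizedKernel` are hypothetical names, and the rule used here (every buffer aligned to the vector width in bytes, hidden size divisible by the vector length) is an assumption about what a vectorized load or store needs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true when `ptr` is aligned to `bytes`.
// `bytes` must be a non-zero power of two.
inline bool IsAligned(const void* ptr, std::size_t bytes) {
  assert(bytes != 0 && (bytes & (bytes - 1)) == 0);
  return (reinterpret_cast<std::uintptr_t>(ptr) & (bytes - 1)) == 0;
}

// Hypothetical dispatch decision: take the vectorized path only when every
// buffer is aligned to the vector width and each row holds a whole number of
// vectors (so every row start stays aligned as well).
template <typename T, int VecSize>
bool CanUseVectorizedKernel(const T* input, const T* skip, T* output,
                            int hidden_size) {
  constexpr std::size_t kVecBytes = sizeof(T) * VecSize;
  return hidden_size % VecSize == 0 &&
         IsAligned(input, kVecBytes) &&
         IsAligned(skip, kVecBytes) &&
         IsAligned(output, kVecBytes);
}
```

A caller would fall back to a scalar kernel whenever `CanUseVectorizedKernel` returns false, which covers buffers handed over by an allocator that makes no alignment promise.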