

Commit fda0aa14c8b, authored by Sunghoon and committed via GitHub
SkipLayerNorm fusion with different input and output types (#15500)

SkipLayerNorm fusion now fuses LayerNormalization together with one or more Add kernels. While the LayerNormalization kernel allows different input and output types by definition, SkipLayerNormalization must use the same input and output type.

The graph in question is valid, since the output of the Add node is float16 and the two inputs coming from initializers are float. But when the Add and LayerNormalization nodes are fused, the fusion fails, because the two inputs of the Add node are float16 and SkipLayerNormalization requires all of its inputs to have the same type. To avoid this failure, this PR adds a Cast node before the inputs of SkipLayerNormalization when the input and output types differ and the output type is float. The graph above is fused as follows.

For performance it would be better for SkipLayerNormalization to support different input and output types, but this PR is only meant to unblock the Turing NLR v5 base model in Babel. When we have more cases, we can add that support.
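To illustrate the workaround, the sketch below inserts a Cast node in front of one input of a node using the onnx Python helpers. This is only a minimal sketch: the helper name, the placement of the new node, and the default target type are assumptions for illustration, and the actual fusion is implemented inside onnxruntime's graph optimizer rather than with these helpers.

```python
import onnx
from onnx import TensorProto, helper


def insert_cast_before_input(graph: onnx.GraphProto,
                             node: onnx.NodeProto,
                             input_index: int,
                             to_type: int = TensorProto.FLOAT16) -> onnx.NodeProto:
    """Hypothetical helper: place a Cast in front of one input of `node`
    so that all of its inputs end up with the same element type."""
    original_input = node.input[input_index]
    cast_output = original_input + "_cast"

    # Cast the original tensor to the requested element type.
    cast_node = helper.make_node(
        "Cast",
        inputs=[original_input],
        outputs=[cast_output],
        name=original_input + "_cast_node",
        to=to_type,
    )

    # Rewire the consumer so it reads the casted tensor instead.
    node.input[input_index] = cast_output

    # Real fusion code would insert the Cast at a topologically valid
    # position; appending is enough for this sketch.
    graph.node.append(cast_node)
    return cast_node
```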