Commits


Ye Wang authored and GitHub committed 3418ca28a86
pack qkv in t5 decoder (#15801)

### Description

V100, b_4_s_128, max_output_len=64, beam=4

before:
t5_small: 101.28ms
t5_base: 200.07ms

after:
t5_small: 87.65ms
t5_base: 174.44ms

Co-authored-by: Ubuntu <wy@v100-2.0cdb2e52twzevn1i4fi45bylyg.jx.internal.cloudapp.net>