Commits


Ye Wang authored and GitHub committed cc3faba6161
Support seq_len > 64K in rotary embedding cuda kernel (#20204)