[CUDA] Add PackedMultiHeadAttention operator (#16779)

### Description
Add a new operator for MultiHeadAttention whose inputs have had padding removed. It only supports the packed QKV format.
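
To illustrate what "inputs with padding removed" means, here is a minimal pure-Python sketch of packing a padded batch into a flat token list plus cumulative sequence offsets. The function name `remove_padding`, the `cu_seqlens` name, and the data layout are assumptions made for illustration; they are not taken from this change and may not match the operator's actual interface or the CUDA kernel.

```python
def remove_padding(batch, lengths):
    """Pack a padded batch into a flat token list (illustrative only).

    batch:   list of sequences, each padded to the same length
    lengths: number of valid (non-padding) tokens in each sequence
    Returns (packed, cu_seqlens), where cu_seqlens[i] is the offset of
    sequence i in `packed` and cu_seqlens[-1] is the total token count.
    """
    packed = []
    cu_seqlens = [0]
    for seq, n in zip(batch, lengths):
        packed.extend(seq[:n])                 # keep only the valid tokens
        cu_seqlens.append(cu_seqlens[-1] + n)  # running total of valid tokens
    return packed, cu_seqlens


# Two sequences padded to length 4; 0 marks padding in this toy example.
batch = [[11, 12, 13, 0],
         [21, 22, 0, 0]]
packed, cu_seqlens = remove_padding(batch, [3, 2])
print(packed)      # [11, 12, 13, 21, 22]
print(cu_seqlens)  # [0, 3, 5]
```

In a packed layout like this, attention only has to process the 5 valid tokens instead of all 8 padded slots, and the offsets tell it where each sequence starts and ends.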