Commits
Authored by Ye Wang, committed via GitHub — 4e670f7ab1b
Support larger hidden size in Attention Cuda kernel (#7002)

* Support larger hidden size in Attention Cuda kernel
* Update attention_transpose.cu
* Address review comments
* Fix typo and add check in quantization
* Update readme