Public / onnxruntime / 6efa9d9e106

Zhang Lei authored and GitHub committed 6efa9d9e106 on 24 Sep 2022

Add more qordered int8 operators for CUDA provider (#12949)

Attention, Quantize/Dequantize etc.
Update QOrderedMatmul's schema and the unit tests.
Verified test data for QOrdered Attention.

Co-authored-by: Zhang Lei <phill.zhang@gmail.com>
Co-authored-by: Lei Zhang <zhalei@microsoft.com>