Commits


Commit 068b568472b, authored by KeDengMS
Add support for int8 x uint8 for MatMulInteger, and int16 x int16 custom op (#1391)

Description: The change adds the necessary quantization support on CPU for mixed int8/uint8 inputs, as well as int16 inputs, to matrix multiply operations that output int32.

Motivation and Context: Integer operations are critical for the performance of quantized models. The current CPU implementation of MatMulInteger only supports uint8 x uint8, while the spec also allows int8 x uint8; having a default CPU implementation that fully supports the spec helps accuracy verification. In addition, some models may need to quantize to int16, which the MatMulInteger op does not support, so a custom op, MatMulInteger16, is added to satisfy such models.
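As a rough illustration of the mixed-type path this commit enables, the sketch below builds a minimal ONNX graph containing a single MatMulInteger node whose A input is int8 and whose B input is uint8, then runs it on the CPU execution provider. The tensor names, shapes, and opset version here are illustrative assumptions and are not taken from the commit itself.

```python
# Minimal sketch: MatMulInteger with mixed int8 x uint8 inputs on CPU.
# Assumes the onnx and onnxruntime Python packages are installed.
import numpy as np
import onnx
from onnx import helper, TensorProto
import onnxruntime as ort

# Graph: Y = MatMulInteger(A, B), with A int8, B uint8, Y int32.
node = helper.make_node("MatMulInteger", inputs=["A", "B"], outputs=["Y"])
graph = helper.make_graph(
    [node],
    "matmul_integer_mixed",
    [
        helper.make_tensor_value_info("A", TensorProto.INT8, [2, 3]),
        helper.make_tensor_value_info("B", TensorProto.UINT8, [3, 4]),
    ],
    [helper.make_tensor_value_info("Y", TensorProto.INT32, [2, 4])],
)
# MatMulInteger was introduced in opset 10.
model = helper.make_model(graph, opset_imports=[helper.make_opsetid("", 10)])
onnx.checker.check_model(model)

sess = ort.InferenceSession(model.SerializeToString(),
                            providers=["CPUExecutionProvider"])
a = np.random.randint(-128, 128, size=(2, 3), dtype=np.int8)
b = np.random.randint(0, 256, size=(3, 4), dtype=np.uint8)
(y,) = sess.run(None, {"A": a, "B": b})
assert y.dtype == np.int32  # accumulation into int32, per the MatMulInteger spec
```

The int16 path would be exercised in the same way, but through a node for the MatMulInteger16 custom op declared under onnxruntime's contrib-op domain rather than the standard ONNX domain; the exact domain string and registration details should be checked against the onnxruntime contrib operator documentation.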