Commits


Baiju Meswani authored and GitHub committed 1b58331fb34
[QAT] Graph transformer to fuse QDQ pattern into FakeQuant (#13777)

To perform QAT in onnxruntime, the `FakeQuant` op was introduced in #13649. The onnxruntime quantization tool generates a post-training static quantization ONNX model containing `QuantizeLinear`->`DequantizeLinear` nodes. To perform QAT, this pattern needs to be transformed into `FakeQuant`. This pull request introduces a graph transformer that looks for the `Q->DQ` pattern and fuses it into a `FakeQuant` node.
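The fusion described above can be illustrated with a minimal sketch. This is not the onnxruntime transformer itself: the `Node` class and `fuse_qdq_to_fake_quant` function are hypothetical stand-ins operating on a plain Python list, and the sketch assumes that the fused `FakeQuant` node takes the same three inputs (tensor, scale, zero point) as the `QuantizeLinear` node it replaces. It also ignores cases a real transformer would have to handle, such as a quantized tensor that is itself a graph output.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical stand-in for a graph node (not an onnx.NodeProto)."""
    op_type: str
    inputs: list
    outputs: list


def fuse_qdq_to_fake_quant(nodes):
    """Replace each QuantizeLinear -> DequantizeLinear pair with one FakeQuant node."""
    # Map each tensor name to the nodes that consume it.
    consumers = {}
    for node in nodes:
        for name in node.inputs:
            consumers.setdefault(name, []).append(node)

    replacements, dropped = {}, set()
    for q in nodes:
        if q.op_type != "QuantizeLinear":
            continue
        users = consumers.get(q.outputs[0], [])
        # Fuse only when the quantized tensor has exactly one consumer.
        if len(users) != 1:
            continue
        dq = users[0]
        # The consumer must dequantize that tensor with the same scale
        # and zero point that were used to quantize it.
        if (dq.op_type == "DequantizeLinear"
                and dq.inputs[0] == q.outputs[0]
                and dq.inputs[1:] == q.inputs[1:]):
            replacements[id(q)] = Node("FakeQuant", list(q.inputs), list(dq.outputs))
            dropped.add(id(dq))

    # Swap each matched Q node for its FakeQuant node and drop the matched DQ node.
    return [replacements.get(id(n), n) for n in nodes if id(n) not in dropped]


graph = [
    Node("QuantizeLinear", ["x", "scale", "zp"], ["x_q"]),
    Node("DequantizeLinear", ["x_q", "scale", "zp"], ["x_dq"]),
    Node("Relu", ["x_dq"], ["y"]),
]
fused_graph = fuse_qdq_to_fake_quant(graph)
print([n.op_type for n in fused_graph])  # ['FakeQuant', 'Relu']
```

The fused node reuses the output name of the `DequantizeLinear` node, so downstream consumers such as `Relu` need no rewiring.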