Commits


mguynn-intc authored and GitHub committed d5f6343a4af
Implementation of AVX-VNNI-INT8 dot product instructions into MLAS GEMM (#21984)

### Description

The ONNX Runtime implementation of S8S8 previously used the default C++ implementation. With the new AVX-VNNI-INT8 ISA, all variants of QGemm Int8 can use VNNI dot product and full AVX2 instructions. All signed/unsigned variants support VNNI instructions starting with LNL. Structs and functions were renamed to indicate support for all Int8 variants rather than only U8X8.

### Motivation and Context

LNL hardware implements a new ISA, and this change enables that ISA in QGemm. S8S8 speed improves to match the existing U8S8 code. S8U8 would also match that speed if ONNX formally accepted the data type.
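
For readers unfamiliar with the VNNI dot product, the following is a minimal scalar sketch of what a single 32-bit lane of such an instruction computes: four 8-bit products summed into a 32-bit accumulator. The helper name `DotInt8Lane` is hypothetical and is not part of MLAS or of this commit. The template parameters only illustrate how the signed/unsigned variants (S8S8, U8S8, S8U8, U8U8) differ in operand type. The sketch assumes the accumulator does not overflow.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Scalar model of one 32-bit lane of an 8-bit dot-product-accumulate.
// TA and TB are std::int8_t or std::uint8_t, selecting the variant
// (for example TA = TB = std::int8_t models the S8S8 case).
// Hypothetical illustration only; assumes acc stays within int32 range.
template <typename TA, typename TB>
std::int32_t DotInt8Lane(std::int32_t acc,
                         const std::array<TA, 4>& a,
                         const std::array<TB, 4>& b) {
    for (std::size_t i = 0; i < 4; ++i) {
        // Widen each byte to 32 bits before multiplying so the sign
        // (or lack of sign) of each operand type is preserved.
        acc += static_cast<std::int32_t>(a[i]) * static_cast<std::int32_t>(b[i]);
    }
    return acc;
}
```

A vectorized kernel applies this operation to every 32-bit lane of a register at once, which is why a hardware path for the S8S8 case can bring it in line with the existing U8S8 code.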