Commits


Chen Fu authored and GitHub committed 1c84621020f
Adding ARM64 depthwise convolution kernel for symmetric quantization (#9655)

Motivation and Context

Two improvements over the current kernel code:

1. Signed int8 based instructions, so there is no need to extend from 8 bits to 16 bits before multiplication.
2. Unrolled loop with manual software pipelining.

Co-authored-by: Chen Fu <fuchen@microsoft.com>
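
The two improvements named in the commit message can be illustrated with a small portable C sketch. This is not the ARM64 kernel from the commit. The function name `depthwise_s8_pipelined`, the argument layout (one input row pointer per filter tap, filter stored tap-major) and the group width of 4 are all assumptions made for illustration. The sketch multiplies signed 8-bit inputs by signed 8-bit weights directly, handles four channels per step, and loads the operands for the next tap before accumulating the current one, which is the basic shape of manual software pipelining.

```c
#include <stddef.h>
#include <stdint.h>

#define GROUP 4 /* channels handled per step; an illustrative choice */

/* Hypothetical helper, not an actual library API.
 * input[k]  : pointer to `channels` signed 8-bit inputs for filter tap k
 * filter    : kernel_size x channels signed 8-bit weights, tap-major
 * acc[c]    : receives sum over k of input[k][c] * filter[k*channels + c]
 */
void depthwise_s8_pipelined(const int8_t *const *input, const int8_t *filter,
                            size_t channels, size_t kernel_size, int32_t *acc)
{
    size_t c = 0;

    for (; kernel_size > 0 && c + GROUP <= channels; c += GROUP) {
        int32_t sum[GROUP] = {0, 0, 0, 0};
        int8_t cur_in[GROUP], cur_w[GROUP];
        int8_t next_in[GROUP], next_w[GROUP];

        /* Prologue: load the operands for tap 0. */
        for (size_t j = 0; j < GROUP; j++) {
            cur_in[j] = input[0][c + j];
            cur_w[j] = filter[c + j];
        }

        for (size_t k = 1; k < kernel_size; k++) {
            /* Issue the loads for tap k first... */
            for (size_t j = 0; j < GROUP; j++) {
                next_in[j] = input[k][c + j];
                next_w[j] = filter[k * channels + c + j];
            }
            /* ...then multiply-accumulate tap k-1. Both operands are
             * signed, so the product needs no zero-point correction. */
            for (size_t j = 0; j < GROUP; j++) {
                sum[j] += (int32_t)cur_in[j] * (int32_t)cur_w[j];
                cur_in[j] = next_in[j];
                cur_w[j] = next_w[j];
            }
        }

        /* Epilogue: accumulate the last tap that was loaded. */
        for (size_t j = 0; j < GROUP; j++) {
            acc[c + j] = sum[j] + (int32_t)cur_in[j] * (int32_t)cur_w[j];
        }
    }

    /* Tail: leftover channels, or all channels when kernel_size is 0. */
    for (; c < channels; c++) {
        int32_t sum = 0;
        for (size_t k = 0; k < kernel_size; k++) {
            sum += (int32_t)input[k][c] * (int32_t)filter[k * channels + c];
        }
        acc[c] = sum;
    }
}
```

In a hand-written kernel the inner four-element loops would typically map to vector registers, and the load and multiply stages would be interleaved at the instruction level. The sketch only shows the ordering of the stages, not the performance characteristics.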