Commits


spampana95 authored and GitHub committed 9c0c49900ef
Perform QlinearConv for a batch in a single parallel (#14296)

### Description
This code change allows the QlinearConv operator to combine all batches into a single parallel section, so that the tasks of every batch are available for threads to work on. It is an alternative to the existing method, which parallelizes the tasks of individual images separately and forces threads to wait for an entire image's tasks to complete before continuing.

### Motivation and Context
For int8 convolution models run with multiple batches, this patch delivers an inference improvement of up to 41% and 39% for Mobilenet_edtpu (U8S8) and Resnet50 (U8S8) respectively on systems with higher core counts. The patch delivers the highest benefit on systems with higher thread counts and when using large batch sizes.

| | | Batch 2 | Batch 4 | Batch 8 | Batch 16 | Batch 32 | Batch 64 |
| -- | -- | -- | -- | -- | -- | -- | -- |
| resnet50 | % Gain | 22% | 25% | 32% | 36% | 33% | 32% |
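
The scheduling difference described above can be illustrated with a toy Python sketch. This is not the ONNX Runtime implementation (which is C++); the function names `run_per_image`, `run_single_section`, and `toy_work` are hypothetical, and a standard-library thread pool stands in for the operator's thread pool. It only shows the idea of flattening every (image, task) pair into one parallel section instead of synchronizing after each image.

```python
from concurrent.futures import ThreadPoolExecutor


def run_per_image(pool, batch, tasks_per_image, work):
    """Baseline: one parallel section per image, with a sync point after each."""
    out = []
    for image in range(batch):
        # list(...) blocks until every task of this image has finished,
        # so the next image's tasks are not handed out until then.
        results = list(pool.map(lambda t: work(image, t), range(tasks_per_image)))
        out.extend(results)
    return out


def run_single_section(pool, batch, tasks_per_image, work):
    """Batched: all (image, task) pairs go into a single parallel section."""
    total = batch * tasks_per_image

    def run(flat_index):
        # Recover the image and the per-image task from the flat index.
        image, task = divmod(flat_index, tasks_per_image)
        return work(image, task)

    return list(pool.map(run, range(total)))


def toy_work(image, task):
    # Stand-in for one slice of a convolution; returns a unique id per slice.
    return image * 100 + task


with ThreadPoolExecutor(max_workers=4) as pool:
    per_image = run_per_image(pool, batch=3, tasks_per_image=5, work=toy_work)
    single = run_single_section(pool, batch=3, tasks_per_image=5, work=toy_work)
```

Both functions produce the same results in the same order; the difference is only in how many synchronization points the threads hit, which is where the commit message attributes its gains for larger batches and thread counts.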