Public / onnxruntime / 84589c7e052

Commits

Suffian Khan authored and GitHub committed 84589c7e05219 Sep 2020

Fuse softmax(a + b) in case of simple broadcast (#4937)

* bias softmax kernel

* bias softmax kernel

* remove debug comments

* remove debug comment

* windows build doesnt handle unary minus on unsigned type

* int64 => int treated as error

* only support cuda

* add bias softmax fusion tests

* PR comments

* more PR comments

* use MLTypeCallDispatcher

* break function into pieces

* add loop unroll and add to list for inference as well

* use std::min and move operator==

* revert std::min (doesnt work ci pipeline) and fix int to size_t error

* pr comments

* fixes for windows ci

* fix for windows ci

* pr comments on consistency

* p_model_

* fix formatting and add anonymous namespace

Co-authored-by: suffian khan <sukha@OrtTrainingDev1.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>