Commits


Tianlei Wu authored and GitHub committed 7b39f5090ce
Add Attention op for multi-head self attention in BERT (#1984)

* Add Attention op for multi head self attention in BERT
* Add test cases
* Move op from kOnnxDomain to kMSDomain. Limit test to run by CUDA provider only.
* fix test
* Add float16 test
* fix cpu build error
* handle cuda error
* get last cuda error when failed