onnxruntime / e5ee0b435db

Tianlei Wu authored and GitHub committed e5ee0b435db on 10 Sep 2021

Attention Fusion for GPT-2 from Megatron (#8987)

(1) Add Attention fusion for the GPT-2 model from Megatron.
(2) Update symbolic shape inference of Attention to support a 4D mask.
(3) Add an option in save_model_to_file to choose whether external data is saved in one file, and add a warning about existing external data.
(4) Fix deprecation: logger.warn => logger.warning
(5) Add a model loader for testing models without external data.
(6) Add an optimize_by_fusion API, and topologically sort the graph after optimization.
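
The topological sort in item (6) matters because fusion can leave a newly created node listed before the node that produces its input. The sketch below illustrates the idea with Kahn's algorithm in plain Python. It is not the code from this commit: the function name topological_sort and the (name, inputs, outputs) tuple layout are assumptions made for illustration, standing in for ONNX graph nodes.

```python
from collections import deque


def topological_sort(nodes):
    """Return nodes reordered so every producer precedes its consumers.

    `nodes` is a list of (name, inputs, outputs) tuples. This layout is a
    hypothetical stand-in for ONNX nodes, not the library's own type.
    """
    # Map each tensor name to the index of the node that produces it.
    producer = {}
    for index, (_, _, outputs) in enumerate(nodes):
        for tensor in outputs:
            producer[tensor] = index

    dependents = [[] for _ in nodes]  # producer index -> consumer indices
    pending = [0] * len(nodes)        # number of unresolved producers per node
    for index, (_, inputs, _) in enumerate(nodes):
        # Inputs with no producer are graph inputs or initializers; skip them.
        # The set ensures a producer feeding two inputs is counted once.
        for parent in {producer[t] for t in inputs if t in producer}:
            dependents[parent].append(index)
            pending[index] += 1

    # Start from nodes whose inputs are all available, then release consumers.
    ready = deque(i for i, count in enumerate(pending) if count == 0)
    order = []
    while ready:
        current = ready.popleft()
        order.append(nodes[current])
        for child in dependents[current]:
            pending[child] -= 1
            if pending[child] == 0:
                ready.append(child)

    if len(order) != len(nodes):
        raise ValueError("graph contains a cycle")
    return order
```

For example, if a fused Attention node ends up listed ahead of the MatMul that feeds it, the sort moves the MatMul first:

```python
nodes = [
    ("Attention", ["qkv"], ["attn"]),
    ("MatMul", ["x", "w"], ["qkv"]),
    ("Add", ["attn", "x"], ["y"]),
]
print([name for name, _, _ in topological_sort(nodes)])
# ['MatMul', 'Attention', 'Add']
```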