Commits


kunal-vaishnavi authored and GitHub committed cb69c598633
Add fusions for SigLIP and Conformer-Encoder (#23528)

### Description

This PR adds fusions for [Google's SigLIP model](https://huggingface.co/google/siglip-base-patch16-224/) and Microsoft's internal conformer-encoder model.

Here is an example of how to run the ORT transformer optimizer for the SigLIP model.

```
$ git clone https://github.com/microsoft/onnxruntime
$ cd onnxruntime/onnxruntime/python/tools/transformers
$ python3 optimizer.py --input /path/to/model.onnx --output /path/to/model_opt.onnx --model_type clip --num_heads 16 --hidden_size 1152 --use_external_data_format --opt_level 0 --disable_shape_inference
```

Here is an example of how to run the ORT transformer optimizer for the conformer-encoder model.

```
$ git clone https://github.com/microsoft/onnxruntime
$ cd onnxruntime/onnxruntime/python/tools/transformers
$ python3 optimizer.py --input /path/to/model.onnx --output /path/to/model_opt.onnx --model_type conformer --num_heads 16 --hidden_size 1024 --use_external_data_format --opt_level 0 --disable_shape_inference --convert_attribute
```

### Motivation and Context

This PR helps optimize multi-modal models that use SigLIP for the vision encoder and conformer-encoder for the speech encoder.
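
The two optimizer invocations in the Description differ only in `--model_type`, `--hidden_size`, and the extra `--convert_attribute` flag. As a rough sketch, the shared and per-model arguments could be assembled from a small table. The helper name `build_optimizer_command` and the preset labels `siglip` and `conformer-encoder` are made up for this illustration and are not part of ORT; the flags and values are copied from the examples above. The sketch only builds the argument list and does not run the optimizer.

```python
import shlex

# Flags that appear in both example invocations in the Description.
COMMON_FLAGS = [
    "--use_external_data_format",
    "--opt_level", "0",
    "--disable_shape_inference",
]

# Per-model values copied from the two examples. The keys are labels chosen
# for this sketch only (hypothetical, not ORT identifiers).
PRESETS = {
    "siglip": {
        "model_type": "clip",
        "num_heads": 16,
        "hidden_size": 1152,
        "extra": [],
    },
    "conformer-encoder": {
        "model_type": "conformer",
        "num_heads": 16,
        "hidden_size": 1024,
        "extra": ["--convert_attribute"],
    },
}


def build_optimizer_command(preset, input_path, output_path):
    """Return the optimizer.py argument list for one of the presets above."""
    cfg = PRESETS[preset]
    return [
        "python3", "optimizer.py",
        "--input", input_path,
        "--output", output_path,
        "--model_type", cfg["model_type"],
        "--num_heads", str(cfg["num_heads"]),
        "--hidden_size", str(cfg["hidden_size"]),
        *COMMON_FLAGS,
        *cfg["extra"],
    ]


if __name__ == "__main__":
    # Print both commands in a shell-safe form for copy and paste.
    for name in PRESETS:
        cmd = build_optimizer_command(
            name, "/path/to/model.onnx", "/path/to/model_opt.onnx"
        )
        print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run(..., cwd=...)` from the `onnxruntime/python/tools/transformers` directory of a checkout, which avoids quoting problems with paths that contain spaces.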
This PR uses changes from the following PRs:

- https://github.com/pytorch/pytorch/pull/144801
- https://github.com/microsoft/onnxscript/pull/2018
- https://github.com/microsoft/onnxscript/pull/2019
- https://github.com/microsoft/onnxscript/pull/2020
- https://github.com/microsoft/onnxscript/pull/2021
- https://github.com/microsoft/onnxscript/pull/2022
- https://github.com/microsoft/onnxscript/pull/2024
- https://github.com/microsoft/onnxscript/pull/2025
- https://github.com/microsoft/onnxscript/pull/2029
- https://github.com/microsoft/onnxscript/pull/2033

### Introduction of ONNX Script

This PR introduces [ONNX Script](https://github.com/microsoft/onnxscript) into the ORT transformer optimizer as an optional step via the `fold_transpose_initializers()` method of the `DynamoOnnxHelper` class.
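
To give a feel for what folding a transpose into an initializer means in general, here is a toy sketch in plain Python. It is not the `DynamoOnnxHelper` implementation and does not use ONNX Script: the graph is a hypothetical list of dictionaries, the function name `fold_transposed_weights` is invented for this example, and only 2-D weights with the default permutation (rows and columns swapped) are handled. The idea it illustrates is that a constant weight read by exactly one `Transpose` node can be transposed ahead of time, so the node can be dropped from the graph.

```python
def transpose_2d(matrix):
    """Swap rows and columns of a 2-D nested list."""
    return [list(col) for col in zip(*matrix)]


def fold_transposed_weights(initializers, nodes):
    """Toy constant folding of Transpose nodes that read an initializer.

    initializers: dict mapping a name to a 2-D nested list (constant weight).
    nodes: list of dicts with keys "op", "inputs" (list of names), "output".
    Returns new (initializers, nodes); the arguments are left unmodified.
    """
    # Count how many nodes read each value name.
    consumers = {}
    for node in nodes:
        for name in node["inputs"]:
            consumers[name] = consumers.get(name, 0) + 1

    new_inits = dict(initializers)
    kept = []
    renames = {}  # Transpose output name -> initializer name that replaces it
    for node in nodes:
        src = node["inputs"][0] if node["inputs"] else None
        foldable = (
            node["op"] == "Transpose"
            and src in initializers
            and consumers[src] == 1  # no other node needs the original layout
        )
        if foldable:
            # Store the transposed data under the original initializer name.
            new_inits[src] = transpose_2d(initializers[src])
            renames[node["output"]] = src
        else:
            kept.append(dict(node, inputs=list(node["inputs"])))

    # Point former readers of the Transpose output at the folded initializer.
    for node in kept:
        node["inputs"] = [renames.get(name, name) for name in node["inputs"]]
    return new_inits, kept


if __name__ == "__main__":
    inits = {"W": [[1, 2, 3], [4, 5, 6]]}
    graph = [
        {"op": "Transpose", "inputs": ["W"], "output": "W_t"},
        {"op": "MatMul", "inputs": ["x", "W_t"], "output": "y"},
    ]
    folded_inits, folded_graph = fold_transposed_weights(inits, graph)
    print(folded_inits)
    print(folded_graph)
```

In this toy graph the `MatMul` ends up reading `W` directly, with `W` stored in its transposed layout, so the result is unchanged while one node is removed. Weights that are read by more than one node are left alone, since other consumers may still expect the original layout.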