Commits


pengwa authored and GitHub committed feabafe58bd
Fix memory consumption discrepancy (#12266)

* Release cached CUDA memory after the temporary model_copy run (see the first sketch below).
* Op schema change only: remove the PythonOp forward output from PythonOpGrad inputs.
* Always export the model using torch.no_grad.
* 1. Update PythonOp's "input_requires_grads" attribute according to the ORT gradient graph (see the second sketch below). 2. Remove PythonOp's "output_tensor_requires_grads" attribute because, in torch.no_grad mode, the exported value is not correct. 3. [Related to 2] Remove PythonOpGrad's "input_tensor_requires_grads" because it comes from the corresponding PythonOp's "output_tensor_requires_grads".
* Fix unit tests.
* Refine based on wschin's comments and fix pylint.
* Fix comments.
* Fix unused variable.
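The first and third bullets describe freeing the CUDA cache after a throwaway model-copy run and exporting under torch.no_grad. Below is a minimal sketch of that pattern using plain PyTorch APIs (copy.deepcopy, torch.cuda.empty_cache, torch.onnx.export); the helper name export_with_cleanup and its arguments are hypothetical and are not the actual ORTModule export path touched by #12266.

```python
import copy

import torch


def export_with_cleanup(model, sample_input, onnx_path):
    """Hypothetical helper mirroring two fixes described in the commit message."""
    # Probe a temporary deep copy of the model, then release the cached CUDA
    # memory the copy left behind ("release cached cuda memory after temp
    # model_copy run").
    model_copy = copy.deepcopy(model)
    with torch.no_grad():
        model_copy(sample_input)
    del model_copy
    torch.cuda.empty_cache()

    # Export under torch.no_grad so autograd state does not leak into the
    # exported graph ("always export model using torch.no_grad").
    with torch.no_grad():
        torch.onnx.export(model, (sample_input,), onnx_path)
    return onnx_path
```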
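The attribute changes in the fourth bullet amount to rewriting gradient-related attributes on the exported graph's PythonOp nodes. The sketch below shows how such attributes can be edited with the onnx helper API, assuming the model has already been exported to a file; the attribute names come from the commit message, while update_pythonop_requires_grad and the requires_grad_flags mapping are hypothetical illustrations, not the code changed in #12266.

```python
import onnx
from onnx import helper


def update_pythonop_requires_grad(model_path, requires_grad_flags):
    """Rewrite gradient-related attributes on exported PythonOp nodes.

    requires_grad_flags is a hypothetical mapping from PythonOp node name to a
    list of 0/1 flags derived from the ORT gradient graph (assumption).
    """
    model = onnx.load(model_path)
    for node in model.graph.node:
        if node.op_type != "PythonOp" or node.name not in requires_grad_flags:
            continue
        # Drop the stale attributes before writing the refreshed flags; the
        # attribute names are the ones mentioned in the commit message.
        for attr in list(node.attribute):
            if attr.name in ("input_requires_grads", "output_tensor_requires_grads"):
                node.attribute.remove(attr)
        node.attribute.append(
            helper.make_attribute("input_requires_grads", requires_grad_flags[node.name])
        )
    return model
```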