Commits


pengwa authored and GitHub committed feabafe58bd
Fix memory consumption discrepancy (#12266)

* Release cached CUDA memory after the temporary model_copy run (see the first sketch below).
* Op schema change only: remove the PythonOp forward output from PythonOpGrad inputs.
* Always export the model using torch.no_grad.
* 1. Update PythonOp's "input_requires_grads" attribute according to the ORT gradient graph (see the second sketch below). 2. Remove PythonOp's "output_tensor_requires_grads" attribute because, in torch.no_grad mode, the exported value is not correct. 3. [Related to 2] Remove PythonOpGrad's "input_tensor_requires_grads" because it comes from the corresponding PythonOp's "output_tensor_requires_grads".
* Fix unit tests.
* Refine based on wschin's comments and fix pylint.
* Fix comments.
* Fix unused variable.
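The first and third bullets describe freeing the CUDA cache after a throwaway model-copy run and exporting under torch.no_grad. Below is a minimal sketch of that pattern using plain PyTorch APIs (copy.deepcopy, torch.cuda.empty_cache, torch.onnx.export); the helper name export_with_cleanup and its arguments are hypothetical and are not the actual ORTModule export path touched by #12266.

```python
import copy

import torch


def export_with_cleanup(model, sample_input, onnx_path):
    """Hypothetical helper mirroring two fixes described in the commit message."""
    # Probe a temporary deep copy of the model, then release the cached CUDA
    # memory the copy left behind ("release cached cuda memory after temp
    # model_copy run").
    model_copy = copy.deepcopy(model)
    with torch.no_grad():
        model_copy(sample_input)
    del model_copy
    torch.cuda.empty_cache()

    # Export under torch.no_grad so autograd state does not leak into the
    # exported graph ("always export model using torch.no_grad").
    with torch.no_grad():
        torch.onnx.export(model, (sample_input,), onnx_path)
    return onnx_path
```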
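The attribute changes in the fourth bullet amount to rewriting gradient-related attributes on the exported graph's PythonOp nodes. The sketch below shows how such attributes can be edited with the onnx helper API, assuming the model has already been exported to a file; the attribute names come from the commit message, while update_pythonop_requires_grad and the requires_grad_flags mapping are hypothetical illustrations, not the code changed in #12266.

```python
import onnx
from onnx import helper


def update_pythonop_requires_grad(model_path, requires_grad_flags):
    """Rewrite gradient-related attributes on exported PythonOp nodes.

    requires_grad_flags is a hypothetical mapping from PythonOp node name to a
    list of 0/1 flags derived from the ORT gradient graph (assumption).
    """
    model = onnx.load(model_path)
    for node in model.graph.node:
        if node.op_type != "PythonOp" or node.name not in requires_grad_flags:
            continue
        # Drop the stale attributes before writing the refreshed flags; the
        # attribute names are the ones mentioned in the commit message.
        for attr in list(node.attribute):
            if attr.name in ("input_requires_grads", "output_tensor_requires_grads"):
                node.attribute.remove(attr)
        node.attribute.append(
            helper.make_attribute("input_requires_grads", requires_grad_flags[node.name])
        )
    return model
```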