Commits


Abhishek Udupa authored and GitHub committed 9954454c650
Make the ROCM profiler thread-safe, session-aware and preserve logical ordering between CPU and GPU events (#13549)

### Description

The existing ROCM profiler has a few shortcomings, which this PR fixes.

### Motivation and Context

The existing ROCM profiler:

1. Is not thread-safe
2. Is not session-aware: i.e., if multiple inference sessions enable profiling, then events (esp GPU events) get mixed up between the sessions
3. Has some issues with respect to coding standards.

This PR addresses all of the above by cleanly re-implementing parts of the ROCM profiler as required.

Attached are 4 profile outputs from a multi-session run of the StableDiffusion model, as well as a quick-and-dirty script that checks the profile outputs for the invariants claimed.

[sd_profile_outputs.tar.gz](https://github.com/microsoft/onnxruntime/files/9924608/sd_profile_outputs.tar.gz)
[check_profile_output_wellformedness.zip](https://github.com/microsoft/onnxruntime/files/9924614/check_profile_output_wellformedness.zip)

Co-authored-by: Abhishek Udupa <abhishek.udupa@microsoft.com>