Commits


Patrice Vignola authored and GitHub committed 0ff915eba83
[DML EP] Add frequent upload heap flushing (#15960)

This reduces peak nonlocal memory consumption when uploading large weights for big models (e.g. LLMs), while at the same time trying to keep the GPU as busy as possible. This change could be more sophisticated, but at this stage it is the most minimal and least risky change required to support LLMs.
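
The idea in this commit message can be illustrated with a toy model. The sketch below is not the DML EP code: `UploadHeapSim`, `flush_threshold_bytes`, and `simulate` are hypothetical names, and the byte-count threshold is an assumed trigger chosen only for illustration. It shows why flushing during the upload, rather than once at the end, lowers the peak amount of staged data.

```python
from dataclasses import dataclass


@dataclass
class UploadHeapSim:
    """Toy model of an upload heap that is flushed once enough bytes are pending."""

    flush_threshold_bytes: int    # hypothetical tuning knob, not a real DML EP setting
    pending_bytes: int = 0        # bytes staged but not yet submitted
    peak_pending_bytes: int = 0   # high-water mark of staged bytes
    flush_count: int = 0          # number of non-empty flushes performed

    def upload(self, size_bytes: int) -> None:
        # Stage the copy and record the high-water mark.
        self.pending_bytes += size_bytes
        self.peak_pending_bytes = max(self.peak_pending_bytes, self.pending_bytes)
        # Flush early instead of waiting for the whole upload to finish.
        if self.pending_bytes >= self.flush_threshold_bytes:
            self.flush()

    def flush(self) -> None:
        # Stand-in for submitting the staged copies so the staging memory can be reused.
        if self.pending_bytes:
            self.flush_count += 1
            self.pending_bytes = 0


def simulate(weight_sizes, flush_threshold_bytes):
    """Upload every weight, then return (peak staged bytes, number of flushes)."""
    heap = UploadHeapSim(flush_threshold_bytes)
    for size in weight_sizes:
        heap.upload(size)
    heap.flush()  # final flush for whatever is still pending
    return heap.peak_pending_bytes, heap.flush_count
```

With eight weights of 64 units each, a threshold of 128 keeps at most 128 units staged at the cost of four flushes, while a very large threshold stages all 512 units before a single flush. The trade-off between flush frequency and GPU utilization mentioned in the commit message is not modeled here.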