Commits


Patrice Vignola authored and GitHub committed 49512e558a9
[DML EP] Add I/O binding and `If` operator (#16859)

Being able to leverage I/O binding for DML and registering `If` for the DML EP allows us to avoid copying the past/present key/values back and forth between the CPU and the GPU after every token. This gives us a 25% performance increase for Dolly V2 with 128 tokens on an RTX 4090.
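
To illustrate what keeping the past/present key/values on the GPU might look like from the caller's side, here is a minimal sketch of one decoding step using ONNX Runtime's Python I/O binding API. It is not taken from the PR. The helper `decode_step`, the `present_to_past` name mapping, and the `"logits"` output name are hypothetical, and it assumes `"dml"` is accepted as a device type by `bind_output` in a build that includes this change.

```python
from typing import Any, Dict, Mapping, Tuple


def decode_step(
    session: Any,                       # an onnxruntime.InferenceSession (assumed)
    cpu_inputs: Mapping[str, Any],      # small per-token inputs, e.g. token ids
    past_on_device: Mapping[str, Any],  # OrtValues left on the device by the previous step
    present_to_past: Mapping[str, str], # hypothetical map: output name -> next input name
    logits_name: str = "logits",        # hypothetical output name
    device_type: str = "dml",           # assumption: "dml" is a valid device type string
    device_id: int = 0,
) -> Tuple[Any, Dict[str, Any]]:
    """Run one token step while keeping the key/value cache on the device."""
    binding = session.io_binding()

    # Per-token inputs are small, so uploading them from CPU memory is cheap.
    for name, array in cpu_inputs.items():
        binding.bind_cpu_input(name, array)

    # Cached key/values are already device-resident OrtValues: bind them as-is.
    for name, ort_value in past_on_device.items():
        binding.bind_ortvalue_input(name, ort_value)

    # Logits are bound to the CPU for sampling; every other output stays on the device.
    output_names = [out.name for out in session.get_outputs()]
    for name in output_names:
        if name == logits_name:
            binding.bind_output(name, "cpu")
        else:
            binding.bind_output(name, device_type, device_id)

    session.run_with_iobinding(binding)

    outputs = dict(zip(output_names, binding.get_outputs()))
    logits = outputs.pop(logits_name).numpy()

    # Rename "present" outputs so they can be bound as "past" inputs next step.
    next_past = {
        present_to_past[name]: value
        for name, value in outputs.items()
        if name in present_to_past
    }
    return logits, next_past
```

In a real setup the session would typically be created with `onnxruntime.InferenceSession(model_path, providers=["DmlExecutionProvider"])`, and the returned `next_past` dictionary would be passed back in as `past_on_device` for the following token, so only the token ids and logits cross the CPU/GPU boundary.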