Commits

a997bb46b6a — authored by cloudhan, committed via GitHub
Refactor ROCm attention (#14688): extract the QKV projection and the attention computation into pipelines (each composed from GEMMs and a kernel launch). This will allow us to introduce CK flash attention in the next PR.
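The shape of this refactor can be sketched abstractly: attention becomes a pipeline of stages, where the first stage runs three GEMMs to produce Q, K, and V, and the second stage runs the attention computation, so a stage (e.g. the attention kernel) can later be swapped for a different implementation such as CK flash attention. The sketch below is a minimal, CPU-only illustration of that composition in plain Python; all class and function names here are hypothetical and do not come from the actual PR.

```python
import math

def matmul(a, b):
    """Naive matrix multiply standing in for a GEMM call."""
    inner, cols = len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def softmax(row):
    m = max(row)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in row]
    s = sum(exps)
    return [e / s for e in exps]

class QkvProjection:
    """Stage 1: three GEMMs projecting the input to Q, K, V."""
    def __init__(self, wq, wk, wv):
        self.wq, self.wk, self.wv = wq, wk, wv

    def __call__(self, x):
        return matmul(x, self.wq), matmul(x, self.wk), matmul(x, self.wv)

class AttentionCompute:
    """Stage 2: scaled dot-product attention, softmax(Q K^T / sqrt(d)) V.

    In the real refactor this stage wraps a kernel launch, which is what
    makes it replaceable by a flash-attention kernel later."""
    def __call__(self, qkv):
        q, k, v = qkv
        d = len(q[0])
        scores = matmul(q, transpose(k))
        probs = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
        return matmul(probs, v)

class Pipeline:
    """Compose stages so the whole attention op is invoked as one call."""
    def __init__(self, *stages):
        self.stages = stages

    def __call__(self, x):
        for stage in self.stages:
            x = stage(x)
        return x

# Tiny 2x2 example with identity weights, purely to show the wiring.
identity = [[1.0, 0.0], [0.0, 1.0]]
attention = Pipeline(QkvProjection(identity, identity, identity),
                     AttentionCompute())
out = attention(identity)
```

Because each stage is an independent callable, replacing `AttentionCompute` with a fused flash-attention stage only requires constructing the `Pipeline` with a different second element.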