Commits


pengwa authored and GitHub committed 89ef987ab1c
Improve NonZero on CUDA/ROCM (#10307)

* improve NonZero
* fix megatron_fp16 optimizer, fix the doc
* multi_tensor_applier
* resolve comment
* fix building warning
* fix build error when enabling training and use tensorrt