Topic/migrate2v2#155

Open
bosilca wants to merge 2 commits into ICLDisco:master from bosilca:topic/migrate2v2
bosilca (Contributor) commented May 20, 2026

Summary

This PR extends DPLASMA's CUDA/HIP support and moves the GPU-enabled PTG
kernels to the common handle-key style. CUDA and HIP bodies now retrieve
their BLAS handles from per-stream runtime info keys, and the wrappers
initialize those keys consistently.
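
To make the handle-key pattern concrete, here is a minimal, self-contained sketch of the idea: a wrapper registers a key once, and each stream lazily creates and caches its own handle the first time a body asks for it. Every name below (`info_registry_t`, `stream_infos_t`, `info_register`, `info_get`, `mock_handle_create`) is hypothetical and chosen for illustration; this is not the PaRSEC or DPLASMA API, and the mock handle stands in for a real CUBLAS or HIPBLAS handle.

```c
#include <stdlib.h>

/* Hypothetical stand-ins: none of these names come from PaRSEC or DPLASMA. */
#define MAX_KEYS 8

typedef void *(*info_ctor_t)(void *ctor_arg);

/* Global registry: one constructor per registered key. */
typedef struct {
    info_ctor_t ctor[MAX_KEYS];
    void       *ctor_arg[MAX_KEYS];
    int         nkeys;
} info_registry_t;

/* Per-stream storage: one cached object per key. */
typedef struct {
    void *slot[MAX_KEYS];
} stream_infos_t;

/* Wrapper side: register a key and return its id, or -1 if the table is full. */
static int info_register(info_registry_t *reg, info_ctor_t ctor, void *arg)
{
    if (reg->nkeys >= MAX_KEYS) return -1;
    reg->ctor[reg->nkeys]     = ctor;
    reg->ctor_arg[reg->nkeys] = arg;
    return reg->nkeys++;
}

/* Body side: fetch the object for this stream, creating it on first use. */
static void *info_get(const info_registry_t *reg, stream_infos_t *s, int key)
{
    if (key < 0 || key >= reg->nkeys) return NULL;
    if (s->slot[key] == NULL)
        s->slot[key] = reg->ctor[key](reg->ctor_arg[key]);
    return s->slot[key];
}

/* Mock BLAS handle; a real constructor would call cublasCreate or hipblasCreate. */
typedef struct { int id; } mock_blas_handle_t;

static void *mock_handle_create(void *arg)
{
    int *counter = arg;                 /* counts how many handles were created */
    mock_blas_handle_t *h = malloc(sizeof *h);
    if (h != NULL) h->id = (*counter)++;
    return h;
}
```

In this sketch, two calls to `info_get` on the same stream return the same handle, while a second stream gets its own. That is the property that lets a CUDA body and its HIP twin share one lookup style while differing only in which key they pass.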

The main functional additions are:

  • HIP support for GEMM, TRSM, TRMM, POTRF, POINV, GEQRF, and GETRF-no-pivot GPU paths.
  • A HIP twin for the GEQRF ztsmqr helper.
  • A CUBLAS v2 refresh of the GEQRF CUDA ztsmqr helper.
  • Generated HIP helper-core build integration.
  • Best-effort HIP support in zgetrf_nopiv for the existing GPU trailing-update GEMM.

GPU Coverage

| Family | JDFs | CUDA | HIP | ckey/hkey in JDF | wrapper init | Notes |
|---|---|---|---|---|---|---|
| zgemm_{NN,NT,TN,TT} | 4 | 1 each | 1 each | yes | yes | Full new style |
| zgemm_*_summa | 4 | 1 each | 1 each | yes | yes | Full new style |
| zgemm_NN_gpu | 1 | 1 | 1 | yes | yes | Full new style |
| ztrsm_* | 8 | 2 each | 2 each | yes | yes | TRSM + update GEMM chores covered |
| ztrmm_* | 8 | 1 each | 1 each | yes | yes | Full new style |
| zpotrf_{L,U} | 2 | 4 each | 4 each | yes | yes | Also has workspace keys |
| zpoinv_{L,U} | 2 | 10 each | 10 each | yes | yes | Full CUDA/HIP coverage |
| zgeqrf | 1 | 1 | 1 | yes | yes | CUDA helper handle-based, HIP helper twin added |
| zgetrf_nopiv | 1 | 1 | 1 | yes | yes | Best effort: trailing update GEMM on GPU, panel/TRSM remain CPU fallback |
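
For readers unfamiliar with the zgetrf_nopiv split noted in the last row, the following is a generic textbook sketch of blocked LU without pivoting, not the DPLASMA JDF. It separates the three step types so it is clear which one (the trailing-update GEMM) is the GPU candidate and which two (panel factorization and TRSM) stay on the CPU. The function names and the row-major dense layout are assumptions made for illustration, and the sketch uses real `double` entries rather than complex ones.

```c
#include <stddef.h>

/* Row-major access into an n x n matrix stored in `a`. */
#define A(i, j) a[(size_t)(i) * n + (j)]

/* Panel step (CPU): unpivoted LU of the nb x nb diagonal block at (k, k). */
static void panel_getrf(double *a, int n, int k, int nb)
{
    for (int p = k; p < k + nb; p++)
        for (int i = p + 1; i < k + nb; i++) {
            A(i, p) /= A(p, p);
            for (int j = p + 1; j < k + nb; j++)
                A(i, j) -= A(i, p) * A(p, j);
        }
}

/* TRSM step (CPU): U12 = L11^{-1} A12 and L21 = A21 U11^{-1}. */
static void panel_trsm(double *a, int n, int k, int nb)
{
    int e = k + nb;
    for (int j = e; j < n; j++)          /* forward substitution, unit-lower L11 */
        for (int p = k; p < e; p++)
            for (int i = p + 1; i < e; i++)
                A(i, j) -= A(i, p) * A(p, j);
    for (int i = e; i < n; i++)          /* solve against upper-triangular U11 */
        for (int p = k; p < e; p++) {
            A(i, p) /= A(p, p);
            for (int j = p + 1; j < e; j++)
                A(i, j) -= A(i, p) * A(p, j);
        }
}

/* Trailing update (GPU candidate): A22 -= L21 * U12, a plain GEMM. */
static void trailing_gemm(double *a, int n, int k, int nb)
{
    int e = k + nb;
    for (int i = e; i < n; i++)
        for (int p = k; p < e; p++)
            for (int j = e; j < n; j++)
                A(i, j) -= A(i, p) * A(p, j);
}

/* Driver: factor in place, leaving L (unit diagonal implied) and U packed in a. */
static void lu_nopiv(double *a, int n, int nb)
{
    for (int k = 0; k < n; k += nb) {
        int b = (n - k < nb) ? n - k : nb;
        panel_getrf(a, n, k, b);
        panel_trsm(a, n, k, b);
        trailing_gemm(a, n, k, b);
    }
}
```

The trailing update does the bulk of the floating-point work for large matrices, which is why offloading only that step can still pay off even when the panel and TRSM tasks remain on the CPU.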

Validation

Tested in the debug build:

cmake --build ../build/debug --target dplasma -j 8
ctest --test-dir ../build/debug -R "dplasma_dgeqrf" --output-on-failure
ctest --test-dir ../build/debug -R "dplasma_dgetrf_nopiv" --output-on-failure

bosilca added 2 commits May 20, 2026 12:53
Signed-off-by: George Bosilca <[email protected]>
Move GPU-enabled PTG kernels toward the common per-stream handle-key
style. Add CUDA/HIP handle globals to the JDFs and initialize them in
the corresponding wrappers.

Add HIP bodies for the GEMM, TRSM, TRMM, POTRF, POINV, GEQRF, and
GETRF-no-pivot GPU paths that currently have CUDA coverage. Modernize
the GEQRF TSMQR CUDA helper to use CUBLAS v2 handles and add the
matching HIPBLAS helper twin. Wire the generated HIP helper core into
the cores build.

For zgetrf_nopiv, add best-effort HIP support for the existing GPU
trailing-update GEMM while leaving panel factorization and TRSM tasks on
the CPU fallback path.

Signed-off-by: George Bosilca <[email protected]>
bosilca requested a review from a team as a code owner May 20, 2026 16:54

bosilca (Contributor, Author) commented May 20, 2026

At this point, this PR is expected to fail to build; it needs #156 to be merged first.